Welcome to my personal notes!

music gen from relative attention is really inconsistent

half the time it sounds amazing, half the time it plays 400 notes in 3s


last day at internship

i need more compute

didn't realize that lambda labs is constantly out of GPUs


i need a way to figure out if the model is bad because the architecture is bad or just because the model is too small

August 2, 2023

maybe I should do electrical or computer engineering

i need more hard science knowledge

maybe ill just take a bunch of physics/chem classes for fun

i have basically no understanding of electricity/energy

https://ai.meta.com/blog/audiocraft-musicgen-audiogen-encodec-generative-ai-audio/

finished virtual assistant project yesterday

want to add visual aspect though (~wav2lip)


want to do something with tweet data

something like this:

https://github.com/effectiveaccelerationism/text-to-banger

visualizations are always fun to make

August 1, 2023

still think midi is the way to go for music

actual audio files don't have enough structure


was going to train model on cloud GPUs, but lambda had zero left

they are supposed to be the cheapest by far

still ~$2/hr though


todos:

  • set up whisper/mic
  • read Beginning of Infinity
  • look into img2talking models
  • https://arxiv.org/pdf/2205.06421.pdf

    July 31, 2023

    https://github.com/facebookresearch/audiocraft

    https://karpathy.medium.com/software-2-0-a64152b37c35

    July 30, 2023

    fine tuning did not work

    base model probably massively overfit

    todos:

  • add more data to training set
  • measure test loss
  • implement relative attention
  • relative attention is implemented

    going to try using cloud gpu tomorrow


    also found a good tts model

    July 29, 2023

    finally understand relative self attention

    gonna try to implement that correctly this time
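rough sketch of relative self attention with zero abstraction, as far as i understand it: on top of the usual query·key score, each query also gets dotted with a learned embedding for the distance j - i (clipped to a max distance). everything here is an assumption for illustration: the names (`relative_logits`, `rel_emb`, `max_dist`) are made up, it is one head in plain python lists, there is no 1/sqrt(d) scaling or softmax, and it is the naive O(T^2) loop rather than the memory-efficient version.

```python
def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def relative_logits(q, k, rel_emb, max_dist):
    """
    q, k: lists of T vectors for a single head.
    rel_emb: 2 * max_dist + 1 vectors; rel_emb[d + max_dist] is the
             embedding for relative distance d = j - i.
    Returns a T x T matrix of q_i . k_j + q_i . rel_emb[clip(j - i)].
    """
    T = len(q)
    out = [[0.0] * T for _ in range(T)]
    for i in range(T):
        for j in range(T):
            d = max(-max_dist, min(max_dist, j - i))  # clip the distance
            out[i][j] = dot(q[i], k[j]) + dot(q[i], rel_emb[d + max_dist])
    return out

# toy check: keys are all zero, so only the relative term shows up
q = [[1.0, 0.0]] * 3
k = [[0.0, 0.0]] * 3
rel_emb = [[-1.0, 0.0], [0.0, 0.0], [1.0, 0.0]]  # distances -1, 0, +1
for row in relative_logits(q, k, rel_emb, max_dist=1):
    print(row)
```

with the content term zeroed out, each entry only depends on how far apart the two positions are, which is the whole point.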

    so far though, generation is sounding pretty great

    fine tuning rn, will see if that helps at all


    really wish there was linear algebra in high school


    sometimes it feels like all of my opinions are just conglomerations of things i've heard

    i know that is how learning works, but sometimes it still feels fraudulent

    July 28, 2023

    model loss is still not plateauing

    a little scared that it is going to overfit


    paused training to listen to generation, output was not great 😔

    not really sure why though, since loss is the lowest it's ever been, and i don't remember changing the loss function before i started training


    i wonder how many pg essays i can read in one workday

  • Mind the Gap

  • gonna start grinding leetcode

    want to do ~2 problems a day


  • Undergraduation
  • The Lesson to Unlearn

  • fun side project could be tik tok generator

  • text content with gpt4
  • voice with eleven labs
  • video could just be b-roll/subway surfers/minecraft parkour
  • need to add subtitles and effects
  • upload to tik tok via api

  • use moviepy for video editing
  • need to find way to know when to display correct subtitles
  • could replace pinkydoll
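one cheap way to handle the subtitle timing, as a sketch only: assume the voice speaks at a steady pace, split the script into short captions, and give each caption a slice of the audio proportional to its word count. the function name and `words_per_caption` are made up here, and real speech is not evenly paced, so word-level timestamps from the tts or a speech recognizer would be more accurate if they are available.

```python
def subtitle_times(script, audio_seconds, words_per_caption=4):
    """Return (start, end, text) tuples, one per caption chunk."""
    words = script.split()
    if not words:
        return []
    per_word = audio_seconds / len(words)  # assumes a constant speaking rate
    captions = []
    for i in range(0, len(words), words_per_caption):
        chunk = words[i:i + words_per_caption]
        start = i * per_word
        end = (i + len(chunk)) * per_word
        captions.append((round(start, 2), round(end, 2), " ".join(chunk)))
    return captions

# hypothetical 4 second voice clip
for cap in subtitle_times("did you know that honey never goes bad", 4.0):
    print(cap)
```

each tuple could then be turned into a text clip that starts at `start` and lasts `end - start` seconds in whatever video editor ends up being used.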

    https://github.com/yerfor/GeneFace
  • What I Worked On
  • friday in the 4HL, can really feel the pull of the weekend

    July 27, 2023

    > woke up this morning

    > loss has not even come close to plateauing

    > LETS GOOOOO

    hopefully it doesn't take too long though


    need to learn what a superconductor actually is

    also need to set up SDXL

    July 26, 2023

    tried using relative attention last night, but didn't seem to work

    going to re implement by myself (still in torch)

    current implementation has too much abstraction

    http://blog.ezyang.com/2019/05/pytorch-internals/

    first embedding layer might be bottleneck

    weight initialization is probably wrong


    how come no one told me about gradient checkpointing

    now i have infinite GPU memory

    250M PARAMS😲

    450M PARAMS

    GIVE ME MORE

    July 25, 2023

    trained overnight for ~8 hours

  • generation seems okay
  • notes kinda make sense but are boring
  • time is relatively constant
  • no notes are played at the same time(chords)
  • definitely getting somewhere though


    making block size larger might have an effect on some of these

    also gonna look into relative attention like in original paper

    > generally pretty solid


    I am dumb as hell

    all I had to do was sample from the distribution, instead of taking most likely event at inference

    sounds 1000x better
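the difference in code, as a minimal stdlib sketch (not my actual inference code, which is in torch): greedy decoding always takes the highest-scoring event, sampling draws the next event in proportion to its softmax probability. the logits, function names and the `temperature` knob are made up for illustration.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities (numerically stable)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def pick_greedy(logits):
    """What i was doing before: always take the most likely event."""
    return max(range(len(logits)), key=lambda i: logits[i])

def pick_sampled(logits, temperature=1.0, rng=random):
    """The fix: draw the next event in proportion to its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# hypothetical logits over 4 possible next events
logits = [2.0, 1.5, 0.3, -1.0]
rng = random.Random(0)
greedy = [pick_greedy(logits) for _ in range(8)]
sampled = [pick_sampled(logits, rng=rng) for _ in range(8)]
print(greedy)   # the same event every time
print(sampled)  # can differ from draw to draw
```

greedy picks get stuck repeating the same thing, which is probably why the output was so boring. lowering `temperature` pushes sampling back toward greedy, raising it makes things more random.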

    July 24, 2023

    stuff to fix:

  • weigh losses differently
  • midi2tensor timing seems off
  • change midi2tensor so that only one on OR off per event
  • bigger model? may not be necessary

    July 23, 2023

    Oppenheimer was so good

    Barbie was a big letdown


    finished the model

    basically is just Karpathy’s minGPT with a custom loss function

    and inference is different obviously

    rn I pretty much have the biggest model I can run without an out-of-memory error on the gpu

    im sure I could figure a solution out though if needed (to get a bigger model)

    July 21, 2023

    todo:

  • vectorize MIDI files and shorten to multiples of input length
  • combine data to a single file
  • design model architecture
  • implement in torch
  • train
  • ???
  • profit
  • today's goal is actually just to train one iteration
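sketch of the "shorten to multiples of input length" and "combine" steps, assuming each MIDI file has already been turned into a flat list of integers. `chunk_tokens`, `combine` and `block_size` are names made up for this example, and the songs are fake data.

```python
def chunk_tokens(tokens, block_size):
    """Drop the ragged tail, then split into block_size-long pieces."""
    usable = len(tokens) - (len(tokens) % block_size)
    return [tokens[i:i + block_size] for i in range(0, usable, block_size)]

def combine(songs, block_size):
    """Chunk every song and pool the chunks into one training list."""
    data = []
    for tokens in songs:
        data.extend(chunk_tokens(tokens, block_size))
    return data

# hypothetical already-vectorized songs (just integers here)
songs = [list(range(10)), list(range(100, 107))]
data = combine(songs, block_size=4)
print(len(data))  # 3 chunks: two from the first song, one from the second
```

chunking per song before pooling means no training example straddles two different pieces, at the cost of throwing away up to `block_size - 1` tokens per file.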

    also want to read a bunch of Exit, Voice, and Loyalty – didn't read at all this morning


    maybe after piano music, try to recreate sounds of "MoogTube" playlist on spotify

  • still relatively simple, generally only 1 or 2 instruments
  • very repetitive, and pattern based
  • could probably find MIDI-type files of similar music somewhere
  • might result from finetuning of piano model

    July 20, 2023

    bro why did no one ever tell me about coffee

    played with some tts tools yesterday

  • ElevenLabs is insane, but probably too expensive

  • gonna focus on music generation first though, want to try to finish *something* before school

  • most likely will just be piano composition
  • i should build this

    https://twitter.com/0xgaut/status/1681977521129295872?s=46&t=q4dXlCcCLC0pPNRXGblygw

    but actually make it seem like texting a girl


    gpt4 is way too good

    had no idea how to write script to take midi file and convert it to a vector to use as training data

    and it instantly got it completely perfect

    i thought preparing the data was gonna be such a pain

    lets go

    July 19, 2023

    got llama-2 running last night, gonna try out 70B today since 13B was really fast

    trying to find best tts model, none of the good ones are open source though

    need to clone ScarJo's voice to be like Her


    Got two new books:

  • Exit voice and loyalty (balaji rec)
  • The beginning of infinity

    July 18, 2023

    main problem with music generation is you have to encode a bunch of stuff for a single note

  • pitch, tempo, how hard you are pressing it
  • with text, its just character after character
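one way around this, sketched under a lot of assumptions: flatten every note into a few integer tokens (velocity, note on, time shift, note off) so the music becomes one long sequence, just like characters in text. the vocabulary layout, the 10 ms step and all the names below are invented for this example. tempo is not modeled separately, timing is carried by the time-shift tokens.

```python
from dataclasses import dataclass

@dataclass
class Note:
    pitch: int      # MIDI pitch, 0-127
    velocity: int   # how hard the key is pressed, 0-127
    start: float    # seconds
    end: float      # seconds

# assumed vocabulary layout (made up for this sketch):
#   0-127    note on (pitch)
#   128-255  note off (pitch)
#   256-287  velocity, 32 bins
#   288-387  time shift, 10 ms steps, up to 1 s per token
NOTE_ON, NOTE_OFF, VELOCITY, TIME_SHIFT = 0, 128, 256, 288
STEP, MAX_STEPS = 0.01, 100

def encode(notes):
    """Flatten notes into one stream of integer tokens, ordered by time."""
    events = []
    for n in notes:
        events.append((n.start, 1, n))  # 1 = note on
        events.append((n.end, 0, n))    # 0 = note off, sorts first on ties
    events.sort(key=lambda e: (e[0], e[1]))

    tokens, now_steps = [], 0
    for time, is_on, n in events:
        target = round(time / STEP)      # quantize to the 10 ms grid
        steps = target - now_steps
        while steps > 0:                 # long gaps need several shift tokens
            shift = min(steps, MAX_STEPS)
            tokens.append(TIME_SHIFT + shift - 1)
            steps -= shift
        now_steps = target
        if is_on:
            tokens.append(VELOCITY + n.velocity // 4)
            tokens.append(NOTE_ON + n.pitch)
        else:
            tokens.append(NOTE_OFF + n.pitch)
    return tokens

# middle C for half a second, then E for a quarter second
print(encode([Note(60, 80, 0.0, 0.5), Note(64, 100, 0.5, 0.75)]))
```

after this the model only ever sees one integer at a time, so the same next-token setup used for text applies.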

    idk, will finish those papers today


    llama-2 released, gonna get that running later

    July 17, 2023

    Why hasn’t generative music caught on

  • Data is sequential just like text
  • More patterns = easier to learn
  • I’ve seen people try it with GANs before, but never with transformers


    going to try to automate as much work for internship as possible

    wondering if someone has made a tool to generate requirement docs (word docs)


    >be me

    >meet with client about changes to program

    >everything looks good, client makes small adjustments

    >come in to work today, boss wants to meet with client

    >now i have to wait, have nothing to work on

    is that how greentext works, im not really sure

    sike naw it got approved

    lets gooooo, i can work on sum


    Wondering if tiktok has an upload api

    Could create infinite stim videos

    forgot to bring airpods to work😭


    ideas of stuff to build:

  • youtube thumbnail generator
  • tiktok generator
  • homework solving (scan in pdf, answer q's w/ gpt4, text2handwriting)
  • trading bot
  • need to do something bigger
  • music editor(think ableton or logic)

  • https://developers.tiktok.com/doc/web-video-kit-with-web?enter_method=left_navigation

    also would be cool to do something in decentralized protocols

  • chess/go in something like nostr or farcaster

  • there is probably a lot of money to be made making AI girlfriends


    some interesting papers:

    https://arxiv.org/pdf/1808.03715.pdf

    https://arxiv.org/pdf/1809.04281.pdf

    July 16, 2023

    Got a sick domain for essays site

    https://essays.cool

    July 15, 2023

    testing upload to notes from site


  • from my iPhone
  • Would be cool to have macro keyboard

    Uses:

  • Go to gpt4
  • Open type.sh
  • Something for school probably
  • https://a.co/d/6eCyeRe

    need to start drinking coffee, so much untapped alpha


    Need to be working way harder

  • Need to read more and be on Twitter less

    July 14, 2023

    cool papers is shipped

    visit here:

    https://cooltechpapers.com

    lmk what you think

    July 13, 2023


    gonna try to finish and ship cool papers today

    and maybe this too


    lgtm

    ok not gonna ship but basically done

    will ship this tho

    shipped