Welcome to my personal notes!
February 19, 2024
https://www.lesswrong.com/posts/bSwdbhMP9oAWzeqsG/openai-s-sora-is-an-agent
hopefully sora paper comes out soon
February 16, 2024
lord if you're up there let these gradients flow
i am sick and tired of writing this vqvae
let my codebook learn
would be fun little project to make spanishdict for french, using llms
February 15, 2024
i need to take bigger bets on contrarian opinions i have
robotics is probably the best field to go into right now; i don't know anything about it
i dont know anything about hardware
i barely even know how electricity works
i need to maximize time spent learning important things, minimize everything else
i am assuming i know what is valuable (i have been generally correct in the past—at least in the context of school)
February 13, 2024
https://terrytao.wordpress.com/
February 8, 2024
😠 why won't my gradients flow
ok nevermind they were just scaled weird
nevermind again these gradients are not flowing
there are too many notes on this page, it is starting to act weird
need to limit to something like 250, and then maybe have a "next page" button at the bottom
just cutting off after the 1000 most recent for now though
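a minimal sketch of the 250-per-page idea, assuming the notes already sit in a list sorted newest first. `paginate`, `per_page` and `has_next` are made-up names for illustration, not the site's actual code:

```python
def paginate(notes, page, per_page=250):
    """Return (notes_on_page, has_next) for a 1-indexed page number."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    start = (page - 1) * per_page
    end = start + per_page
    # has_next tells the template whether to render a "next page" button
    return notes[start:end], end < len(notes)


# stand-in data: 1000 notes, newest first
notes = [f"note {i}" for i in range(1, 1001)]
first_page, has_next = paginate(notes, page=1)
last_page, last_has_next = paginate(notes, page=4)
```

slicing past the end of a list just returns an empty list, so an out-of-range page gives an empty page instead of an error.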
February 5, 2024
ok finally understand what a VQGAN does
am going to implement it, then add it to my normal diffusion model
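a minimal sketch of the quantization step at the core of a vq model, assuming plain python lists instead of tensors: every encoder output vector gets swapped for its nearest codebook entry. the codebook values and the `quantize` helper are made up for illustration:

```python
def quantize(vectors, codebook):
    """Replace each vector with its nearest codebook entry (squared L2 distance)."""
    indices = []
    quantized = []
    for v in vectors:
        dists = [sum((a - b) ** 2 for a, b in zip(v, code)) for code in codebook]
        idx = dists.index(min(dists))  # argmin over codebook entries
        indices.append(idx)
        quantized.append(codebook[idx])
    return indices, quantized


codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]       # 3 codes, dimension 2
encoder_out = [[0.9, 1.2], [0.1, -0.2], [-0.8, 0.7]]   # pretend encoder outputs
indices, quantized = quantize(encoder_out, codebook)
```

the argmin has no useful gradient, which is why vq-vae style implementations usually copy the decoder's gradient straight through to the encoder and train the codebook with a separate loss term.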
also for the toy autoencoder i made, i forgot to add activation and norm blocks for some reason
need to finish the jobs biography so i can start atlas shrugged
this vq encoder/decoder buggin
February 2, 2024
it works ok, not sure if it is just because of small dimensions or i need a bigger model
should be pretty simple to implement into the actual model though
my autoencoder is just a bunch of conv layers and then transposed conv layers, with simple mse
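a quick sanity check on the shapes for a conv / transposed conv stack like that, assuming a 28x28 input and kernel 4, stride 2, padding 1 (all illustrative values, not taken from these notes). the formulas are the standard output-size ones for dilation 1:

```python
def conv_out(n, kernel, stride, padding):
    """Output length of a strided convolution along one dimension."""
    return (n + 2 * padding - kernel) // stride + 1


def conv_transpose_out(n, kernel, stride, padding, output_padding=0):
    """Output length of a transposed convolution along one dimension."""
    return (n - 1) * stride - 2 * padding + kernel + output_padding


size = 28            # assumed input side length
sizes = [size]
for _ in range(2):   # encoder: two downsampling convs
    size = conv_out(size, kernel=4, stride=2, padding=1)
    sizes.append(size)
for _ in range(2):   # decoder: two upsampling transposed convs
    size = conv_transpose_out(size, kernel=4, stride=2, padding=1)
    sizes.append(size)
```

if the last entry of `sizes` doesn't match the first, the mse between input and reconstruction can't even be computed, so this is worth checking before training.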
gonna see what actual paper used now
this is the paper im referencing
https://arxiv.org/pdf/2112.10752.pdf
best thing about gpt4 is when you explain something to it so you can see if you're right or not
February 1, 2024
bought the caffeine, taurine, and l-theanine last night
apparently l-theanine has noticeable effects even when taken alone
time will tell
for supplements that "increase brain function" a lot of the literature just says it increases oxygenation
implying that oxygenation is way upstream of everything
being outside is the best supplement
https://near.blog/supplements/
going to build latent diffusion model before i do actual music model
because it seems like my images (512x1001) are way too big to do normal diffusion on
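rough arithmetic on how much a latent space shrinks a 512x1001 image, assuming the autoencoder downsamples each spatial dimension by a fixed factor. the factors 4, 8 and 16 are illustrative, nothing in these notes settles on one, and channels are ignored:

```python
def latent_shape(height, width, factor):
    """Spatial size after downsampling both dimensions by `factor` (floor division)."""
    return height // factor, width // factor


height, width = 512, 1001
pixel_positions = height * width

for factor in (4, 8, 16):
    lat_h, lat_w = latent_shape(height, width, factor)
    ratio = pixel_positions / (lat_h * lat_w)
    print(f"f={factor}: {lat_h}x{lat_w} latent, ~{ratio:.0f}x fewer positions")
```

1001 isn't divisible by any of these factors, so a real model would need to pad or crop the width first; the floor division here is a simplification.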
should be fairly straightforward, goal is to have it trained by sunday
might just grind it out tonight though
haven't done that in a while
caffeine pills haven't come in yet, so might have to hit a cheeky redbull run
first step: VAE
before i look up actual implementations, just gonna cook up what i think they will be
January 31, 2024
finally got mnist diffusion up on website
that took way too long
it is still really slow
for the actual music app, i will have to actually learn how to host models
no way that took me 10 days to actually ship
i am not working nearly enough on this
January 29, 2024
https://near.blog/leveraged-etfs/
never heard about these before
going to go vegetarian this week
January 28, 2024
saw a tweet about how you can compile cpp code into web asm
https://webassembly.org/
https://t.co/DHQd4EVcmc
January 27, 2024
recognizing complacency in yourself might be the first step, but not the most important
January 25, 2024
i hate aws
January 24, 2024
got anki on my pc
goal is to be able to watch a French movie before summer w/o subtitles
or read Le Petit Prince (this should be easier)
January 23, 2024
that is essentially the good outcome
bad outcome:
most orgs devolve into massive bureaucracies
standard of living slightly increases, but jobs become very mundane
most people are addicted to phones/entertainment a la Infinite Jest
honestly the main difference between the two is centralization
more decentralized = more people can use it how they want = free market = better for the masses
January 22, 2024
if agi is actually really close, this is what I think
short term: white collar job market gets bad
wealth gap increases massively
basic standard of living also gets way better
long term: more artists, creators
some sort of UBI
January 21, 2024
out on the other side of aws hell, lambda is too slow (probably my fault)
gonna try something new
got a jank setup running flask on ec2
way faster tho
might grind out the whole post tonight
realized my youtube intake has drastically plummeted
consumption is still good if high quality (books, some movies, some podcasts)
you can buy caffeine extract, taurine, and glucuronolactone on amazon (stimulants used in redbull)
might cook up a home brew
writing with left hand is becoming easier
got the mnist post up, model is still kinda slow
nevermind, http means it doesnt work on prod
January 20, 2024
since model is so small, it actually runs on cpu relatively fast
so i don't need expensive gpu servers :)
time to break out the good ol' lambda function image that has pytorch installed
totally forgot about the pytorch game, that was a pretty cool project i should really finish
gonna write it in a flask server before i get bogged down in aws hell
January 19, 2024
need to be working way harder on music gen
this weekend will have demo of MNIST diffusion on website
i need to get some more posts on there
i haven't shipped in months
lets goooooo
results are pretty good, gonna scale it up a lil though
wondering the best way to host this
easiest would probably be something like replicate
recap on fast:

seems like i have a case of "singularity stress" (coined by yacine, i think)
January 18, 2024
agi is near, better prepare
although idk how to do that
purpose of this generation is to take us from where we are to limitless abundance once we have agi
all white collar work is completely automated in ~10 years
and that is conservative
anything that happens solely online will be automated within 5
next big step is robotics
after that, if implemented correctly(!), abundance is achieved
it’s time to build
for a couple years though, there is going to be mass unemployment
people will flock to trades, then that will fall
building wealth now is probably the most important thing you can do
as nice as libertarianism sounds, universal basic income is probably necessary in some form
open source ai is the most important thing to be working on
massive leverage in the hands of a few companies is not going to turn out well
January 17, 2024
day 4 of fasting
feeling pretty great
yesterday was definitely worse, I felt way more tired and weak
probably am going to do one more day
January 15, 2024
isn't college where you go to become radicalized
why is this not happening
feels like i'm missing out
day 2 of fasting
tired and fairly hungry, nothing too bad yet though
January 14, 2024
day 1 of the fast
feeling good so far
best way to understand math in ml paper is just derive everything yourself
gives you way better understanding when looking at the code
January 12, 2024
before i do diffusion model for my audio images, i'll start with mnist
seriously doubt i'll be able to train model on my local gpu, since images will be order of magnitude larger than mnist
time will tell
January 11, 2024
wonder if you could apply VAEs to text models
the latent vector would then not contain information about an image, but about some text
it would be the pure distilled information, like a thought
not sure whether you could actually do this, but having language model do the "thinking" in some latent space, and then translating that into english seems interesting
this latent information would be passed to the encoder block of the transformer
so the analog is first it will think up a solution in vector space, and then articulate it into words
really cool book i just found:
https://venhance.github.io/napkin/Napkin.pdf
gonna take all notes this semester with my left hand
pretty sure by the end I’ll be totally ambidextrous
January 10, 2024
ai "devices" (humane, rabbit, etc.) are cool toy projects
if they cannot completely replace your phone, they are useless, and will be completely replaced by siri-like features on smartphones
i think the tipping point is when they start to prompt you (a la Her)
good video on diffusion models
https://www.youtube.com/watch?v=W-O7AZNzbzQ
https://arxiv.org/pdf/2006.11239.pdf
https://arxiv.org/pdf/2105.05233.pdf
https://arxiv.org/pdf/2102.09672.pdf
January 8, 2024
demucs is so fast on gpu 🤑
should be able to have all train/test data ready by tonight
definitely need to look into which kinds of architecture to use (some kind of diffusion, but the actual specifics)
may have small problem in that the beginning and the end of a song usually wont have drums
i guess i could just delete the first and last n images tho
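dropping the first and last n images is a one-line slice; a tiny sketch, with `trim_edges` as a made-up helper name and the n = 0 case handled separately because `items[0:-0]` would be empty:

```python
def trim_edges(items, n):
    """Drop the first and last n items; returns an empty list if nothing is left."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return list(items)
    return list(items[n:-n])


# stand-in for the per-song list of spectrogram images
chunks = [f"chunk_{i}" for i in range(10)]
kept = trim_edges(chunks, 2)
```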
cbtm
January 3, 2024
https://pytorch.org/audio/stable/transforms.html
https://blog.samaltman.com/advice-for-ambitious-19-year-olds
goal for today is to write script that takes single audio file, and turns it into N spectrograms that are 10 seconds long
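a minimal sketch of the chunking half of that script, assuming the audio is already loaded as a flat run of samples at a known sample rate. it only works out where each 10-second window starts and ends; each slice would then go through whatever spectrogram transform gets picked. `chunk_bounds` is a made-up name, and dropping the trailing partial window is an assumption:

```python
def chunk_bounds(num_samples, sample_rate, chunk_seconds=10.0):
    """Return (start, end) sample indices for each full chunk; the remainder is dropped."""
    chunk_len = int(sample_rate * chunk_seconds)
    if chunk_len <= 0:
        raise ValueError("chunk length must be at least one sample")
    n_chunks = num_samples // chunk_len
    return [(i * chunk_len, (i + 1) * chunk_len) for i in range(n_chunks)]


sample_rate = 44_100             # assumed CD-quality audio
num_samples = sample_rate * 35   # a pretend 35-second file
bounds = chunk_bounds(num_samples, sample_rate)
```

fixed-length windows mean every spectrogram comes out the same width, which is what an image model needs.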
seems like a useful dataset to start with/train baby model on
https://sigsep.github.io/datasets/musdb.html#musdb18-compressed-stems
on cpu, demucs runs at about 2x song duration
January 2, 2024
https://near.blog/my-favorite-links/
transcribing to midi is harder than I thought, especially for percussion
generating spectrograms with diffusion may work better
idk cbtm
once loop is generated, could then just transcribe that audio clip
so pipeline looks like this:
> get audio files
> separate into layers
> convert audio to spectrogram
> use img gen models to create new spectrograms
results from SD sound pretty good here
yeah training diffusion model on spectrogram is definitely the move
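the data-prep part of the pipeline above (get files, separate into layers, convert to spectrograms) could be chained roughly like this. everything here is a placeholder: `separate` and `to_spectrogram` stand in for a real source-separation model and a real spectrogram transform, and the stubs exist only so the sketch runs:

```python
def run_pipeline(audio_paths, separate, to_spectrogram):
    """Yield (path, layer_name, spectrogram) for every layer of every file."""
    for path in audio_paths:
        # separate() is expected to return {layer_name: audio} for one file
        for layer_name, layer_audio in separate(path).items():
            yield path, layer_name, to_spectrogram(layer_audio)


# stub stages, purely hypothetical, to show the data flow
def fake_separate(path):
    return {"drums": f"{path}:drums", "melody": f"{path}:melody"}


def fake_to_spectrogram(audio):
    return f"spec({audio})"


results = list(run_pipeline(["a.wav", "b.wav"], fake_separate, fake_to_spectrogram))
drum_specs = [spec for _, layer, spec in results if layer == "drums"]
```

filtering by layer name at the end gives the single-layer dataset that the image model would be trained on.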
January 1, 2024
first step is getting the data
datasets below are okay, but i'll probably need to get some myself
will likely need model that turns audio into midi (which has already been solved)
these models work really well for audio recording of single piano, but more complex songs w/ multiple instruments may be difficult
end goal of data collection is to have discrete groups of midi files that just contain single ~instruments (drums, lead, rhythm)
midi approach should work perfectly for drums/percussion, lead/melody may need different strategy
seems promising
nevermind it breaks down with multiple instruments
there are ways to separate instruments though, just need to find open source model
https://github.com/deezer/spleeter
pipeline now looks like this:
> get large number of audio files (mp3, wav)
> split them into track layers (voice, drums, melody)
> turn these into midi files
> train model on single type of track layer
seems to be sota oss model
demucs works but is very slow (might change when running on gpu)
problem is now that audio -> midi does not work for percussion, need to find new model
https://github.com/magenta/mt3