Welcome to my personal notes!
October 6, 2024
agent is finally ~trained
highest score ive seen is 18, pretty solid
October 5, 2024
ppo agent is not training
could be hyperparameters, i have no idea though
holup
we lit rn
ok nevermind again

how does this even happen (average reward)
October 2, 2024
use nerf to scan your room, then rearrange furniture in ar using 3d models of your actual furniture
October 1, 2024
there is just no way crypto markets are as efficient as tradfi
those are probably last words before disaster but it would be really fun to try some kind of algorithmic trading
e.g. sentiment analysis of live broadcast, liquidity fluctuations of smaller coins
the question is how much these have already been commodified
algorithmic trading on polymarket would be fun to do for vp debate, a bit late to start on that though
i have too many projects in development
hard to tell if rl model is training properly
it might be that rewards are too sparse (getting through first set of pipes is ~12 steps)
will let it run for an hour or two and come back
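if sparsity is the problem, some reward shaping might help. rough sketch of what i mean, all numbers made up and to be tuned by trial and error:

```python
def shaped_reward(passed_pipe: bool, died: bool) -> float:
    """Denser reward signal: small living bonus every step, big bonus
    per pipe, penalty on death. Values are guesses to tune."""
    r = 0.1            # survival bonus each step, so reward isn't only at pipes
    if passed_pipe:
        r += 1.0       # main objective signal
    if died:
        r -= 1.0       # discourage crashing
    return r
```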
September 29, 2024
https://karpathy.github.io/2021/06/21/blockchain/
September 27, 2024
pretty close to being done with rl agent for flappy bird
that means i can almost start getting the training data for the gamengen
probably wont be able to work on it this weekend bc of mhacks
September 26, 2024
https://x.com/thegregyang/status/1839271130231877935
September 25, 2024
i wonder how much less efficient the economy is because of people's preference for round numbers
e.g. an interest rate on some account might be 3.5%, whereas a perfectly efficient market might resolve to 3.58529%
possible arb opportunity
September 19, 2024
goal for today is to integrate simulation with gym, and look at some implementations of PPO
also really need to get rest of robot hand designed, as well as buy new servos
September 18, 2024
alright i think i finally understand PPO
September 17, 2024
arduino motors aren't gonna work, i originally bought the wrong ones (not continuous). usually you can just remove the connections to the potentiometer, but for some reason the ones i bought are soldered directly onto it, without wire
they weren't expensive tho, so not worst thing that could happen
September 16, 2024
i gotta finish this robot hand
ok built flappy bird in pygame, which should make integration with Gym easy
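the integration should just be a thin wrapper. sketch, assuming a hypothetical `FlappyGame` class with `.reset()`, `.tick(flap)` and `.frame()` methods (old-style gym step api, which is what the gymlibrary docs describe):

```python
import gym
import numpy as np
from gym import spaces

class FlappyEnv(gym.Env):
    """Gym wrapper around the pygame flappy bird.
    `FlappyGame` (.reset(), .tick(flap) -> (reward, done), .frame())
    is a made-up interface standing in for the pygame implementation."""
    def __init__(self, game):
        self.game = game
        self.action_space = spaces.Discrete(2)   # 0 = do nothing, 1 = flap
        self.observation_space = spaces.Box(0, 255, shape=(84, 84, 3), dtype=np.uint8)

    def reset(self):
        self.game.reset()
        return self.game.frame()

    def step(self, action):
        reward, done = self.game.tick(flap=(action == 1))
        return self.game.frame(), reward, done, {}
```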
https://www.gymlibrary.dev/
September 15, 2024
https://ibionics.ece.ncsu.edu/assets/Publications/insect_machine_interface_based_neurocybernetics.pdf
September 14, 2024
this page is amazing:
https://spinningup.openai.com/en/latest/index.html
re: flappy bird agent
if my only goal was to build the flappy bird agent, and not to do the eventual gamengen, it would probably make sense for me to just build my own version of flappy bird
then i could train the model way faster, since i can kinda remove the time component
cant really do that unless my version of flappy bird looks nice (it actually looks like the real game), since if i am using the frames for training data, i would like my eventual simulated game to actually look nice
and not just black rectangles on a white screen
i guess that wouldnt be very hard though, its not like the game is hard to build, i just need to get the styling perfect
cbtm
September 13, 2024
https://gamegen-o.github.io/
September 12, 2024
agi just dropped
https://openai.com/index/learning-to-reason-with-llms/
better than deepmind proof/geometry models?!
ppo is actually pretty complicated
i know a lot less about rl than i thought
id like to start writing down everything i eat
its probably true that diet is way more important than exercise re: cognition, and infinitely more important than stuff like supplements
idk why i was so into longevity/health supplements (l-theanine, taurine, etc.) when i made 0 changes to diet
the most i ever did was fast, which was fun but not sustainable
also need to track calories, as i have lost ~7 pounds since being at school (not good)
September 11, 2024
going to recreate gamengen on flappy bird
first step is to build rl agent that plays by itself
need rl agent in order to get sufficient training data for actual diffusion model
agent needs to mimic human play though, so goal is not actually to train perfect flappy bird model, but a perfectly average one
in paper, rewards seem fairly arbitrary, so i guess i will have to just do trial/error
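one idea for the "perfectly average" agent: train a normal agent, then inject random actions at inference time. sketch (epsilon is a made-up number to tune against real human score distributions):

```python
import random

def humanlike_action(policy_action: int, epsilon: float = 0.1) -> int:
    """Make a trained flappy agent play like an average human by
    occasionally replacing its action with a random one."""
    if random.random() < epsilon:
        return random.choice([0, 1])   # 0 = no-op, 1 = flap
    return policy_action
```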
this is what they used for agent
https://arxiv.org/pdf/1707.06347
September 9, 2024
finger is done, rolling joint works well but the elastic cord wont be strong enough to actually hold stuff
probably fine though as goal for this is not to be useful, i just want to mimic my hand from webcam
i want to write first post about gamengen, but might be better to write first one on something i am already very familiar with
og diffusion model paper for example
September 8, 2024
there is probably a big market for a newsletter/blog that actively covers ml research in a technical way
like i still don’t have a good way of finding cool papers or hearing about interesting news outside of twitter
this actually cbtm, just making a substack where i do summaries of papers
goal would mostly be to keep me consistently reading stuff, but would be fun to try to grow it
September 6, 2024
Gaussian splat
you can take the GameNGen architecture and simulate anything that has outputs dependent on both time and some input at a given time
for example you could simulate the OS of a computer
also seems like it could be interesting in music (given that music gen uses some kind of diffusion)
September 4, 2024
JAX cbtm
https://scottaaronson.blog/?p=8269
September 3, 2024
ok 3d model is done
why do people say CAD software is hard to use, this was very easy
September 2, 2024
generative gaming
will work on this after robot hand
ideally prints would be done this week, depends on how soon i can reserve printer
August 30, 2024
ok first step is to build single finger (2 joints but only 1 servo)
rolling contact joint seems like the move, though it may be more complicated when i try to do 1 servo for each joint
single finger is very simple theoretically, but i've never 3d printed anything, so time will tell
all i should need are the three printed sections of the finger, elastic cord, small servo, perhaps another arduino
goal for next ~2 months is to have hand fully built and controllable via webcam that reflects movement of my hand
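for the webcam → servo part, the core is just geometry on hand landmarks (e.g. from something like mediapipe). sketch of the pure-math piece, function names are mine:

```python
import math

def joint_angle(a, b, c):
    """Angle at landmark b (degrees) given three 2D/3D landmarks along a
    finger, e.g. from a hand-tracking model. Pure geometry, no camera."""
    ba = [ai - bi for ai, bi in zip(a, b)]
    bc = [ci - bi for ci, bi in zip(c, b)]
    dot = sum(x * y for x, y in zip(ba, bc))
    na = math.sqrt(sum(x * x for x in ba))
    nc = math.sqrt(sum(x * x for x in bc))
    return math.degrees(math.acos(dot / (na * nc)))

def angle_to_servo(angle_deg, lo=0, hi=180):
    """Clamp a joint angle into the servo's usable range before sending
    it over serial to the arduino."""
    return max(lo, min(hi, angle_deg))
```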
August 29, 2024
https://huggingface.co/papers/2408.14837
August 27, 2024
https://www.youtube.com/watch?v=EA9mRS_-SC0
rolling contact joint
August 26, 2024
https://github.com/NousResearch/DisTrO/tree/main
August 19, 2024
on “a random walk down wall street”
haven’t finished it yet, but so far i have found this to be kinda ignorant
Malkiel reduces all “technical analysis” to either charts or stupid indicators like which team wins the super bowl
he acknowledges that there are some strategies that will do well for a short period until others figure it out, which he uses as evidence for why they don’t work over time
obviously a given strategy won’t work forever (alpha is temporary, it WILL become commoditized), but that does not make it less valuable??
and yeah its probably true that ordinary Joe’s strategy about the correlation of distinct corporate bonds won’t work, but it’s not because the market is a “random walk”, it’s because citadel figured that strategy out 10 years ago and has extracted all the alpha
i agree with the premise of the book (just buy indexes), but the argument Malkiel makes is wrong
it’s not that specific strategies don’t work, it’s that the strategies of retail investors are probably orders of magnitude less complicated and advanced than some prop shop
maybe malkiel talks more about this later in the book, but so far i am not a huge fan
becoming a dog at poker might be the move
at this point there is probably no alpha in gpt wrappers
image gen models however...
August 18, 2024
dominion by tom holland was basically articulated 12 years before in this:
https://www.unqualified-reservations.org/2007/06/ultracalvinist-hypothesis-in/
and moldbug likely did not come up with it, i wonder why its not that mainstream
August 17, 2024
pre-ordering gray mirror paperback might be the move
https://passage.press/products/gm-disturbance
this robot hand aint gonna build itself
August 15, 2024
https://arxiv.org/abs/2305.18290
August 11, 2024
https://arxiv.org/pdf/2203.09893
August 6, 2024
benchmarks✅
moving model onto SAELens seems to be done, need to do a bit more testing though
August 5, 2024
am having a surprisingly difficult time doing the benchmarks
August 4, 2024
today:
first big goal is to do midi+prompt->audio, where prompt is something like "electric guitar" or "80s synth"
that combined with pitch detection on audio from humming would be really cool
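pitch detection on humming doesn't need anything fancy to prototype. crude autocorrelation sketch (a real version would use YIN, this is just the naive form of the same idea):

```python
import numpy as np

def detect_pitch(signal, sr, fmin=80.0, fmax=1000.0):
    """Naive autocorrelation pitch detector: the lag of the biggest
    autocorrelation peak inside the plausible period range is the period."""
    signal = signal - signal.mean()
    corr = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)   # lag bounds from freq bounds
    lag = lo + np.argmax(corr[lo:hi])
    return sr / lag
```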
August 3, 2024
need to build robot hand
August 2, 2024
time to revisit music gen
http://audition.ens.fr/adc/pdf/2002_JASA_YIN.pdf
control/style vectors are basically same thing as what i did with SAEs
July 30, 2024
https://www.neuronpedia.org/
July 29, 2024
finally got nuclear site up, removed all text, just left the charts
i think its better that way
https://www.endnrc.org/
July 28, 2024
https://explained.ai/matrix-calculus/
July 26, 2024
might spend this week doing some lighter stuff
maybe the tts app i've wanted to build for a while, should be pretty easy
July 25, 2024
today i am going to build a little cli to make inference with SAE easier (ability to see which features are firing, manually activate them, etc.)
i think making a frontend you can run locally using the eleuther sae might be the move
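the cli could just be argparse subcommands. sketch, all names made up:

```python
import argparse

def build_parser():
    """Sketch of the little SAE inference CLI: one subcommand to list
    top firing features for a prompt, one to clamp a feature manually."""
    p = argparse.ArgumentParser(prog="sae-cli")
    sub = p.add_subparsers(dest="cmd", required=True)

    top = sub.add_parser("top", help="show top firing features for a prompt")
    top.add_argument("prompt")
    top.add_argument("-k", type=int, default=10, help="how many features to show")

    act = sub.add_parser("activate", help="clamp a feature during generation")
    act.add_argument("feature", type=int)
    act.add_argument("--scale", type=float, default=5.0)
    act.add_argument("prompt")
    return p
```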
July 21, 2024
new blog post is up:
https://www.tylercosgrove.com/blog/exploring-sae/
July 19, 2024
now have proper chat set up, but i really need a more sophisticated feature finder
the only really solid one i have is the pacific ocean
July 18, 2024
re: trying to find golden gate feature
model isn't super big, so i doubt i'll be able to find one just for the golden gate
however, i have found a "pacific ocean" feature, and a "cities" feature
if i find a "bridge" neuron, and activate them all, i think it will work
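the "activate them all" part is just adding decoder directions into the residual stream. toy numpy sketch (random decoder standing in for the real SAE, feature ids made up):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_features = 16, 64
decoder = rng.normal(size=(n_features, d_model))   # toy SAE decoder, rows = feature directions

def steer(resid, feature_ids, scale=5.0):
    """Add several feature directions to the residual activation at once,
    e.g. "pacific ocean" + "cities" + "bridge" to approximate golden gate."""
    for i in feature_ids:
        resid = resid + scale * decoder[i]
    return resid

steered = steer(np.zeros(d_model), [3, 17, 42])
```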

TODO:
July 17, 2024
ok, reconstructions are alright, but after ~two sentences model just repeats same thing over and over
> The Golden Bridge is a bridge that connects Los Angeles and San Francisco, California. It is one of the most famous brons in the United States and is considered a symbol of the American West. The bridge is located in the San Francisco area and is considered a symbol of the American West. The bridge is located in
i think there is probably something wrong with how i am doing inference, but i don't know what
found it, i forgot i had changed the target layer to 16, i was replacing layer 24
model recon is perfect now!!!!
found very rough Metro feature
> USER: What do you know a lot about?
> MODEL: Here are some things I know a lot about:\n Metro: The Metro is a system of underground transportation in cities, which uses trains to carry passengers.
i am so hype, model finally works!!!!!!!
i need to find the "golden gate bridge" feature
July 16, 2024
ok now my % dead neurons curve is just buggin

so ugly
gonna let it keep cooking though, neither mse loss nor the auxk loss have stalled out
going to setup wandb, i am sick and tired of tensorboard
i guess it is trending in the right direction though

i cant really tell what these big drops come from, perhaps my data is still not shuffled enough??
July 15, 2024
not really sure what to do at this point
reconstruction loss stalls out after about a day, and the aux loss seems to do little to prevent dead neurons
i am pretty sure that the only difference between my implementation and openai's is that the threshold for dead neurons is much lower?
i am at 100k steps, where openai used 10M
although i am unsure if their metric was training steps or actual tokens, because I would actually be at 25.6M (batch size is 256)
holup, number of dead neurons is decreasing???
maybe small changes yesterday had an effect, too soon to call though
yeah didn't work. now retrying with the threshold at 10M TOKENS, which means only ~39k steps instead of 100k
this might be the cause of why, once auxk kicks in, there are already so many dead neurons (i am starting auxk too late, as opposed to too early)
if this doesn't work, i wrote up an email to send to paper author as a last ditch effort
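for my own reference, my understanding of the auxk idea in numpy (shapes and names are mine, not openai's code): let the top-k_aux *dead* latents try to reconstruct the main reconstruction error, so neurons that never fire still get gradient:

```python
import numpy as np

def auxk_loss(pre_acts, error, decoder, dead_mask, k_aux=32):
    """Aux loss sketch. pre_acts: (n_latents,) pre-activation values,
    error: (d_model,) residual of the main reconstruction,
    decoder: (n_latents, d_model), dead_mask: (n_latents,) bool."""
    scores = np.where(dead_mask, pre_acts, -np.inf)  # only dead latents compete
    top = np.argsort(scores)[-k_aux:]                # top-k_aux dead latents
    acts = np.maximum(pre_acts[top], 0.0)            # relu their pre-acts
    recon = acts @ decoder[top]                      # reconstruct the error
    return np.mean((error - recon) ** 2)
```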
July 12, 2024
model looks pretty good now, very few dead neurons and activation frequency is very low (sparsity!)
will need to write new dataloader to look at features, since my current one doesnt save the actual tokens
actually there may be a lot of dead neurons
also, reconstruction actually isn't very good, after ~16 tokens it becomes terrible
alright i've cleaned everything up, if model doesn't work now idk what im gonna do
just gonna let it train all through tomorrow too
July 11, 2024
now model won't converge
reconstruction is really terrible:
now i just need to make sure features are actually sparse (not sure how they wouldn't be)
back to training😔
looking back, it was strange that there were 0 dead neurons
July 10, 2024
so i basically have to wait for a while to see if it works or not
loss is definitely smoother after correcting the data shuffling
loss curve still has weird artifacts

i think it still has to do with shuffling, as some text examples are really long, so even with shuffling lots of activations might contain similar features?
every large uptick in loss coincides with new set of examples
changed it to use 1/5 of each example, so shuffle should be noticeably better
ideally, each activation would be from a totally different example at a totally different time step, but that would require either a ton of time spent doing ~inference on the base model or an insane amount of storage, neither of which i have
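a middle ground between those extremes is a big in-memory shuffle buffer: fill a pool of activations drawn from many examples, then each training sample swaps a random pool element for the next incoming one. sketch, capacity is a made-up number bounded by RAM:

```python
import random

class ShuffleBuffer:
    """Reservoir-style shuffle buffer for SAE activations: keeps a large
    pool so consecutive training samples come from different source
    examples, without storing or re-running the whole dataset."""
    def __init__(self, capacity=100_000, seed=0):
        self.pool, self.capacity = [], capacity
        self.rng = random.Random(seed)

    def push(self, act):
        """Add one activation; once the pool is full, return a random
        pooled activation to train on and keep the new one in its place."""
        if len(self.pool) < self.capacity:
            self.pool.append(act)
            return None          # still warming up
        i = self.rng.randrange(self.capacity)
        out, self.pool[i] = self.pool[i], act
        return out
```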