Welcome to my personal notes!

agent is finally ~trained

highest score ive seen is 18, pretty solid

October 6, 2024


https://github.com/xjdr-alt/entropix

October 5, 2024

ppo agent is not training

could be hyperparameters, i have no idea though

holup

we lit rn


ok nevermind again

how does this even happen (average reward)

October 2, 2024

use nerf to scan your room, then rearrange furniture in ar using 3d models of your actual furniture


October 1, 2024

there is just no way crypto markets are as efficient as tradfi

those are probably last words before disaster but it would be really fun to try some kind of algorithmic trading

e.g. sentiment analysis of live broadcast, liquidity fluctuations of smaller coins

the question is how much these have already been commodified

algorithmic trading on polymarket would be fun to do for vp debate, a bit late to start on that though


i have too many projects in development

  • robotic hand (finally got new servos, need to print fingers + base)
  • flappy bird gamengen (not even done with rl part)
  • crypto stuff (uniswap replication, among other ideas)

    hard to tell if rl model is training properly

    it might be that rewards are too sparse (getting through first set of pipes is ~12 steps)

    will let it run for an hour or two and come back
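if the rewards really are too sparse, one standard trick is shaping: add a small dense survival bonus so there's signal before the first pipe. a rough sketch, all numbers made up and would need trial/error tuning:

```python
# hypothetical reward shaping for the flappy bird env;
# these constants are guesses, not what the paper used
def shaped_reward(passed_pipe: bool, died: bool) -> float:
    if died:
        return -1.0   # terminal penalty
    if passed_pipe:
        return 1.0    # the sparse "real" reward
    return 0.1        # small dense survival bonus so early learning has signal
```

one risk: the agent could learn to just survive for the bonus, so it has to stay small relative to the pipe reward.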

    September 29, 2024

    https://karpathy.github.io/2021/06/21/blockchain/

    September 27, 2024

    pretty close to being done with rl agent for flappy bird

    that means i can almost start getting the training data for the gamengen

    probably wont be able to work on it this weekend bc of mhacks

    September 26, 2024

    https://x.com/thegregyang/status/1839271130231877935

    September 25, 2024

    i wonder how much less efficient the economy is because of people's preference for round numbers

    e.g. an interest rate on some account might be 3.5%, whereas a perfectly efficient market might resolve to 3.58529%

    possible arb opportunity

    September 19, 2024

    goal for today is to integrate simulation with gym, and look at some implementations of PPO

    also really need to get rest of robot hand designed, as well as buy new servos

    September 18, 2024

    alright i think i finally understand PPO
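the core of it, as i understand it, is just the clipped surrogate objective (eps=0.2 is the common default):

```python
def ppo_clip_objective(ratio, advantage, eps=0.2):
    """per-sample clipped surrogate from the PPO paper; ratio is
    pi_new(a|s) / pi_old(a|s). average over a batch and maximize."""
    unclipped = ratio * advantage
    clipped = max(min(ratio, 1 + eps), 1 - eps) * advantage
    # taking the min makes the objective pessimistic: large policy
    # updates get no extra credit in either direction
    return min(unclipped, clipped)
```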

    September 17, 2024

    arduino servos aren't gonna work. i originally bought the wrong ones (not continuous), and usually you can just remove the connections to the potentiometer, but for some reason the ones i bought are soldered directly onto it, without wire

    they weren't expensive tho, so not worst thing that could happen

    September 16, 2024


    i gotta finish this robot hand


    ok built flappy bird in pygame, which should make integration with Gym easy

    https://www.gymlibrary.dev/
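the Gym-facing interface is basically just reset/step returning (obs, reward, done, info). a toy sketch of the shape, skipping pygame entirely, with made-up physics constants:

```python
import random

class MiniFlappyEnv:
    """sketch of a gym-style env (reset/step API); all numbers are guesses,
    not the real game's physics."""
    def __init__(self):
        self.gravity, self.flap_impulse = 0.5, -4.0
        self.reset()

    def reset(self):
        self.y, self.vy, self.t = 50.0, 0.0, 0
        self.gap_center = random.uniform(30, 70)  # where the next pipe gap sits
        return self._obs()

    def _obs(self):
        return (self.y, self.vy, self.gap_center)

    def step(self, action):
        # action 1 = flap, anything else = fall
        self.vy = self.flap_impulse if action == 1 else self.vy + self.gravity
        self.y += self.vy
        self.t += 1
        done = not (0 <= self.y <= 100)   # off-screen = dead
        reward = -1.0 if done else 0.1
        return self._obs(), reward, done, {}
```

usage would be the usual loop: `obs = env.reset()`, then `obs, r, done, info = env.step(action)` until done.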

    September 15, 2024

    https://ibionics.ece.ncsu.edu/assets/Publications/insect_machine_interface_based_neurocybernetics.pdf

    September 14, 2024

    this page is amazing:

    https://spinningup.openai.com/en/latest/index.html

    re: flappy bird agent

    if my only goal was to build the flappy bird agent, and not to do the eventual gamengen, it would probably make sense for me to just build my own version of flappy bird

    then i could train the model way faster, since i can kinda remove the time component

    cant really do that unless my version of flappy bird looks nice (it actually looks like the real game), since if i am using the frames for training data, i would like my eventual simulated game to actually look nice

    and not just black rectangles on a white screen

    i guess that wouldnt be very hard though, its not like the game is hard to build, i just need to get the styling perfect

    cbtm

    September 13, 2024

    https://gamegen-o.github.io/

    September 12, 2024

    agi just dropped

    https://openai.com/index/learning-to-reason-with-llms/

    better than deepmind proof/geometry models?!


    https://arxiv.org/abs/2401.08967

    ppo is actually pretty complicated

    i know a lot less about rl than i thought


    id like to start writing down everything i eat

    its probably true that diet is way more important than exercise re: cognition, and infinitely more important than stuff like supplements

    idk why i was so into longevity/health supplements (l-theanine, taurine, etc.) when i made 0 changes to diet

    the most i ever did was fast, which was fun but not sustainable

    also need to track calories, as i have lost ~7 pounds since being at school (not good)

    September 11, 2024

    going to recreate gamengen on flappy bird

    first step is to build rl agent that plays by itself

    need rl agent in order to get sufficient training data for actual diffusion model


    agent needs to mimic human play though, so goal is not actually to train perfect flappy bird model, but a perfectly average one

    in paper, rewards seem fairly arbitrary, so i guess i will have to just do trial/error
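one cheap way to get "perfectly average" play instead of perfect play: train normally, then mix in random actions when collecting frames. eps here is a made-up knob i'd tune until the score distribution looks human:

```python
import random

def humanlike_action(policy_action, n_actions=2, eps=0.15):
    """with probability eps, ignore the trained policy and act randomly.
    eps is an assumption, not anything from the gamengen paper."""
    if random.random() < eps:
        return random.randrange(n_actions)
    return policy_action
```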


    this is what they used for agent

    https://arxiv.org/pdf/1707.06347
    https://karpathy.github.io/2016/05/31/rl/

    September 9, 2024

    finger is done, rolling joint works well but the elastic cord wont be strong enough to actually hold stuff

    probably fine though as goal for this is not to be useful, i just want to mimic my hand from webcam


    i want to write first post about gamengen, but might be better to write first one on something i am already very familiar with

    og diffusion model paper for example

    September 8, 2024

    there is probably a big market for a newsletter/blog that actively covers ml research in a technical way

    like i still don’t have a good way of finding cool papers or hearing about interesting news outside of twitter


    this actually cbtm, just making a substack where i do summaries of papers

    goal would mostly be to keep me consistently reading stuff, but would be fun to try to grow it

    September 6, 2024

    Gaussian splat


    you can take the GameNGen architecture and simulate anything that has outputs dependent on both time and some input at a given time

    for example you could simulate the OS of a computer

    also seems like it could be interesting in music (given that music gen uses some kind of diffusion)

    September 4, 2024

    JAX cbtm

    https://scottaaronson.blog/?p=8269

    September 3, 2024

    ok 3d model is done

    why do people say CAD software is hard to use, this was very easy

    September 2, 2024

    generative gaming

    will work on this after robot hand

    ideally prints would be done this week, depends on how soon i can reserve printer

    August 30, 2024

    ok first step is to build single finger (2 joints but only 1 servo)

    rolling contact joint seems like the move, though it may be more complicated when i try to do 1 servo for each joint

    single finger is very simple theoretically, but i've never 3d printed anything, so time will tell

    all i should need are the three printed sections of the finger, elastic cord, small servo, perhaps another arduino


    goal for next ~2 months is to have hand fully built and controllable via webcam that reflects movement of my hand

    August 29, 2024

    https://huggingface.co/papers/2408.14837

    August 27, 2024

    https://www.youtube.com/watch?v=EA9mRS_-SC0

    rolling contact joint

    August 26, 2024

    https://github.com/NousResearch/DisTrO/tree/main

    August 19, 2024

    on “a random walk down wall street”

    haven’t finished it yet, but so far i have found this to be kinda ignorant

    Malkiel reduces all “technical analysis” to either charts or stupid indicators like which team wins the super bowl

    he acknowledges that there are some strategies that will do well for a short period until others figure it out, which he uses as evidence for why they don’t work over time

    obviously a given strategy won’t work forever (alpha is temporary, it WILL become commoditized), but that does not make it less valuable??

    and yeah its probably true that ordinary Joe’s strategy about the correlation of distinct corporate bonds won’t work, but it’s not because the market is a “random walk”, it’s because citadel figured that strategy out 10 years ago and has extracted all the alpha

    i agree with the premise of the book (just buy indexes), but the argument Malkiel makes is wrong

    it’s not that specific strategies don’t work, it’s that the strategies of retail investors are probably orders of magnitude less complicated and advanced than some prop shop

    maybe malkiel talks more about this later in the book, but so far i am not a huge fan


    becoming a dog at poker might be the move


    at this point there is probably no alpha in gpt wrappers

    image gen models however...

    August 18, 2024

    dominion by tom holland was basically articulated 12 years before in this:

    https://www.unqualified-reservations.org/2007/06/ultracalvinist-hypothesis-in/

    and moldbug likely did not come up with it, i wonder why its not that mainstream

    August 17, 2024

    pre-ordering gray mirror paperback might be the move

    https://passage.press/products/gm-disturbance

    this robot hand aint gonna build itself

    August 15, 2024

    https://arxiv.org/abs/2305.18290

    August 11, 2024

    https://arxiv.org/pdf/2203.09893

    August 6, 2024

    benchmarks✅

    moving model onto SAELens seems to be done, need to do a bit more testing though

    August 5, 2024

    am having a surprisingly difficult time doing the benchmarks


    https://huggingface.co/CAMB-AI/MARS5-TTS

    August 4, 2024

    today:

  • run benchmarks on SAE
  • continue work on TTS app
  • find papers/resources for music gen
  • finish setting up new laptop
  • look into how much training a SAE on all layers via cloud compute would cost

    https://www.youtube.com/watch?v=Cr-5meLKOIo

    first big goal is to do midi+prompt->audio, where prompt is something like "electric guitar" or "80s synth"

    that combined with pitch detection on audio from humming would be really cool

    August 3, 2024

    need to build robot hand

    August 2, 2024

    time to revisit music gen

    http://audition.ens.fr/adc/pdf/2002_JASA_YIN.pdf
    https://arxiv.org/abs/2402.01618v1

    control/style vectors are basically same thing as what i did with SAEs

    July 30, 2024

    https://www.neuronpedia.org/

    July 29, 2024

    finally got nuclear site up, removed all text, just left the charts

    i think its better that way

    https://www.endnrc.org/
    https://machinelearning.apple.com/papers/apple_intelligence_foundation_language_models.pdf

    July 28, 2024

    https://explained.ai/matrix-calculus/

    July 26, 2024

    might spend this week doing some lighter stuff

    maybe the tts app i've wanted to build for a while, should be pretty easy

    July 25, 2024

    today i am going to build a little cli to make inference with SAE easier (ability to see which features are firing, manually activate them, etc.)


    i think making a frontend you can run locally using the eleuther sae might be the move

    July 21, 2024

    new blog post is up:

    https://www.tylercosgrove.com/blog/exploring-sae/

    July 19, 2024

    now have proper chat set up, but i really need a more sophisticated feature finder

    the only really solid one i have is the pacific ocean

    July 18, 2024

    re: trying to find golden gate feature

    model isn't super big, so i doubt i'll be able to find one just for the golden gate

    however, i have found a "pacific ocean" feature, and a "cities" feature

    if i find a "bridge" neuron, and activate them all, i think it will work
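activating them all would just mean adding the decoder direction of each chosen feature into the residual activation during inference. a minimal sketch, where the strength and feature names are made up:

```python
def steer(hidden, decoder_rows, features, strength=5.0):
    """add the SAE decoder direction of each chosen feature to the
    residual activation (the 'golden gate' trick). strength is a guess
    to be tuned until the model gets obsessed without breaking."""
    out = list(hidden)
    for f in features:
        for i, w in enumerate(decoder_rows[f]):
            out[i] += strength * w
    return out
```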


    TODO:

  • upload model to huggingface
  • find some cool features
  • ideally would make some kind of interactive web app, but might use Eleuther's Llama SAE to be more replicable (also theirs is probably better ngl)
    July 17, 2024

    ok, reconstructions are alright, but after ~two sentences model just repeats same thing over and over

    > The Golden Bridge is a bridge that connects Los Angeles and San Francisco, California. It is one of the most famous brons in the United States and is considered a symbol of the American West. The bridge is located in the San Francisco area and is considered a symbol of the American West. The bridge is located in

    i think there is probably something wrong with how i am doing inference, but i don't know what


    found it, i forgot i had changed the target layer to 16, i was replacing layer 24

    model recon is perfect now!!!!

    LGTM

    found very rough Metro feature

    > USER: What do you know a lot about?

    > MODEL: Here are some things I know a lot about:\n Metro: The Metro is a system of underground transportation in cities, which uses trains to carry passengers.


    i am so hype, model finally works!!!!!!!

    i need to find the "golden gate bridge" feature

    July 16, 2024

    ok now my % dead neurons curve is just buggin

    so ugly

    gonna let it keep cooking though, neither mse loss nor the auxk loss have stalled out


    going to setup wandb, i am sick and tired of tensorboard


    i guess it is trending in the right direction though

    i cant really tell where these big drops come from, perhaps my data is still not shuffled enough??

    July 15, 2024

    not really sure what to do at this point

    reconstruction loss stalls out after about a day, and the aux loss seems to do little to prevent dead neurons

    i am pretty sure that the only difference between my implementation and openai's is that my threshold for dead neurons is much lower?

    i am at 100k steps, where openai used 10M

    although i am unsure if their metric was training steps or actual tokens, because i would actually be at 25.6M (batch size is 256)


    holup, number of dead neurons is decreasing???

    maybe small changes yesterday had an effect, too soon to call though


    yeah didn't work. now retrying to actually be 10M TOKENS, which means only ~39k steps instead of 100k

    this might explain why, once auxk kicks in, there are already so many dead neurons (i am starting auxk too late, as opposed to too early)

    if this doesn't work, i wrote up an email to send to paper author as a last ditch effort
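for reference, my mental model of the objective (as i understand the openai topk-SAE setup, so treat the details, especially aux_coef, as assumptions): the main MSE plus an auxiliary term where the top dead latents try to reconstruct the residual error.

```python
def sae_loss(x, x_hat, x_hat_aux, aux_coef=1 / 32):
    """sketch of a topk-SAE objective: main reconstruction MSE plus an
    auxiliary term where dead latents reconstruct what the live latents
    missed. aux_coef is a guess, not necessarily the paper's value."""
    mse = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    err = [a - b for a, b in zip(x, x_hat)]        # residual error
    aux = sum((e - c) ** 2 for e, c in zip(err, x_hat_aux)) / len(x)
    return mse + aux_coef * aux
```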

    July 12, 2024

    model looks pretty good now, very few dead neurons and activation frequency is very low (sparsity!)

    will need to write new dataloader to look at features, since my current one doesnt save the actual tokens


    actually there may be a lot of dead neurons

    also, reconstruction actually isn't very good, after ~16 tokens it becomes terrible


    alright i've cleaned everything up, if model doesn't work now idk what im gonna do

    just gonna let it train all through tomorrow too

    July 11, 2024

    now model won't converge

    reconstruction is really terrible:

  • base model: "Cars, also known as automobiles, are wheeled vehicles used for transportation. They are a common means of transport for..."
  • using sae reconstruction: "Cars, also known in and' the the, cars are cars. Cars are a car which cars cars cars cars cars cars..."
    wait nevermind forgot to get rid of topk

  • REAL sae reconstruction: "Cars, or automobiles, are vehicles primarily designed for transporting people and goods, and they are a major means of..."

    now i just need to make sure features are actually sparse (not sure how they wouldn't be)


    features are not even close to sparse, i think topk activation does not work correctly😭

    back to training😔

    looking back, it was strange that there were 0 dead neurons
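for the record, the behavior i want from topk, per sample: keep the k largest latents, zero everything else. if this gets applied over the wrong axis (e.g. across the batch instead of across latents), nothing ends up sparse, which would match the symptom:

```python
def topk_activation(latents, k):
    """per-sample topk: keep the k largest pre-activations, zero the rest."""
    idx = sorted(range(len(latents)), key=lambda i: latents[i], reverse=True)[:k]
    keep = set(idx)
    return [v if i in keep else 0.0 for i, v in enumerate(latents)]
```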

    July 10, 2024

    so i basically have to wait for a while to see if it works or not

    loss is definitely smoother after correcting the data shuffling


    loss curve still has weird artifacts

    i think it still has to do with shuffling, as some text examples are really long, so even with shuffling lots of activations might contain similar features?

    every large uptick in loss coincides with new set of examples


    changed it to use 1/5 of each example, so shuffle should be noticeably better

    ideally, each activation would be from a totally different example at a totally different time step, but that would require either a ton of time spent doing ~inference on the base model or an insane amount of storage, neither of which i have
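the compromise looks something like a shuffle buffer: pull a fraction of each example's activations into a big pool, shuffle, and emit batches. the buffer_size/frac numbers here are made up, and it's still not a perfect shuffle since one example's activations can share a batch:

```python
import random

def shuffled_activation_batches(examples, get_acts,
                                buffer_size=4096, batch_size=256, frac=0.2):
    """rough sketch of a shuffle buffer over activations. get_acts is a
    hypothetical hook that runs the base model and returns activations."""
    buffer = []
    for ex in examples:
        acts = get_acts(ex)
        # keep only a fraction of each example so no single example dominates
        buffer.extend(acts[: max(1, int(len(acts) * frac))])
        if len(buffer) >= buffer_size:
            random.shuffle(buffer)
            while len(buffer) >= batch_size:
                yield buffer[:batch_size]
                buffer = buffer[batch_size:]
    # flush whatever is left at the end
    random.shuffle(buffer)
    while len(buffer) >= batch_size:
        yield buffer[:batch_size]
        buffer = buffer[batch_size:]
```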