markusMB 6 hours ago [-]
Beautiful illustrations.
I find 'playing' is just the free and motivated version of 'exploration'.
One thought on your nicely illustrated "key observation [is] that neural networks tend to place features along directions": my guess is that the neural net was TOLD to behave that way by the choice of, e.g., a cosine loss?
archermarks 4 hours ago [-]
Nice article! The generated images make me so nostalgic for the early days of AI image generation. DeepDream and others had such uncanny, interesting generations.
vintermann 2 hours ago [-]
Yeah, generative AI used to be wild, alien creativity and not something that made art kids furious.
I wonder if models can be trained for "high-temperature" purposes. I'd rather have a model that can surprise me than one that predictably produces generic, mediocre results. I mean, you can run them at high temperature of course, but they don't seem to be optimized for that.
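For anyone unsure what "running at high temperature" means mechanically, here is a minimal sketch, assuming the usual softmax-with-temperature formulation. The function name and the example scores are made up for illustration and are not taken from any particular model or library.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities, scaled by a sampling temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate tokens.
logits = [2.0, 1.0, 0.0]
cold = softmax_with_temperature(logits, temperature=0.5)  # sharper distribution
hot = softmax_with_temperature(logits, temperature=2.0)   # flatter distribution
```

Raising the temperature only flattens the distribution the model already learned, so unlikely options get picked more often. It does not change what the model was trained to prefer, which is roughly the distinction being drawn above.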
RealityVoid 6 hours ago [-]
For some reason, the uncanniness of the feature pictures is deeply unsettling for me. It just stirs intense unease. A bit amusing, to be honest.
joaquincabezas 2 hours ago [-]
This article is very well structured and provides just the right amount of details for non-practitioners to enjoy it.
Mechanistic interpretability is a fun topic to "play with" (good title there). I recommend watching videos featuring Neel Nanda or Chris Olah.
jcattle 8 hours ago [-]
Very nice visualizations, thanks for that!
One thing I still struggle with in my head is how these vision embeddings can then be used to give LLMs eyes.
Because you somehow need a giant training set which describes images in natural language, no? Is that actually how it works, or is there some smart trick so you don't need to pay labellers a bunch of money to look at pictures and describe them?
dilyevsky 7 hours ago [-]
> Because you somehow need a giant training set which describes images in natural language, no?
That's definitely one way - they train a text encoder together with an image encoder on a labelled set of images. WL & 3b1b made a nice video on it: https://www.youtube.com/watch?v=iv-5mZ_9CPY
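To make "train a text encoder together with an image encoder" a bit more concrete, here is a rough sketch of a CLIP-style symmetric contrastive loss on already-computed embeddings. This is an assumption about the kind of objective involved, not a claim about what the video or any specific model does. The function names, the toy 2-D embeddings, and the temperature of 0.07 are all illustrative.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def log_softmax(row):
    """Numerically stable log-softmax over a list of scores."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric cross-entropy: image i should match caption i, and vice versa."""
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    n = len(imgs)
    # sim[i][j]: cosine similarity of image i and caption j, divided by temperature
    sim = [[sum(a * b for a, b in zip(imgs[i], txts[j])) / temperature
            for j in range(n)] for i in range(n)]
    # image -> text direction: row i should put its mass on column i
    i2t = -sum(log_softmax(sim[i])[i] for i in range(n)) / n
    # text -> image direction: column j should put its mass on row j
    cols = [[sim[i][j] for i in range(n)] for j in range(n)]
    t2i = -sum(log_softmax(cols[j])[j] for j in range(n)) / n
    return (i2t + t2i) / 2

# Toy batch: two images and two captions, as hypothetical 2-D embeddings.
images = [[1.0, 0.0], [0.0, 1.0]]
aligned = contrastive_loss(images, [[1.0, 0.1], [0.1, 1.0]])  # captions match
swapped = contrastive_loss(images, [[0.1, 1.0], [1.0, 0.1]])  # captions mismatched
```

The loss is low when each image sits closest to its own caption and high when the pairs are mixed up. In a real setup the embeddings would come from the two encoders, and the gradient of this loss would update both of them.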
jcattle 7 hours ago [-]
Thanks, I'll check out that video.
agentbraker 2 hours ago [-]
Awesome project! Preserving and sharing knowledge like this is incredibly valuable. Thanks for making these resources accessible to everyone.