8 Comments

Great post! I didn't quite follow one thing: what does "a poor loss" mean?

Oh, it just means a bad loss, i.e. it doesn't do a good job of predicting. I'll find a way to clarify, thanks.

Great coverage on the absolutely abysmal specifications on input datasets. Hoping journalists start asking more questions there.

Rob Miles on GPT "glitch tokens", a fun side effect of strange eddies in the input data: https://youtu.be/WO2X3oZEJOA

awesome post! so well explained, love your writing style :)

Thanks for this. It was definitely helpful, but tbh I was hoping it would help explain how LLMs so effectively get from the micro-level "predict the next token" to the macro-level "blow people away with what-seems-to-be-incredible-creativity-and-shocking-levels-of-coherence."

To me it seems akin to how "it starts with generating the right cells" at the micro-level turns into "here's an intelligent human being made from those cells."

I have a few thoughts on that, might write a post if there's interest. (I think that no one *really* knows the answer, though!)

That's kind of remarkable, right? I would like to inspire you to delve into this area of thought with some potential thought catalysts:

- If we've gotten to the point of programmatic execution complexity where we've kicked off a recursive capability-advancing process that we no longer fully understand or control, at what point do we convert the "?" into a more philosophical question of "when does a self-organizing system that lives in the digital realm start behaving like a self-organizing system in the biological realm?"

- If we accept that organic life is "merely" the result of pre-programmed DNA driving the organization of cells as it advances (and, with the help of biomatter, grows) into a hyper-complex unique-yet-prescriptive organism that has a distinctive visual and psychic identity, what stops us from removing the biomass part of the equation and seeing a digital analog (ha) to this story?

- On the flip side, how do we think about our seemingly innate impulses and enthusiasm for "discovering new things" and not over-rotate on what we're creating here by ascribing life-like characteristics to something purely in the digital realm? Talk about classic projection.

There is just an interesting idea here in the vein of the "Underpants -> ? -> Profit!" meme with LLMs specifically that I simply find fascinating and worthy of serious focus.
