Thanks for this. It was definitely helpful, but tbh I was hoping it would help explain how LLMs so effectively get from the micro-level "predict the next token" to the macro-level "blow people away with what-seems-to-be-incredible-creativity-and-shocking-levels-of-coherence."
To me it seems akin to going from "it starts with generating the right cells" at the micro-level to "here's an intelligent human being made from those cells" at the macro-level.
Great post! I didn't quite follow one thing: what does "a poor loss" mean?
Great coverage on the absolutely abysmal specifications on input datasets. Hoping journalists start asking more questions there.
awesome post! so well explained, love your writing style :)