2 Comments

Why the skepticism? Since 2017, it’s been all about more data and more compute.

“Miraculous results” mainly refers to the claims of people who think we’ll hit AGI just by continuing to scale LLMs or other current models.

Besides that, metrics are hard to compare apples-to-apples (and the better a model gets, the harder it is to improve), but most have been leveling off when the only change is more data and compute.

2017 is a fairly specific year, since that’s when “Attention Is All You Need” was published. Before transformers, just throwing more data and compute at the problem wasn’t going to work that well in practice for language models built on recurrent neural networks or the like. There will likely also be other innovations as we go, whether it’s gradient-free methods, ternary-weight models, KANs, etc.
