Steve Newman

Posts

1 · Steve Newman's Shortform · 2y

2 · AI Policy Should Prioritize Visibility Into Trajectories · 2y

Comments

Steve Newman's Shortform

Steve Newman · 2y · 1

This is a linkpost for https://amistrongeryet.substack.com/p/alphaproof-and-openai-o1

The latest advances in AI reasoning come from OpenAI's o1 and Google's AlphaProof. In this post, I explore how these new models work, and what that tells us about the path to AGI.

Interestingly, unlike the progression GPT-2 -> GPT-3 -> GPT-4, neither of these models relies on increased scale to drive capabilities. Instead, both systems rely on training data that shows not just the solution to a problem but the path to that solution. This opens a new frontier for progress in AI capabilities: how can that sort of data be created?

In this post, I review what is known about how AlphaProof and o1 work, discuss the connection between their training data and their capabilities, and identify some problems that must be solved for capabilities to keep progressing along this path.