🐦 X · Feed · Matt Turck @mattturck · June 4, 2026

Matt Turck · @mattturck


This great conversation with @danintheory of @OpenAI is also available on Spotify, Apple Podcasts and here on YouTube:



Why AI Can Now Make Discoveries - my conversation with @danintheory, Lead of the Foundations of Reinforcement Learning team at @OpenAI

00:00 Intro: AI's wild week in mathematics
01:21 What OpenAI's Foundations of RL team does
03:08 Dan's journey: from black holes and quantum gravity to frontier AI
07:04 Are AI systems becoming useful for real science
08:21 The AI math moment: Erdős, OpenAI, DeepMind, and Anthropic
08:52 Why the OpenAI result was an act of exploration
10:25 OpenAI vs. DeepMind: informal reasoning vs. formal proof
12:13 RL 101: learning by doing, not just watching
15:10 Why reinforcement learning works
15:58 How RL breaks: sparse feedback and long-horizon tasks
17:03 RLHF: how human feedback shaped early language models
18:48 Move 37, self-play, and the search for novel strategies
22:16 Explore vs. exploit in scientific discovery
24:49 Why RL may now be "the cake," not the cherry on top
25:46 Why RL started working with large language models
27:29 Is RL "sucking supervision through a straw"?
28:47 Why language may be the grounding layer for intelligence
31:46 A contrarian take on the Bitter Lesson
32:41 What test-time compute actually is
34:50 How RL gives models the ability to think
35:40 Verifiable rewards, math, coding, and the messy real world
38:00 What physics can teach us about AI
42:08 Is there a thermodynamics of AI?
43:08 From Erdős problems to Einstein-level AI
45:16 Is AI already doing original science?
45:51 How far are we from AI automating AI research
47:41 Why Dan is excited about the future of science



Source: https://x.com/mattturck
