This awesome conversation with @stephenbalaban of @LambdaAPI is also available on Spotify, Apple Podcasts and here on YouTube:
State of AI compute 2026: my conversation with @stephenbalaban of @LambdaAPI on the neocloud boom, data centers, GPUs and what's ahead

00:00 - Cold open
01:21 - Why GPU compute was never a commodity
02:45 - The H100 price index and what it gets wrong
04:02 - The real moat: technology or financing?
05:57 - Winner-take-all, or room for many neoclouds?
06:48 - Are we overbuilding or underbuilding AI compute?
09:26 - What if AI gets 10x more compute-efficient?
10:44 - The real bottleneck: land, power, and shell
11:38 - The backlash against data centers, and the misinformation
15:00 - Opening the hood: from photons to tokens
17:11 - Extracting more value from the same chip
19:26 - Frontier inference and distributed training, explained
23:26 - What actually drives compute cost
25:21 - Lambda's chip stack and the NVIDIA relationship
26:17 - A multi-silicon world? CUDA, cuDNN, and NVIDIA's real moat
28:59 - Networking, storage, and the one-click cluster
34:46 - Renting vs. owning, and full vertical integration
36:24 - How global is Lambda? Does location still matter?
38:44 - The financing stack: off-take agreements, SPVs, and credit
41:16 - Why a 2023 GPU leases for more today
42:36 - A futures market for compute?
43:54 - Origin story: facial recognition, Perceptio, and Apple
47:03 - The Lambda hat and Dream Scope
48:59 - The $60K bet that became a cloud business
52:00 - Holding the team together through the hard times
54:30 - Bringing on a new CEO; Stephen as CTO
57:33 - Matching xAI on high-velocity deployment
59:29 - "AI won't write software - it will become the software"
01:01:30 - Neural software vs. vibe coding
01:04:25 - Do agents change the compute layer?
01:06:14 - Self-assembling software inside Lambda
01:08:18 - Gigawatt-scale AI factories
01:08:57 - One person, one GPU
01:12:04 - Hot takes: overrated and underrated in AI