Speaker 1 00:00 - 00:36
It's pretty clear that we have an amazing system that can take in money and output software. The naysayers who say you're gonna throw these GPUs out in five years are completely wrong. They're completely wrong, and they've been wrong the entire time. We continue to be generally underbuilding. Most people in leadership positions at neo clouds or within the market have been recognizing this insatiable demand for large language models to do everything from being an assistant to code generation. We continue to see no end to the scaling laws.
很明显,我们拥有一个了不起的系统:它可以吸收资金,然后产出软件。那些唱反调的人说,五年后你们就会把这些 GPU 扔掉,这完全错了。他们完全错了,而且一直都错了。我们总体上仍然是在建设不足。无论是在 Neo Clouds 的领导岗位上,还是在整个市场中的大多数人,都已经意识到,对 large language models(大语言模型)的需求是无止境的——从做 assistant(助手)到 code generation(代码生成),无所不包;而我们也仍然看不到 scaling laws(规模定律)的尽头。
Speaker 2 00:36 - 00:50
Hi. I'm Matt Turck from FirstMark. Welcome to the MAD Podcast. My guest today is Stephen Balaban, cofounder and CTO of Lambda, one of the top neo clouds powering the AI boom. This episode goes deep on the physical layer that everything else in AI runs on.
大家好,我是来自 Firstmark 的 Matt Turk。欢迎来到 MAD podcast。今天我的嘉宾是 Steven Balabam,Lambda 的联合创始人兼 CTO。Lambda 是驱动这波 AI 热潮的顶级 neo clouds(新型云服务商)之一。这一期会深入探讨 AI 一切上层能力所依赖的物理层。
Speaker 2 00:50 - 01:35
We get into why GPU compute was never actually a commodity, how you finance billions of dollars of data centers and chips, why a 2023 H100 can be more expensive to lease today than when it was bought, and what it actually takes to stand up a gigawatt-scale AI factory. We also cover Lambda's wild origin story, from a facial recognition startup to a baseball cap with a camera in it to a near-billion-dollar cloud business today. Please enjoy this amazing and very educational conversation with Stephen. There was a moment in time in Silicon Valley a few years ago when, if you had asked most people, they would have said that neo clouds were going to be a commodity, in particular because GPU compute was going to get commoditized.
我们会聊到:为什么 GPU compute(GPU 算力)其实从来都不是 commodity(大宗商品式、可完全同质化)的;你如何为价值数十亿美元的数据中心和芯片融资;为什么一块 2023 年的 h 100 今天租出去的价格,可能比当初买入时还贵;以及,真正建起一座 gigawatt scale(吉瓦级)的 AI factory(AI 工厂)到底需要什么。我们还会讲到 Lambda 非常传奇的起源故事:从一家 facial recognition(人脸识别)创业公司,到一顶内置摄像头的棒球帽,再到如今一家估值接近十亿美元的 cloud(云)业务。请享受这场精彩且信息量十足的 Steven 对话。几年前,Silicon Valley 曾有那么一个时刻:如果你去问大多数人,他们会说 neo clouds 会变成 commodity,尤其是因为 GPU compute 会被 commoditized(同质化、商品化)。
Speaker 2 01:35 - 01:49
And if you fast forward to today, it seems to be exactly the opposite: both Lambda and several of your competitors seem to be absolutely ripping. So what is it that the naysayers got wrong then and continue to get wrong today?
但如果把时间快进到今天,情况看起来恰恰相反:Lambda,以及你们的几家竞争对手,似乎都增长得极其迅猛。那么,当年的怀疑者到底误判了什么?而他们今天又还在继续误判什么?
Speaker 1 01:49 - 02:40
The big thing is that cloud compute is not a commodity service. It is a very complicated, highly vertically integrated type of service that spans everything from land entitlement, construction, and HPC (high performance computing) design to software, virtualization, and cloud services on top. And there's a reason why the biggest companies in the world, these multitrillion-dollar market cap businesses, whether it's Amazon, Microsoft, Google, or Oracle, are all in the cloud computing business. It's because it's a great business. And so I think that's probably the fundamental thing that was misunderstood: the idea that, oh, this is somehow a little bit different than a normal cloud service.
最关键的一点是,cloud compute(云计算算力)不是一种 commodity service(同质化服务)。它是一种非常复杂、高度垂直整合的服务类型,涵盖从土地审批、建设施工、HPC(high performance computing,高性能计算)设计,到软件、virtualization(虚拟化),再到其上的 cloud services(云服务)等各个层面。世界上最大的那些公司——这些市值数万亿美元的企业,不管是 Amazon、Microsoft、Google 还是 Oracle——都在做 cloud computing(云计算)业务,这不是没有原因的。原因就在于,这是一门很好的生意。所以我觉得,最根本的误解大概就是:人们以为这 somehow(不知怎的)和普通云服务有点不一样。
Speaker 1 02:40 - 02:45
But really, what it is is a cloud service designed for the age of AI.
但实际上,它本质上就是一种为 AI 时代设计的 cloud service(云服务)。
Speaker 2 02:45 - 02:57
But there is some element of commoditization, right? The rental price of a GPU is going down. But what you're saying is that, to some extent, it doesn't matter because it's only one layer of the cake.
但这里面确实有某种 commutization(应为 commoditization,商品化/同质化)的因素,对吧?GPU 的租赁价格确实在下降。但你的意思是,在某种程度上这并不重要,因为它只是整块蛋糕中的一层。
Speaker 1 02:57 - 03:43
Yeah. I think it's actually worth digging into some of the methodology on, for example, the index that's on Bloomberg for H100 rental prices. What we're actually seeing in the market is that, first of all, there are two different rates. There's a public cloud on-demand rate, and then there's a long-term rental rate. And I think that some of these indices don't properly take that into account, because what we're actually seeing is a very consistent, if not increasing, long-term rental rate and very consistent and increasing on-demand rental rates.
对。所以比如说,当你去看——我觉得其实值得做的一件事,是去稍微深挖一下某些方法论;例如,像 Bloomberg 上那个 h 100 租赁价格指数。我们在市场上实际看到的是,首先,价格其实分成两种:一种是 public cloud(公有云)的 on demand rate(按需价格),另一种是 long term rental rate(长期租赁价格)。而我认为,其中一些指数并没有正确把这一点考虑进去,因为我们实际看到的是:长期租赁价格非常稳定,甚至还在上涨;而按需租赁价格也同样非常稳定,并且在上涨。
Speaker 1 03:43 - 04:02
And so what happens is, if the methodology in the index biases towards long-term contracts being a bigger part of the volume, that will look like a decline in the index when the reality is it's just a shift in the mix that the index is covering.
所以实际发生的是,比如说,如果 index 的构成方式——也就是它的方法论——偏向让长期合同在成交量中占更大比重,那么这看起来就会像是 index 在下降;但现实中,这其实只是 index 所覆盖的那部分结构占比下降了。
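The mix effect described in this answer can be illustrated with a short calculation. This is only a sketch: the rates and volume shares below are made-up numbers, and `blended_index` is a hypothetical volume-weighted average, not the methodology of any real index. It assumes long-term rates sit below on-demand rates, consistent with the wholesale versus retail distinction made later in the conversation.

```python
def blended_index(on_demand_rate, long_term_rate, long_term_share):
    """Volume-weighted average of two hourly rental rates (hypothetical index)."""
    return long_term_share * long_term_rate + (1 - long_term_share) * on_demand_rate

# Period 1 (illustrative numbers): on-demand rentals dominate the sampled volume.
before = blended_index(on_demand_rate=3.00, long_term_rate=2.00, long_term_share=0.2)

# Period 2: both rates have gone UP, but long-term contracts now dominate the volume.
after = blended_index(on_demand_rate=3.20, long_term_rate=2.10, long_term_share=0.8)

print(f"blended index before: {before:.2f}")
print(f"blended index after:  {after:.2f}")
```

With these assumed inputs the blended figure falls even though each underlying rate rose, which is the kind of mix-driven decline the speaker is describing.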
Speaker 2 04:02 - 04:24
Fascinating. So I'm curious about your thoughts, as a leading player in the neo cloud ecosystem, about how you see the market evolving. How much of the competitive advantage that you guys and other players are building is based on technology versus a financing race?
很有意思。所以我很好奇,作为 Neo Cloud 生态中的关键领先参与者,你怎么看这个市场的演化。你们以及其他参与者正在构建的竞争优势,到底有多大程度是基于技术,又有多大程度是一场融资竞赛?
Speaker 1 04:24 - 05:21
There are a few different layers to it. There's a lot of differentiation and work being put into, for example, the cloud software orchestration layer, which allows us to take a very large-scale GPU cluster and partition it up for our customers. So we've got, for example, our one click cluster product that allows us to do that. And that's something that's quite unique in the neo cloud space. Most of the other neo clouds either don't have the ability to launch a cluster from their website or max out at, let's say, 32 GPUs, whereas Lambda has designed a piece of software that allows us to give you anywhere from 16 up to 4,000 GPUs in a web interface. And then there's innovation on the data center construction and design side of things, which is also really important, because that's the physical layer underneath the high performance computing equipment.
这件事有几个不同层面。首先,在很多差异化方向上都投入了大量工作。比如 cloud 软件的 orchestration layer(编排层),它让我们能够把一个超大规模的 GPU cluster(集群)切分给客户使用。比如,我们有 one click cluster 这个产品,就能做到这一点。这在 Neo Cloud 领域里算是相当独特的能力。大多数其他 Neo Clouds 要么无法直接从网站上启动一个 cluster,要么上限大概只有 32 个 GPU;而 Lambda 设计了一套软件,使我们可以通过 web interface(网页界面)向你提供从 16 个到 4,000 个 GPU 的配置。除此之外,在 data center(数据中心)建设和设计方面也有创新,这同样非常重要,因为那相当于高性能计算设备之下的物理层。
Speaker 1 05:21 - 05:56
And we're working on a lot of different ways to dramatically reduce the time it takes to construct and stand up new megawatts. And then there's, as you mentioned, innovation on the finance side of things, where we're coming up with new and unique ways to finance, underwrite, and package these large-scale capital projects. And so I think innovation is happening on every layer of the stack, and it's a very complex coordination-style business.
而且,我们也在研究很多不同的方法,来大幅缩短新 megawatt(兆瓦)容量从建设到上线所需的时间。然后,正如你提到的,在金融层面也有创新;我们正在想出一些新的、独特的方式,来为这些大规模资本项目进行融资、承保和打包。所以我认为,创新正在整个 stack(技术栈)的每一层发生,而这是一种协调极其复杂的业务。
Speaker 2 05:57 - 06:06
Yeah. And do you think that, ultimately, the neo cloud ecosystem becomes winner-take-all, or is there room for multiple very large players?
对。那你觉得最终 Neo Cloud 生态会变成 winner take all(赢家通吃)的格局吗,还是说会有多个体量非常大的参与者共存的空间?
Speaker 1 06:06 - 06:48
No. I think there's absolutely room for multiple very large players, just like the traditional cloud business has shown that there's room for multiple large winners and multiple large players. And I think the fundamental reason for that goes back to what drives market structure. I'd say, generally speaking, when you have an industry that has technology moats and capital formation moats and economic moats, it tends to be oligopolistic in its market structure. When you have markets that have more network effect moats, those tend to be a little bit more single-winner-take-all.
不会。我认为绝对有空间容纳多个非常大的参与者,就像传统 cloud 业务已经表明的那样,这个行业可以同时存在多个大赢家和多个大型玩家。我觉得根本原因还是回到驱动 market structure(市场结构)的因素。一般来说,当一个行业同时具备 technology moats(技术护城河)、capital formation moats(资本形成护城河)和 economic moats(经济护城河)时,它的市场结构往往会呈现 oligopolistic(寡头垄断)特征。而那些更依赖 network effect moats(网络效应护城河)的市场,则往往更容易走向某种单一赢家通吃的局面。
Speaker 2 06:48 - 07:00
What are the various scenarios in your head as you think about the future and how it will all play out? Are we overbuilding? Are we underbuilding? Nobody knows. How do you think about it?
当你思考未来、思考这一切将如何展开时,你脑中有哪些不同情景?我们是在过度建设吗?还是建设不足?没人真正知道。你是怎么思考这个问题的?
Speaker 1 07:00 - 08:06
Well, I think that we continue to be generally underbuilding. Most people in leadership positions at neo clouds or within the market have been recognizing this insatiable demand for large language models to do everything from being an assistant to code generation. You can look back to some of the talks that I've given in the past, where I kind of called it: hey, in a couple of months to years, we're gonna be at a point in time where you can put money in and get software out the other end. At the point in time when I was predicting that, it was maybe not as widely held a belief. But now, with, let's say, the release of Opus 4.5, I think it's pretty clear that we have an amazing system that can take in money and output software.
我认为总体来说,我们仍然是在建设不足。大多数 NeoClouds 的领导者,或者说市场中的核心参与者,都已经意识到,对 large language models(大语言模型)的需求是某种近乎无止境的状态——从做 assistant(助手)到 code generation(代码生成),几乎无所不包。你甚至可以回看我过去的一些演讲,我当时大意是说:再过几个月到几年,我们会来到这样一个时间点——你把钱投进去,另一端就能产出软件。而在我当时做出这个预测的时候,这还不是一个被广泛接受的观点。但现在,随着比如 Opus four 或 five 的发布,我觉得已经很清楚了:我们拥有了一个惊人的系统,可以吸收资金,然后输出软件。
Speaker 1 08:06 - 08:59
And I think the part that makes me feel so confident that there's gonna continue to be demand is that we continue to see no end to the scaling laws, which are the underlying idea that you put more compute in and you get better intelligence levels out of your models. As you increase the capacity of the model and train it with more compute and more data, you get more intelligence out. As long as that continues to hold, I think there's still more in store for us. It's hard to predict exactly when scaling laws might start to reach a diminishing-marginal-returns part of the curve. But right now, it's very clear that we're gonna continue to see more and more capable models, and that is expanding the cone of the addressable market. Right?
我觉得,之所以我对需求会继续存在这件事这么有信心,一个关键原因是,我们仍然看不到 scaling laws(规模定律)的尽头。它的底层逻辑有点像:你投入更多 compute(算力),模型就会产出更高水平的 intelligence(智能)。你知道的,随着模型容量变大、训练时使用更多 compute、喂入更多数据,你就会得到更多智能输出。只要这个规律还成立,我就认为未来仍然还有很多空间。很难准确预测 scaling laws 什么时候会开始进入那种边际回报递减的曲线区间,但就目前来看,非常清楚的一点是,我们会继续看到能力越来越强的模型,这实际上是在不断扩大可服务市场的锥形范围。对吧?
Speaker 1 08:59 - 09:24
Like, originally, the cone of the addressable market was: alright, this is gonna be helpful for customer support. It's a sort of substitute good for Google search and for other search online. And now it's like, well, this is a substitute for a lot of software engineering roles, or a huge augment to software engineering roles. And so as that cone expands, the total market and the demand for compute expand.
比如最开始的时候,可服务市场的那个锥形范围还只是:好,这东西会对 customer support(客户支持)有帮助;它算是 Google search 以及其他在线搜索的一种替代品。而现在情况变成了:它可以替代很多 software engineering(软件工程)岗位,或者说对软件工程岗位形成巨大的增强。因此,随着这个锥形范围扩大,整体市场以及对 compute(算力)的需求也会扩大。
Speaker 1 09:24 - 09:26
And I think that we're continuing to underestimate it.
而且我觉得,我们还在持续低估这件事。
Speaker 2 09:26 - 09:33
Do you worry about model training and model inference becoming, I don't know, 10x more compute efficient, and what that would mean in terms of the buildup?
你会担心 model training(模型训练)和 model inference(模型推理)变得——我不知道——高效 10 倍吗?以及那对当前这种建设投入会意味着什么?
Speaker 1 09:33 - 10:08
I think that, generally speaking, if, let's say, you do become 10 times more efficient, that just means that everybody is able to process 10 times more tokens, and there's still the same fixed amount of compute in the world at any given point in time. In the early days, it's funny, we used to talk a lot about this, back in, let's say, 2017: oh, well, maybe there's gonna be some new type of model that will look more like a random forest model, which some members of the audience might know you can train on a MacBook.
我觉得一般来说,你看到的情况是:如果——假设——效率真的提高了 10 倍,我认为那只是意味着每个人都能处理 10 倍更多的 tokens(token),而在任何一个给定时点,世界上的 compute(算力)总量依然是固定的。所以早些年,这很有意思,我们以前经常讨论这个,差不多是在 2017 年吧。大家会想,哦,也许会出现某种新型模型,比如说,它会更像 random forest model(随机森林模型);一些听众可能知道,你基本上可以在一台 MacBook 上训练 random forest model。
Speaker 1 10:08 - 10:33
Right? And there was this concern that was persistently raised: well, okay, what happens if you have this sort of adjacent disruption on the model side of things? And so far, we haven't seen that. And, again, everything that we're building towards is based on these scaling laws, which are really about scaling up this architecture.
对吧?当时一直有人反复提出一种担忧,大意是:好吧,如果模型这一侧出现某种相邻式的颠覆,会怎么样?但到目前为止,我们还没有看到这种情况。再说一次,我们正在构建的一切,基本上都是建立在这些 scaling laws(规模定律)之上的,而这实际上就是在把这种架构不断做大。
Speaker 1 10:33 - 10:44
So I don't really foresee a very likely outcome where we have this huge model disruption that would cause a decline in the demand for compute.
所以我并不认为一种很可能发生的结果会是:出现某种巨大的模型层面颠覆,进而导致对 compute(算力)的需求下降。
Speaker 2 10:44 - 10:53
Where's the main bottleneck these days that you're experiencing building Lambda Labs? Is that GPUs, power, electricity?
你们现在在建设 Lambda Labs 的过程中,主要瓶颈是什么?是 GPU 算力,还是电力?
Speaker 1 10:54 - 11:38
So I always say that bottlenecks are local before they're global, in that one development might be bottlenecked on, let's say, generators or on UPS systems. That's a function of the idiosyncrasies of the site. But broadly in the industry, the main bottleneck is basically land, power, and shell: land that is entitled and has a certain megawatt commitment from a utility, and then, of course, the data center and the mechanical, electrical, and plumbing equipment, the MEP equipment, that goes into that data center. And so that's the main bottleneck that we're seeing in the industry right now, I'd say, across the board.
所以我总是说,瓶颈往往先是局部的,然后才是全局的。比如某个开发项目可能会卡在 generators(发电机)或 UPS systems(不间断电源系统)上,这取决于 site(场址)本身的一些特殊性。但从整个行业来看,主要瓶颈基本上是 land powered shell,也就是已经取得许可、并且 utility(公用事业公司)承诺提供一定 megawatt(兆瓦)电力容量的土地。当然,还包括 data center(数据中心)本身,以及装进数据中心里的 mechanical, electrical, and plumbing equipment,也就是 MEP equipment(机电管线设备)。所以我会说,这是我们现在在整个行业里普遍看到的主要瓶颈。
Speaker 2 11:38 - 11:47
How real is the movement against data centers from the global community, and how do you think about how to respond to it?
全球社区里反对 data centers(数据中心)的这股动向到底有多真实?你又怎么看待该如何回应它?
Speaker 1 11:47 - 12:23
Well, it's certainly very popular in the news right now. I'd say that it's definitely very real. I think that, rightfully, communities that host any type of large capital project, whether it's a power plant or a solar farm or a data center or a distribution center, want to have a seat at the table. I'd say in general, though, I spend a lot of time reading through a lot of the comments from communities. And people want jobs.
嗯,这当然是现在新闻里非常热门的话题。我会说,这绝对是非常真实的。我的意思是,我认为任何承载大型资本项目的社区——无论是 power plant(发电厂)、solar farm(光伏电站)、data center(数据中心)还是 distribution center(配送中心)——都理应在讨论中拥有一席之地。不过总的来说,我花了很多时间去看社区里的大量评论,而人们是想要 jobs(工作岗位)的。
Speaker 1 12:23 - 12:51
They want tax revenue. Any major capital development is gonna bring a lot of tax revenue, and it's gonna bring a lot of jobs. And it's gonna bring investment into their community. And what they're really voicing, I think, is, one, wanting a seat at the table while this stuff is being developed. I think that's an important thing: to have their voices heard, and to have the developers coming in actually understand the community.
他们想要 tax revenue(税收收入)。任何重大的资本开发都会带来大量税收,也会带来很多工作岗位,还会给他们的社区带来投资。而我认为他们真正表达的,第一,是在这些项目开发过程中拥有一席之地。我觉得这很重要:让他们的声音被听见,也让进入当地的 developers(开发商)真正理解这个社区。
Speaker 1 12:51 - 13:34
The other thing to keep in mind, I think, is that there's a lot of misinformation out there. So for example, modern deployments of, let's say, a Blackwell-class or a Rubin-class GPU, the GB and VR GPUs, are oftentimes in a closed direct-to-chip liquid cooling system that's connected to a dry cooler, which means that there's almost zero evaporation. It's not using evaporative cooling. It's using a dry cooler system that does not consume a lot of water. And on top of that, most of these data center developments are bringing a ton of power to the grid.
另一个我认为需要记住的是,外面存在很多 misinformation(错误信息)。举例来说,现在每一种现代化部署——比如说 Blackwell class 或 Rubin class GPU,像 NVL、GBN、VR GPUs——很多时候都采用 closed direct-to-chip liquid cooling system(封闭式直达芯片液冷系统),并连接到 dry cooler(干式冷却器),这意味着几乎没有 evaporation(蒸发)。它不是在用 evaporative cooling(蒸发冷却),而是在用 dry cooler system(干冷系统),这种系统并不会消耗很多水。此外,这些数据中心开发项目中的大多数,还在给 grid(电网)带来大量电力相关资源。
Speaker 1 13:34 - 14:23
They're either standing up behind-the-meter power or they're standing up and bringing battery electric storage systems to the grid, and they're bringing all these ancillary benefits that strengthen and fortify the grid and also, eventually, in the long term, will maintain the costs that are being experienced by the community. And so I actually think that there's a very clear path towards spreading more of the facts around what a data center brings, because there's just a lot of misinformation. You'll see people talking about how data centers consume a lot of water. Well, an evaporative cooling tower might evaporate a lot of water, but practically no new builds in the United States are using evaporative cooling for these closed-loop direct-to-chip liquid cooling systems.
它们要么在建设 behind-the-meter power(表后电源),要么在部署并向 grid(电网)接入 battery electric storage systems(电池储能系统),还会带来各种这类 ancillary benefits(附带收益),这些都能增强和加固电网,而且从长期来看,也会帮助维持社区正在承受的成本水平。所以我实际上认为,围绕“数据中心到底带来了什么”去更广泛传播事实,是有一条非常清晰路径的,因为现在确实有很多 misinformation(错误信息)。你会看到有人说数据中心耗水很多。没错,evaporative cooling tower(蒸发冷却塔)可能会蒸发大量水,但实际上,美国几乎没有新的建设项目在这些 closed-loop direct-to-chip liquid cooling systems(闭环直达芯片液冷系统)中使用 evaporative cooling(蒸发冷却)。
Speaker 2 14:23 - 14:39
Do you think we do a terrible job, as an industry, explaining this to the broader world? Because those things keep coming back, and they seem to be accelerating. But then when you have the discussion from a technical standpoint, a lot of it is simply based on misinformation, as you just said.
你觉得我们这个行业在向更广泛的世界解释这些事情时做得很糟吗?因为这些说法总是一再出现,而且似乎还在加速传播。但当你从技术角度去讨论时,正如你刚才说的,其中很多其实只是基于 misinformation(错误信息)?
Speaker 1 14:39 - 15:00
I think that everybody's trying to get better at that kind of communication. And it just takes some clear thinking, writing down what are the benefits, writing down what are the costs, and presenting that clearly and plainly to a community so they can make a good decision about what kind of jobs and what kind of development they want in their communities.
我觉得大家都在努力把这类沟通做得更好。这需要清晰地思考,把 benefits(收益)写清楚,把 costs(成本)写清楚,然后以清楚、直白的方式呈现给社区,这样他们才能就自己的社区想要什么样的 jobs(工作岗位)和什么样的 development(开发)做出好的决定。
Speaker 2 15:00 - 15:13
Let's open the hood for a minute. People talk about things like FLOPS and GPU hours and tokens and MFU. What is the best way to think about a compute unit?
我们先稍微打开引擎盖来看一看。人们会谈论 flops、GPU hours、tokens 和 MFU 之类的东西。那么,理解 compute unit(计算单位)的最佳方式到底是什么?
Speaker 1 15:13 - 15:40
Yeah. It's interesting. You said a few different terms, and I always like to break it down from a physics perspective into the SI terms. So, okay, on the left-hand side is all of the energy production, and on the right-hand side is tokens being consumed by somebody. And maybe you can even have the application layer on the far right of that, which is using the tokens.
对,这很有意思。你刚才提到了几个不同的术语,而我一直喜欢把它们从 physics(物理学)的角度,拆解成 SI(国际单位制)术语来看。所以,左边是所有的 energy production(能量生产),右边则是某个人正在消耗的 tokens。甚至在更右边,你还可以放上 application layer(应用层),它会使用这些 token。
Speaker 1 15:40 - 16:22
So on the left-hand side, you've got either photons coming in per second or molecules of natural gas coming in per second. And then that, through a power plant or a solar farm, gets converted into joules per second, which is a measure of electrical power production. And obviously, in engines, there's a level of efficiency, and that's engine efficiency. It's interesting because the MFU percentage is kind of like an efficiency up on the higher end of that chain. The power plant or the solar plant then converts that into joules per second, which is watts, which is consumed by the entire data center.
所以在左边,你有每秒进入的 photons(光子),或者每秒进入的 natural gas(天然气)分子。然后这些通过 power plant(发电厂)或 solar farm(太阳能电站)被转换成 joules per second(焦耳每秒),这就是电力生产的度量。接着,joules per second,显然,在各种 engine(引擎)里都会有一个效率水平,那就是 engine efficiency(引擎效率)。这点很有意思,因为 MFU 百分比有点像这条链条更高层环节上的一种低效。然后 power plant 或 solar plant 会把它转换成 joules per second,也就是 watts(瓦特),再由整个 data center(数据中心)来消耗。
Speaker 1 16:22 - 16:45
The data center itself needs to cool itself, and that's the PUE. That's the efficiency metric that you can use to measure a data center on. And then you put the servers and all the different networking and storage gear in, and that's producing floating point operations per second, or FLOPS. Okay? That is what gets consumed.
数据中心本身还需要给自己散热,这就是 PUE。实际上,这才是你用来衡量一个数据中心的 efficiency metric(效率指标)。然后你再把 servers(服务器)、各种 networking(网络)设备和 storage gear(存储设备)放进去,它们就会产出每秒 floating point operations(浮点运算),也就是每秒 FLOPS。好吧?这才是被消耗的东西。
Speaker 1 16:45 - 17:11
The FLOPS capacity is what gets consumed by, let's say, a model builder when they're training a model or when they're inferencing a model. And that gets turned from FLOPS into tokens per second. Then, on top of that tokens per second, you might have some level of efficiency in how well the end customer is actually turning those tokens into real, actual intelligence. That's the entire pipeline, I would say, from end to end.
这个每秒 flops 的容量,会被比如说 model builder(模型构建者)在训练模型或做模型 inferencing(推理)时消耗掉。然后它会从 flops per second 转换成 tokens per second。再往上,在这个 tokens per second 之上,最终客户还可能存在某种效率水平,也就是他们是否真的把这些 token 转化成了真实、实际的 intelligence(智能)。我会说,这就是从头到尾的整条 pipeline(流程)。
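The chain from watts to tokens per second described across the last few segments can be sketched as a sequence of multiplications and divisions. This is a rough illustration only: every number is an assumption chosen for easy arithmetic, `flops_per_watt` and `flops_per_token` are hypothetical parameters, PUE is taken in its standard sense of total facility power divided by IT equipment power, and MFU is taken as the fraction of peak FLOPS a workload actually achieves.

```python
def tokens_per_second(facility_watts, pue, flops_per_watt, mfu, flops_per_token):
    """Walk the pipeline: facility power -> IT power -> peak FLOPS -> useful FLOPS -> tokens/s."""
    it_watts = facility_watts / pue            # power left for servers after cooling and other overhead
    peak_flops = it_watts * flops_per_watt     # peak floating point operations per second
    useful_flops = peak_flops * mfu            # portion the model actually uses
    return useful_flops / flops_per_token      # tokens produced per second

# All inputs below are illustrative assumptions, not measurements.
rate = tokens_per_second(
    facility_watts=1_000_000,   # a 1 MW facility
    pue=1.25,
    flops_per_watt=1e12,        # hypothetical hardware efficiency
    mfu=0.4,
    flops_per_token=2e11,       # hypothetical cost of one token
)
print(f"{rate:,.0f} tokens per second")
```

Each stage is a ratio, so an improvement anywhere along the chain (lower PUE, higher MFU, fewer FLOPs per token) raises the tokens per second obtained from the same power.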
Speaker 2 17:11 - 17:23
That's super helpful. If two companies fundamentally have the same chip, how do they extract more value from it? What needs to happen to maximize the usefulness of that chip?
这非常有帮助。如果两家公司从根本上拥有相同的 chip(芯片),它们如何从中榨取更多价值?要让这颗芯片的 usefulness(实用性)最大化,需要发生什么?
Speaker 1 17:23 - 18:10
If you look at the cost structure of, let's say, one GPU hour of time, and we're talking about H100s, the largest part of that cost structure is the depreciation that is associated with that GPU hour. And, basically, you can think of a utilization metric as being a multiplicative factor on that: one over the utilization. So if you use your capital asset 50% of the time, you will have, on a per-hour basis, twice (one over 0.5) the amount of per-hour depreciation expense associated with that. And so I think that's the number one way that companies are gaining unique advantages.
如果你去看,比如说一小时 GPU 的成本结构——我们这里说的是 H100——这个成本结构里最大的一部分是与这一个 GPU 小时相关的 depreciation(折旧)。基本上,你可以把 utilization metric(利用率指标)看成是作用在这上面的一个乘数因子,也就是 one over the utilization(利用率的倒数)。所以如果你的 capital asset(资本资产)只有 50% 的时间在被使用,那么按每小时来算,与之相关的每小时折旧费用就会变成两倍,也就是 one over 0.5。因此我认为,公司获得独特优势的首要方式就是这一点。
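The one-over-utilization factor can be written out directly. The sketch below assumes straight-line depreciation over a fixed lifetime, and the purchase price and lifetime are hypothetical placeholders rather than real H100 figures.

```python
def depreciation_per_billed_hour(capex_usd, lifetime_years, utilization):
    """Depreciation carried by each hour that is actually rented out.

    Assumes straight-line depreciation; utilization is the fraction of hours in use.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    total_hours = lifetime_years * 365 * 24
    per_calendar_hour = capex_usd / total_hours
    return per_calendar_hour / utilization     # the 1 / utilization multiplier

# Hypothetical asset: 30,000 USD depreciated over 5 years.
full = depreciation_per_billed_hour(30_000, 5, utilization=1.0)
half = depreciation_per_billed_hour(30_000, 5, utilization=0.5)
print(f"100% utilized: {full:.3f} USD per hour")
print(f" 50% utilized: {half:.3f} USD per hour")
```

At 50% utilization each rented hour carries twice the depreciation of a fully utilized asset, matching the one over 0.5 figure in the answer.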
Speaker 1 18:10 - 19:00
Well, how can I build a cloud product that is beloved by people and that is gonna drive high utilization? In addition to that, in the market for on-demand compute, as we mentioned earlier, the retail pricing is obviously much higher than the wholesale pricing. Retail is on demand: spin up a GPU, spin down a GPU, normal cloud service. Wholesale is buying 10,000 GPUs for five years, for example. And so one of the things that we do at Lambda is really try to figure out, hey, how can we get the most dollar utilization and percentage utilization out of the capital deployments that we do?
也就是说,我怎样才能打造一个大家喜爱的 cloud product(云产品),从而带来高 utilization(高利用率)?除此之外,正如我们前面提到的,on demand compute(按需计算)这个市场里,retail pricing(零售定价)显然远高于 wholesale pricing(批发定价)。零售就是按需的:启动一个 GPU,再关闭一个 GPU,标准的 cloud service(云服务)。批发则是比如一次性买 10,000 个 GPU,用五年。所以我们在 Lambda 做的一件事,就是努力弄清楚:我们怎样才能从这些资本投入中,获得尽可能高的 dollar utilization(美元利用率)和 percentage utilization(百分比利用率)?
Speaker 1 19:00 - 19:20
And that's by making great cloud software that makes it easy for somebody to spin it up and down. So for example, if you don't have that cloud software, you can't extract retail pricing. Right? You cannot rent it out to somebody for an hour because you simply don't have the means to be able to do that.
而这要通过打造优秀的 cloud software(云软件)来实现,让人们可以很容易地把资源启用和停用。比如说,如果你没有这套 cloud software,你就没法出租。你就没法提取出零售定价,对吧?你也不可能把它按小时租给别人,因为你根本没有实现这件事的手段。
Speaker 1 19:20 - 19:25
And, actually, a lot of neo clouds are in that position, where they don't even have the infrastructure to be able to run a real cloud service.
实际上,很多 Neo Clouds 都处在这种位置:它们甚至连运行真正 cloud service(云服务)所需的基础设施都没有。
Speaker 2 19:25 - 19:37
So you have GPUs, but a big part of how those data centers work is transforming GPUs into networks of GPUs. Do you wanna explain at a high level how that works?
所以你有 GPUs,但这些 data center(数据中心)之所以能运作,一个很大的部分在于把 GPUs 变成 GPU 网络。你要不要从高层次解释一下这是怎么实现的?
Speaker 1 19:37 - 20:12
The general idea is that you've got a large-scale high performance computing cluster of a bunch of, let's say, NVIDIA GB300 NVL72 racks. That's 72 GPUs all networked together via NVLink. And then there's a connection between the racks that's either InfiniBand or high-speed Ethernet. And that is essentially what's called a spine-leaf topology, which is basically a way to say, hey, this is completely non-blocking.
大致思路是,你有一个大规模 high performance computing cluster(高性能计算集群),由很多——比如说——NVIDIA GB 300 NVL 72 racks 组成。那就是 72 个通过 NVLink 连接在一起的 GPUs。然后 rack 与 rack 之间还有连接,通常是 InfiniBand,或者高速 Ethernet。其本质上就是所谓的 spine leaf topology,也基本可以理解为:这是一个完全 non-blocking(无阻塞)的网络。
Speaker 1 20:12 - 20:49
Every port on every GPU can talk with every other GPU in the network. It's fully connected, and it's able to provide maximum bandwidth between every individual GPU. And that cluster is useful for training large models. It's also useful for inferencing. So frontier inference, as we sometimes refer to it at Lambda, is very much a distributed inferencing problem, where they will actually fragment or shard the model.
网络中每个 GPU 上的每个端口,都可以和其他任何 GPU 通信。它是 fully connected(全连接)的,并且能够在任意两块 GPU 之间提供最大带宽。这样的 cluster 对训练大型模型很有用,对 inferencing(推理)也很有用。所以我们在 Lambda 有时说的 frontier inference,基本上就是一种高度分布式的 inferencing 问题,在这种情况下,他们实际上会把模型切分或 shard(分片)。
Speaker 1 20:49 - 21:02
There'll be some sort of sharding strategy for the model, where it can essentially be run on multiple GPUs, and it uses that high-speed InfiniBand or Ethernet interconnect to do that communication.
他们会为模型设计某种 sharding strategy(分片策略),使它本质上能够运行在多块 GPU 上,并利用高速的 InfiniBand 或 Ethernet 互连来完成这些通信。
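The non-blocking property mentioned in this answer can be checked with a simple bandwidth comparison. The sketch below assumes a two-tier spine-leaf fabric with uniform port speeds, where a leaf switch is non-blocking if its uplink bandwidth to the spines is at least its downlink bandwidth to the GPUs. The port counts and speeds are illustrative assumptions, not a description of any particular deployment.

```python
def oversubscription_ratio(downlink_ports, uplink_ports, port_gbps):
    """GPU-facing bandwidth divided by spine-facing bandwidth on one leaf switch."""
    downlink_bw = downlink_ports * port_gbps
    uplink_bw = uplink_ports * port_gbps
    return downlink_bw / uplink_bw

def is_non_blocking(downlink_ports, uplink_ports, port_gbps):
    """A ratio of 1.0 or less means every downlink can run at full rate through the spines."""
    return oversubscription_ratio(downlink_ports, uplink_ports, port_gbps) <= 1.0

# Illustrative leaf configurations (hypothetical port counts, 400 Gb/s ports).
balanced = is_non_blocking(downlink_ports=32, uplink_ports=32, port_gbps=400)
oversubscribed = is_non_blocking(downlink_ports=48, uplink_ports=16, port_gbps=400)
print("32 down / 32 up non-blocking:", balanced)
print("48 down / 16 up non-blocking:", oversubscribed)
```

A real fabric design also depends on routing and congestion behavior, which this ratio does not capture.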
Speaker 2 21:03 - 21:10
And so what is frontier inference? Is that inference for the most advanced reasoning models, like the more demanding jobs?
那么,什么是 frontier inference?它指的是对最先进的 reasoning models(推理模型)进行 inference,也就是那种要求更高的任务吗?
Speaker 1 21:11 - 21:33
Well, it's not necessarily associated with reasoning models so much as just a very large frontier model that is the domain of, let's say, three or four companies in the world. When they're doing their inference, it's a very complicated thing that is fully utilizing all of the interconnection that's available.
嗯,它不一定主要是和 reasoning models 相关,更准确地说,是那种非常大的 frontier model(前沿模型),大概属于全球三四家公司才涉及的领域。当它们做 inference 时,这是一个非常复杂的过程,会把所有可用的互连能力都充分利用起来。
Speaker 2 21:33 - 21:48
And what you described for frontier inference, is that conceptually the same thing as what happens for training, this concept of just distributing a task massively across a bunch of GPUs? What happens during a training run from a compute standpoint?
你刚才描述的 France 的 Frontier,从概念上说,是否和训练时发生的事情是一样的——也就是把一个任务大规模分布到一大批 GPU 上?从 compute(算力)的角度看,一次训练 run 过程中到底会发生什么?
Speaker 1 21:48 - 22:36
Generally speaking, when you're doing a training run, you can think of a split between the backward pass and the forward pass on the model. The backward pass might be, let's say, two thirds or more of the compute. And the forward pass, which is basically the same thing as inferencing, is the remainder. One of the realizations that I think has been made over the last bit of time is that the type of infrastructure you'd want for doing a large-scale training run can be reused to do the inferencing of that model. And what I mean by frontier inference, and the fact that the inferencing is being done in a distributed way, is that you'll have a mixture-of-experts model.
一般来说,在做一次训练 run 时,你可以把它理解为:模型的 backward pass(反向传播)和 forward pass(前向传播)之间会有某种拆分。backward pass 可能会占到,比如说,三分之二甚至更多的 compute;而 forward pass——它本质上和 inferencing(推理)是同一类事情——占剩下的部分。我认为过去这段时间里人们逐渐意识到的一点是:你为大规模训练 run 所搭建的那类基础设施,也可以被复用于这个模型的 inferencing。我说的 frontier inference,以及这种以分布式方式进行 inferencing 的事实,指的是你会有一个 mixture of experts model(专家混合模型)。
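The two-thirds figure for the backward pass lines up with a common rule of thumb for dense transformer models: roughly 2 FLOPs per parameter per token for the forward pass and roughly 4 for the backward pass. That rule of thumb is an outside approximation rather than something stated in the conversation, and the model size and token count below are arbitrary illustrative values.

```python
def training_flops(params, tokens):
    """Approximate forward and backward FLOPs for a dense model (rule-of-thumb estimate)."""
    forward = 2 * params * tokens    # about 2 FLOPs per parameter per token
    backward = 4 * params * tokens   # about twice the forward cost
    return forward, backward

# Hypothetical run: 70 billion parameters trained on 1 trillion tokens.
fwd, bwd = training_flops(params=70e9, tokens=1e12)
backward_share = bwd / (fwd + bwd)
print(f"forward:  {fwd:.2e} FLOPs")
print(f"backward: {bwd:.2e} FLOPs")
print(f"backward share of compute: {backward_share:.1%}")
```

Under this approximation the forward pass, which is the part shared with inference, accounts for the remaining third.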
Speaker 122:36 - 23:08
And there'll be different, basically, sharding strategies for how you put those experts onto different servers and onto different GPUs. And, you know, the models can be very large. They may not fit on one single rack or, you know, they may not fit on one single server. They might need to be distributed across different servers to even just do the forward inference pass. And so that's where distributed frontier inference comes into the picture. Speaker 122:36 - 23:08
然后基本上会有不同的 sharding(分片)策略,来决定如何把这些 experts 放到不同的 server 上、不同的 GPU 上。而且,你知道,模型可能会非常大。它们可能装不进一个单独的 rack(机架),也可能装不进一台单独的 server。它们甚至可能需要分布到不同的 server 上,才能仅仅完成 forward inference pass(前向推理)。所以,分布式的 frontier inference 就是在这里进入画面的。
Speaker 123:08 - 23:19
Because if you're doing a small model, let's say, Llama, that the users might be familiar with, or some of the quantized small models, those can fit on a single GPU. Speaker 123:08 - 23:19
因为如果你做的是一个小模型,比如用户可能熟悉的 Llama,或者一些经过 quantized(量化)的较小模型,它们可以装进单张 GPU。
Speaker 223:19 - 23:19
Mhmm. Speaker 123:19 - 23:26
Mhmm. Okay. Well, let's just say that, like, Opus and ChatGPT 5.5 can't fit on a single GPU. Speaker 123:19 - 23:26
嗯,好吧。那我们就这么说吧,像 Opus 和 ChatGPT 5.5 这样的模型,是装不进单张 GPU 的。
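A rough way to see why a small quantized model fits on one GPU while a frontier model has to be sharded is to count the bytes needed for the weights alone. The sketch below is illustrative only: the parameter counts, the 80 GB GPU memory figure, and the round-robin expert placement are assumptions for the example, not figures or methods from the conversation, and it ignores KV cache and activations.

```python
import math


def weight_bytes(num_params, bits_per_param):
    """Bytes needed to hold the weights alone (no KV cache, no activations)."""
    return num_params * bits_per_param / 8


def min_gpus_for_weights(num_params, bits_per_param, gpu_mem_gb):
    """Smallest GPU count whose combined memory can hold the weights."""
    gigabytes = weight_bytes(num_params, bits_per_param) / 1e9
    return max(1, math.ceil(gigabytes / gpu_mem_gb))


def place_experts(num_experts, num_gpus):
    """Hypothetical round-robin sharding: expert e goes to GPU e % num_gpus."""
    placement = {gpu: [] for gpu in range(num_gpus)}
    for expert in range(num_experts):
        placement[expert % num_gpus].append(expert)
    return placement


# Assumed example sizes: an 8B-parameter model quantized to 4 bits,
# and a hypothetical 1T-parameter model stored at 16 bits.
small = min_gpus_for_weights(8e9, 4, gpu_mem_gb=80)
large = min_gpus_for_weights(1e12, 16, gpu_mem_gb=80)
print(small, large)
print(place_experts(num_experts=8, num_gpus=4))
```

Under these assumed sizes the small model needs a single GPU for its weights, while the large one needs dozens before any working memory is counted, which is the situation where experts have to be spread across servers.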
Speaker 223:26 - 23:42
And when we think about compute costs, what costs the most money? Is that model size? Is that memory bandwidth? Is that latency? Do context windows, like those very, very large context windows, change anything about the compute cost? Speaker 223:26 - 23:42
那当我们考虑 compute cost(算力成本)时,最花钱的到底是什么?是模型大小吗?是 memory bandwidth(内存带宽)吗?是 latency(延迟)吗?还有 context window(上下文窗口),比如那种非常非常大的 context window,会不会改变 compute cost?
Speaker 223:42 - 23:44
What costs the most money? Speaker 223:42 - 23:44
到底什么最花钱?
Speaker 123:44 - 24:36
As I mentioned, like, the biggest component of the unit cost for a cloud service like this is the depreciation expense. And within that, you know, is basically some sort of bill of materials for the servers that are in the data center, which is by far and away the biggest portion of the cost. You talk about the capital stack, let's say, you can go back down to power generation, $2 to $3 million a megawatt, $2 to $3 billion a gigawatt for a power plant. The data center is between $10 and $15 billion a gigawatt for building the data center. And then the compute, the servers, can be anywhere from $35 to $45 billion a gigawatt. Speaker 123:44 - 24:36
正如我提到的,这类 cloud service(云服务)的单位成本里,最大的组成部分是 depreciation expense(折旧费用)。而在这其中,基本上就是数据中心里那些 server 的 bill of materials(物料清单)成本,这毫无疑问是成本里最大的一块。假设你从 capital stack(资本结构)来往下拆,你可以一直拆到发电:每兆瓦是 200 万到 300 万美元,也就是一座发电厂每吉瓦是 20 亿到 30 亿美元。数据中心本身的建设成本,则是每吉瓦 100 亿到 150 亿美元。然后 compute,也就是 server,成本大约在每吉瓦 350 亿到 450 亿美元之间。
Speaker 124:36 - 25:21
And so you can see the server portion is obviously by far and away the largest, and that's like a big part of the depreciation expense. And then within that, obviously, you have the sort of server and cluster bill of materials, which is primarily the GPUs. If you were to kind of break down NVIDIA's bill of materials, then, you know, you can kind of get better allocation towards where those costs are coming from. But certainly, in the most recent period of time, memory has gone up a lot in price. And, you know, there's very few vendors, right, for HBM memory, with Samsung, Hynix. Speaker 124:36 - 25:21
你可以看到,server 部分显然是远远最大的那一块,而那很大程度上对应 depreciation expense(折旧费用)。再往里拆,显然还有 server 和 cluster 的 bill of materials(物料清单),其中主要就是 GPUs。如果你去进一步拆解 NVIDIA 的 bill of materials,那么你就能更好地分配、判断这些成本究竟来自哪里。但至少在最近这段时间,memory 成本确实涨了很多,memory 的价格上涨非常明显。而且做 HBM memory 的 vendor 本来就很少,对吧,像 Samsung、Hynix。
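The per-gigawatt ranges quoted above can be turned into rough shares of the capital stack. This is a back-of-the-envelope sketch: the three ranges come from the conversation, but taking the midpoint of each range is an assumption made here purely for illustration.

```python
# Per-gigawatt cost ranges quoted in the conversation, in billions of USD.
CAPITAL_STACK = {
    "power generation": (2, 3),
    "data center": (10, 15),
    "servers": (35, 45),
}


def midpoint(cost_range):
    """Midpoint of a (low, high) range; an assumption, not a quoted figure."""
    low, high = cost_range
    return (low + high) / 2


def shares(stack):
    """Return each layer's share of the total, plus the total itself."""
    mids = {name: midpoint(rng) for name, rng in stack.items()}
    total = sum(mids.values())
    return {name: value / total for name, value in mids.items()}, total


layer_shares, total_per_gw = shares(CAPITAL_STACK)
for name, share in layer_shares.items():
    print(f"{name:>16}: {share:.0%}")
print(f"total: ~${total_per_gw:.0f}B per GW")
```

With midpoints, servers come out at roughly 70 percent or more of the total, which matches the point that the server bill of materials dominates the depreciation expense.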
Speaker 225:21 - 25:32
So you guys are a big NVIDIA shop. At a precise level, you mentioned some of the names, but, like, which chips do you use mostly? What's your kind of chip stack? Speaker 225:21 - 25:32
所以你们在很大程度上算是一个重度使用 NVIDIA 的团队。你提到了一些名字,但具体来说,你们主要用哪些 chip?你们的 chip stack 大概是什么样的?
Speaker 125:33 - 26:10
Yeah. So Lambda really loves NVIDIA's products. I mean, they're the only server provider, the only chip provider that is available in every single major cloud platform, which is a huge platform advantage. And we stuck with the NVIDIA sort of ecosystem for all of the chips we've deployed. And we've got everything from V100s, A100s, H100s, H200s, B200s, GH200s or GB200s, B300s, and VR200s coming soon. Speaker 125:33 - 26:10
对。Lambda 非常喜欢 NVIDIA 的产品。我的意思是,它们是唯一一家在每一个主要 cloud platform 上都能用到的 server provider、也是唯一的 chip provider,这是一种巨大的平台优势。我们部署的所有 chip 也都一直沿用 NVIDIA 的 ecosystem。我们手上什么都有,从 V100、A100、H100、H200、B200、GH200 或 GB200、B300,到即将推出的 VR200。
Speaker 126:11 - 26:17
And so we, you know, use everything in the ecosystem. Speaker 126:11 - 26:17
所以我们基本上把这个 ecosystem 里的东西都在用。
Speaker 226:17 - 26:27
Do you think that today or in the near future, we're going to be in a multisilicon kind of world? Is there, like, room for different players beyond NVIDIA? Speaker 226:17 - 26:27
你觉得今天,或者在不久的将来,我们会进入一个 multisilicon 的世界吗?也就是说,除了 NVIDIA 之外,其他参与者还有空间吗?
Speaker 126:27 - 27:03
Well, I mean, I think that we're already in a world where there's a huge amount of competition from massive, massive, multi-trillion-dollar companies, and they're all trying to fight for the same thing, which is to be the best chip in the world for running and training neural networks, essentially. NVIDIA's built a great product that has gotten a lot of distribution and has a great platform of developers who love what they do. And you have to take into account not just the cost of the chip. Right? The price of the chip is one aspect, but, you know, you have to take into account the entire software ecosystem and what's been developed. Speaker 126:27 - 27:03
嗯,我的意思是,我认为我们其实已经处在一个竞争极其激烈的世界里了,参与竞争的是一些规模巨大、巨大到市值数万亿美元的公司,而它们都在争夺同一件事:本质上,就是做出全球最适合运行和训练 neural networks(神经网络)的 chip。NVIDIA 打造了一个非常出色的产品,已经获得了广泛分发,并且拥有一个很强大的 developer 平台,开发者也非常喜欢他们做的东西。你必须考虑的不只是 chip 的成本,对吧?chip 的价格只是其中一个方面,但你还得把整个 software ecosystem,以及围绕它已经开发出来的东西都考虑进去。
Speaker 127:03 - 27:15
So people talk about, like, what's NVIDIA's moat? One of the big moats they've got is just the cuDNN stack. It's not just CUDA. CUDA is, sure, that's like the water we all swim in. Speaker 127:03 - 27:15
所以大家在谈 NVIDIA 的 moat(护城河)时,有一个很大的 moat 就是它的 cuDNN stack。不是只有 CUDA。CUDA 当然也是,那就像我们所有人赖以生存的水一样。
Speaker 127:15 - 27:22
But, like, cuDNN has got so many matrix multiplication routine optimizations baked into it. Speaker 127:15 - 27:22
但像 cuDNN 这样的东西,里面已经内建了非常多 matrix multiplication routine(矩阵乘法例程)的优化。
Speaker 227:22 - 27:24
What is cuDNN, for everyone to understand? Speaker 227:22 - 27:24
怎么用人人都能懂的话来解释 cuDNN 是什么?
Speaker 127:24 - 28:02
Yeah. Okay. So cuDNN is the CUDA Deep Neural Network library, and it's basically NVIDIA's, you can think of it like a highly tuned engine for matrix multiplication. And, basically, if you were to just sort of naively implement the matrix multiplication algorithm, you would maybe get a certain level of floating point operations per second. But they've gone and tuned every single aspect of it and come in and do Winograd filtering or a bunch of different algorithms that you would apply to speed up matrix multiplication. Speaker 127:24 - 28:02
对,好。所以 cuDNN 其实就是 CUDA Deep Neural Network(深度神经网络)库,基本上你可以把它理解为 NVIDIA 的一个为矩阵乘法高度调优的引擎。简单说,如果你只是很朴素地自己实现矩阵乘法算法,可能只能达到某个浮点运算每秒的水平。但他们把其中每一个环节都做了调优,还会用到 Winograd filtering 之类的各种算法,来加速矩阵乘法。
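To make "naive versus tuned" concrete, here is a small sketch of the same matrix product written two ways: the textbook triple loop, and a tiled (blocked) loop order of the kind tuned libraries use to keep data in fast memory. Tiling is chosen here only as a familiar, generic optimization; nothing in the conversation says this is what cuDNN does internally, and in pure Python the tiled version will not actually run faster. The point is only that the same arithmetic can be restructured.

```python
def matmul_naive(a, b):
    """Textbook triple loop: one output element at a time."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                out[i][j] += a[i][p] * b[p][j]
    return out


def matmul_tiled(a, b, tile=2):
    """Same product, computed block by block so each block is reused while 'hot'."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for p0 in range(0, k, tile):
                # Multiply one tile of a by one tile of b, accumulating into out.
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = out[i][j]
                        for p in range(p0, min(p0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        out[i][j] = acc
    return out


a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]]
print(matmul_naive(a, b))
print(matmul_tiled(a, b))
```

Both functions return the same matrix; what a tuned library adds is the choice of loop order, tile sizes, and algorithm for the specific hardware, so the user does not have to do that work.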
Speaker 128:02 - 28:47
And cuDNN means that you don't have to go and do the optimization yourself. And so that's one aspect. The other one is NCCL, pronounced "nickel," which is their networking optimization library where it will sense the topology and the connected nature of your network, your InfiniBand or your Ethernet network. And it will suggest an optimized sort of routine for doing, you know, all-reduce and broadcast, the different what are called OpenMPI primitives, which are used for that sharding that we were talking about for both training and for inference. And so that's, like, the kind of software stack that I think really is hard for a lot of the new entrants in the chip space to overcome. Speaker 128:02 - 28:47
而 cuDNN 的意义在于,你不用自己去做这些优化。这是一方面。另一方面是 NCCL(读作 nickel),这是他们的网络优化库,它会感知你的网络拓扑以及连接方式,不管是 InfiniBand 还是 Ethernet 网络。然后它会为诸如 all-reduce 和 broadcast 这些不同的操作,给出一种优化过的执行例程;这些操作属于所谓的 OpenMPI primitives,被用于我们刚才谈到的那种 sharding(分片),无论是在训练还是 inference(推理)中都会用到。所以,这一整套 software stack(软件栈),在我看来,确实是很多新进入 chip(芯片)领域的公司很难跨越的门槛。
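For readers who have not met all-reduce before: every participant starts with its own vector and ends with the element-wise sum of everyone's vectors. The sketch below simulates one well-known way to do that, a ring all-reduce, inside a single Python process. It is not NCCL's API, and which algorithm NCCL picks for a given topology is not something the conversation specifies; the function name and the one-element-per-chunk simplification are assumptions for the example.

```python
def ring_all_reduce(buffers):
    """Simulate a ring all-reduce (sum) over len(buffers) ranks.

    Simplifying assumption: each rank's buffer has exactly one element per rank,
    so chunk i is just element i.
    """
    n = len(buffers)
    if any(len(buf) != n for buf in buffers):
        raise ValueError("this sketch expects one element per rank in each buffer")
    data = [list(buf) for buf in buffers]

    # Phase 1, reduce-scatter: after n-1 steps, rank r holds the full sum of chunk (r+1) % n.
    for step in range(n - 1):
        # Snapshot all messages first, since real ranks send simultaneously.
        sends = [((r - step) % n, data[r][(r - step) % n]) for r in range(n)]
        for r, (idx, value) in enumerate(sends):
            data[(r + 1) % n][idx] += value

    # Phase 2, all-gather: pass the finished chunks around the ring.
    for step in range(n - 1):
        sends = [((r + 1 - step) % n, data[r][(r + 1 - step) % n]) for r in range(n)]
        for r, (idx, value) in enumerate(sends):
            data[(r + 1) % n][idx] = value

    return data


result = ring_all_reduce([[1, 2, 3], [10, 20, 30], [100, 200, 300]])
print(result)
```

Each rank only ever talks to its neighbor, which is why the physical topology of the network matters so much for how fast the collective completes.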
Speaker 128:48 - 28:59
I think, like I said, we're already in a world where there are multiple options for silicon. You know? The biggest labs in the world are using multiple different types of chips to do their inferencing and training on. Speaker 128:48 - 28:59
我觉得我们其实已经——就像我说的——处在一个有多种 silicon(芯片)可选的世界里了。你知道,全球最大的那些实验室,已经在用多种不同类型的芯片来做 inferencing(推理)和训练。
Speaker 228:59 - 29:08
What would be a plain English definition? We talked about the chips, but, like, the rest of the stack, the networking and the storage, just walk us through how it works. Speaker 228:59 - 29:08
如果用大白话来定义,会怎么说?我们刚才聊了芯片,但像 stack(栈)的其他部分,比如 networking(网络)和 storage(存储),你给我们顺着讲讲它是怎么工作的。
Speaker 129:08 - 29:59
When you're running a cloud service, one of the things, you know, you'll train your model or you'll upload your trained model and you're ready to start doing large scale inferencing. Well, you're gonna need a place to put your data, whether it's the data that you're using to train with or whether it's the data that's coming in and streaming in from your end customers. And so having high speed storage is a really important part of it. So Lambda offers basically an AI optimized file system service that is significantly faster than your standard, let's say, cloud file system, which is maybe more of a traditional NFS type of thing. This is a highly optimized parallel file system that's designed for high performance reads and writes, mostly high performance reads. Speaker 129:08 - 29:59
当你在运行 cloud service(云服务)时,有一件事是这样的:你会训练自己的 model(模型),或者上传已经训练好的模型,然后就准备用它开始进行大规模 inferencing(推理)。这时候,你需要一个地方来存放数据,不管是你拿来训练的数据,还是来自终端客户、持续流入的数据。所以,高速 storage(存储)是其中非常重要的一部分。Lambda 提供的基本上是一种面向 AI 优化的 file system service(文件系统服务),它明显快于标准的 cloud file system(云文件系统);后者可能更像传统的 NFS 类型方案。而这个则是一个高度优化的 parallel file system(并行文件系统),专为高性能读写而设计,不过主要还是高性能读取。
Speaker 129:59 - 30:01
That's kind of most of the workload. Speaker 129:59 - 30:01
这基本上就是大部分 workload(工作负载)。
Speaker 230:01 - 30:04
And that's something you build in house completely? Speaker 230:01 - 30:04
这个东西是你们完全内部自研构建的吗?
Speaker 130:04 - 30:14
We have, I mean, it was in house completely. Right? It's like, you have to ask the question, like, what is the definition of in house completely? Right? You know, like, we've never spun a PCB at this company. Speaker 130:04 - 30:14
我们的确,我是说,它完全是在 in house(内部自研)完成的。对吧?但这就得先问一个问题:到底怎么定义“完全 in house”呢?对吧?你知道,我们这家公司从来没有自己做过 PCB。
Speaker 130:15 - 30:56
We have not authored, for example, you know, we use KVM/QEMU for our virtualization, for example. Right? And so, you know, we have both commodity off the shelf hardware that has software installed on top of it for some of our storage. We have some storage partners that we work with as well. But, you know, generally speaking, everything that we do on the cloud, I would generally say, is something that, like, we rolled it ourselves with the help of the broader ecosystem because, again, there's no such thing as rolling it yourself unless you're, like, you know, mining, you know, ultra pure silicon from somewhere. Speaker 130:15 - 30:56
比如说,我们并没有自己去写所有东西,你知道的,我们的 virtualization(虚拟化)用的是 KVM/QEMU。对吧?所以,你知道,我们有一些 commodity off the shelf hardware(现成商用硬件),在上面安装软件后用于我们的部分存储。我们也有一些合作的存储伙伴。不过,总体来说,我们在 cloud(云)上做的所有事情,我通常都会说,基本上算是我们自己做出来的,但也借助了更广泛的 ecosystem(生态系统)的帮助;因为再说一次,除非你连 ultra pure silicon(超高纯度硅)都要自己从某个地方开采,否则根本不存在真正意义上的“全都自己造”。
Speaker 130:56 - 31:02
Like, you know, and then coming up with your own ASML. You know? It's funny. Speaker 130:56 - 31:02
然后还得再搞出你自己的 ASML。你知道吧?这就挺好笑的。
Speaker 231:02 - 31:10
Yeah. Yeah. That's the highly optimized storage. What else? The networking part and what other pieces? Speaker 231:02 - 31:10
对,对。那就是高度优化的存储。还有什么?networking(网络)这部分呢?还有哪些其他组件?
Speaker 131:10 - 31:25
So I was talking about this 1-Click Cluster product that we've got. And the way, just for everybody to think about this, is like, okay. Well, look. You've got a bunch of GPUs. Let's say you've got a cluster of 10,000 GPUs. Speaker 131:10 - 31:25
所以,我刚才说的是我们这个 1-Click Cluster 产品。为了让大家更容易理解,可以这么想:好,你手上有一大堆 GPU。假设你有一个由 10,000 个 GPU 组成的 cluster(集群)。
Speaker 131:26 - 31:50
Well, I wanna partition that cluster up. And so what it is is it's a bunch of GPUs. It's some CPU servers as well, because you need to have an orchestration fleet as well. And then you've got some storage. And all of the CPU servers and the storage servers and the GPU servers are interconnected with storage so they can quickly read and write from it. Speaker 131:26 - 31:50
那我想把这个 cluster 划分开。所以它本质上是一堆 GPU,也会有一些 CPU server(CPU 服务器),因为你还需要一套 orchestration fleet(编排控制服务器群)。然后你还有一些 storage(存储)。所有这些 CPU 服务器、存储服务器和 GPU 服务器,都会通过存储互联起来,这样它们就能快速地进行读写。
Speaker 131:52 - 32:29
And that communication happens over what's called the in band network. And then there's the compute fabric, which is where I was talking about where all of the sort of weights and feature activations are being shared throughout that compute fabric. And then there's an out of band monitoring network where you've got access to whether it's BMC or some of your DPUs. And when you are trying to create a sub partition of a 10,000 GPU cluster, you need to simultaneously partition the in band, the out of band, and the compute fabric. Okay? Speaker 131:52 - 32:29
而这些通信会通过所谓的 in band network(带内网络)发生。然后还有 compute fabric(计算互连网络),也就是我刚才说的,所有各种 weights(权重)和 feature activations(特征激活)都会在这张 compute fabric 里共享。除此之外,还有一张 out of band monitoring network(带外监控网络),你可以通过它访问 BMC 或者你的一些 DPU。而当你试图从一个 10,000 GPU 的 cluster 中创建一个子分区时,你需要同时对 in band、out of band 和 compute fabric 进行分区。明白吗?
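One way to picture "simultaneously partition the in band, the out of band, and the compute fabric" is as an all-or-nothing allocation: a tenant's slice is only valid if it gets nodes and an isolation domain on every one of the three networks. The sketch below is entirely hypothetical. The `Cluster` class, the integer "isolation ids" (standing in for whatever mechanism each network really uses), and the limit of four ids are invented for illustration and are not a description of Lambda's system.

```python
from dataclasses import dataclass, field

FABRICS = ("in_band", "out_of_band", "compute_fabric")


@dataclass
class Cluster:
    free_nodes: set
    # Next unused isolation id per fabric (a stand-in for a real isolation mechanism).
    next_id: dict = field(default_factory=lambda: {f: 1 for f in FABRICS})
    max_id: int = 4  # hypothetical limit on isolation domains per fabric


def carve_partition(cluster, tenant, node_count):
    """Reserve nodes and one isolation id on every fabric, or change nothing."""
    if node_count > len(cluster.free_nodes):
        raise ValueError("not enough free nodes")
    if any(cluster.next_id[f] > cluster.max_id for f in FABRICS):
        raise ValueError("a fabric has no isolation ids left")

    # All checks passed: only now mutate state, so a failure never
    # leaves a slice that is isolated on one network but not another.
    nodes = sorted(cluster.free_nodes)[:node_count]
    cluster.free_nodes -= set(nodes)
    isolation = {}
    for fabric in FABRICS:
        isolation[fabric] = cluster.next_id[fabric]
        cluster.next_id[fabric] += 1
    return {"tenant": tenant, "nodes": nodes, "isolation": isolation}


cluster = Cluster(free_nodes=set(range(8)))
slice_a = carve_partition(cluster, "team-a", node_count=4)
print(slice_a)
```

The design choice worth noticing is that validation happens before any state changes, which is a simple stand-in for the much harder coordination problem of keeping three real networks consistent.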
Speaker 132:29 - 33:19
So, like, that complex coordination, going from "we've got a bunch of bare metal systems" to "hey, we've got a virtualized system that has what's called RDMA." RDMA, Remote Direct Memory Access, allows them to read and write quickly, not just from the disks, but from each other's memory, the GPU's sort of HBM memory, and allows them to do that sort of direct memory access, going directly from a GPU to another GPU without getting copied to the CPU, for example. Having that all work is an immense, immense software undertaking. This is going back to the original question, like, well, what are people not getting about Neo Clouds? Speaker 132:29 - 33:19
所以,这里面那种复杂的协调工作——从“我们有一大批 bare metal(裸金属)系统”,到“我们有一个带有 RDMA 的虚拟化系统”——其中 RDMA 指的是 Remote Direct Memory Access(远程直接内存访问),它让这些系统不仅能快速读写磁盘,还能快速读写彼此的内存、GPU 的 HBM memory(高带宽内存);并且还能实现这种 direct memory access(直接内存访问),比如让数据直接从一个 GPU 到另一个 GPU,而不需要先复制到 CPU。要让这一整套都运转起来,是一项极其、极其庞大的软件工程。这也回到了最初那个问题:人们到底没有理解 Neo Clouds 的什么?
Speaker 133:19 - 34:15
Well, first of all, the answer is that most neo clouds don't have this kind of technology. Most neo clouds have not made the, really, it's like kind of high tens to hundreds of millions of dollars of software investment that you need to make to build a real cloud system that can partition a high performance computing environment like this. And then to have it all work with the storage. Anyways, I guess that sort of summarizes the steps that you need, and you can think about all the different moving parts of a modern, like, how does an AI data center work? People say AI data center. But really, you have to kind of go that one next level down, because, by the way, if you were to ask an AI data center landlord, a traditional one, what's going on inside of the data center, they'd be like, well, look, we're real estate people. Speaker 133:19 - 34:15
嗯,首先,答案是大多数 neo clouds 并不具备这种技术。大多数 neo clouds 并没有做出那种真正意义上的软件投入——大概需要数千万到上亿美元——你得投入这些,才能构建一个真正的 cloud system(云系统),去对这样的 high performance computing environment(高性能计算环境)进行分区和管理。所以这就是其中的关键;另外还得让这一切与 storage(存储)协同工作。总之,我想这大致概括了你需要的那些步骤。你也可以把它理解为现代系统中的各种 moving parts(活动部件):一个 AI data center(AI 数据中心)究竟是怎么运作的?人们会说“AI 数据中心”,但实际上你还得再往下一层去看。因为顺便说一句,如果你去问一个 AI data center 的 landlord(业主/房东),尤其是传统那种,“这个数据中心里面到底在发生什么?”他们多半会说:嗯,你看,我们是做房地产的。
Speaker 134:15 - 34:42
And we really outsource this to the GC. But the GC doesn't know, of course, anything that's going on inside of this; it's their tenants who know. So this is what's actually happening inside of an AI data center. And then it serves the result. Also, going back to the community stuff, if people knew a lot more about, well, this AI data center is actually just serving the ChatGPT request that I'm giving it. Speaker 134:15 - 34:42
而且我们实际上把这件事外包给了 GC。但 GC 当然也并不知道这里面具体发生了什么,真正知道的是他们的 tenants(租户)。所以,这才是 AI data center 内部实际上在发生的事情,然后它再把结果提供出来。再回到社区相关的话题,如果人们更清楚地知道,哦,原来这个 AI data center 实际上只是在处理我发出的 ChatGPT 请求。
Speaker 134:43 - 34:46
Sometimes they don't even realize that that's actually what an AI data center does. Speaker 134:43 - 34:46
有时候他们甚至都没有意识到,这其实就是 AI data center 的作用。
Speaker 234:46 - 34:54
So you mentioned tenants. Do you rent them? Do you also own some, build some, and where does that fit in the overall strategy? Speaker 234:46 - 34:54
所以你提到了 tenants。你们是租用这些设施吗?你们自己也拥有一些、或者自己建设一些吗?这在整体战略里处于什么位置?
Speaker 134:55 - 35:51
Yeah. So, you know, initially we started off as being primarily a renter, and we've actually started to get into the business of financing some of them, the construction of them ourselves, and we're going now into full vertical integration where we are identifying land, coming to the table with a basis of design, which is basically all the engineering diagrams to construct the data center, financing and constructing that data center, putting the servers in, and then associating that with, like, a long term offtake agreement with one of the major compute consumers in the world and financing it all. So, like, we're getting to full vertical integration at Lambda, and it's been great because we've been able to kind of, again, bring that engineering mindset to this problem, which was historically mostly run by people in real estate. Speaker 134:55 - 35:51
对。最开始我们主要是作为承租方,而现在我们实际上已经开始进入这样一块业务:我们会自己为其中一些项目提供融资,也会为它们的建设提供资金;同时我们现在也在走向完全的 vertical integration(垂直整合)。也就是说,我们会去识别土地,带着 basis of design(设计基础方案)进入项目——基本上就是建造数据中心所需的全部工程图纸——然后为这个数据中心融资并完成建设,把 servers(服务器)装进去,再把它和全球某个主要 compute(算力)消费方签订长期 offtake agreement(长期承购协议)关联起来,并为整套方案提供融资。所以,Lambda 正在走向完全的 vertical integration,而且这非常好,因为我们又一次能够把那种 engineering mindset(工程思维)带到这个问题上;而这个领域在历史上大多是由房地产行业的人在主导。
Speaker 235:51 - 35:58
In your own data centers, are you the sole tenant, or is part of the idea that you can also rent some to others? Speaker 235:51 - 35:58
在你们自己的数据中心里,你们是唯一的 tenant 吗?还是说你们的想法里也包括把一部分租给别人?
Speaker 135:58 - 36:19
In a lot of our data centers, we are the sole tenant. In terms of the data centers that we're planning on constructing, we don't yet have any plans to lease that space to others. So we're not trying to get into the data center leasing business. Maybe that's something that you can imagine down the road. I wouldn't rule it out completely. Speaker 135:58 - 36:19
在我们很多数据中心里,我们确实是唯一的 tenant。至于我们计划建设的那些数据中心,目前我们还没有任何把这些空间租给其他人的计划。所以我们并不是想进入 data center leasing(数据中心租赁)这个业务。也许你可以想象未来某一天会这样做,我不会把这种可能性完全排除。
Speaker 136:19 - 36:24
But for now, we have to focus on providing Lambda with the compute that we need to service the market. Speaker 136:19 - 36:24
但就目前而言,我们必须专注于为 Lambda 提供服务市场所需的 compute。
Speaker 236:24 - 36:26
How international are you, by the way? Speaker 236:24 - 36:26
顺便问一下,你们的国际化程度怎么样?
Speaker 136:26 - 37:06
I'd say that we're very much focused on North America. And so we have data centers in Canada, the United States, and Mexico. We're very much, like I say, primarily focused on North America, but really within that, the United States, obviously. And we haven't had this desire internally to try to go and expand into Europe or too far into Asia. We've done some partnerships with some of our great investors like SK Telecom, and we have a data center that we've operated in Korea, in Seoul. Speaker 136:26 - 37:06
我会说,我们的重点非常明确地放在 North America。我们在 Canada、United States 和 Mexico 都有 data center(数据中心)。就像我说的,我们主要聚焦 North America,但在这之中,显然尤其是 The United States。我们内部其实一直没有那种强烈意愿,要去扩展到 Europe,或者太深入 Asia。我们确实和一些很棒的投资方做过合作,比如 SK Telecom,而且我们也在 Korea 的 Seoul 运营过一个 data center。
Speaker 137:06 - 37:15
And so we have some experience with international. But right now, we're just like, look. Let's focus on the US market. It's where the opportunity is. Speaker 137:06 - 37:15
所以我们在国际业务方面是有一些经验的。但现在我们的想法就是:看,先专注 The US market。这才是机会所在。
Speaker 237:15 - 37:21
Do you need, for performance reasons, to be close to the customer the way you need to have regions in the cloud? Speaker 237:15 - 37:21
出于性能原因,你们是否需要像 cloud(云)那样靠近客户、设立不同 regions(区域)?
Speaker 137:21 - 37:45
You know, it's super interesting. I get this question a lot. And people, they're like, well, does latency matter? So I'll tell you what matters and what doesn't matter. You can look at your own utilization of whether it's ChatGPT or Claude or Grok or Gemini, and you can see, hey, a lot of the things that I'm doing, I kind of shoot it off. Speaker 137:21 - 37:45
你知道,这个问题特别有意思。很多人都会问我这个。大家会说,嗯,latency(延迟)重要吗?所以我来告诉你,到底什么重要,什么不重要。你可以看看你自己使用 ChatGPT 或 Claude 或 Grok 或 Gemini 的方式,你会发现,很多你在做的事情,其实就是把任务发出去。
Speaker 137:45 - 37:57
I come back later, and there's a research report for me. Maybe it's a long running agent workflow. In those cases, latency doesn't matter at all. The only thing that matters is your cost per token. That's all that matters. Speaker 137:45 - 37:57
过一会儿我再回来,就已经有一份 research report 给我了。也许这是一个长时间运行的 agent(智能体)workflow(工作流)。在这些情况下,latency 完全不重要。唯一重要的,就是你的每 token 成本。那才是唯一重要的事。
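Cost per token can be written down as simple arithmetic: what the hardware costs per hour divided by how many tokens it produces in that hour. The numbers in the sketch below (a $3.00 GPU-hour, 8 GPUs, 10,000 tokens per second) are made-up placeholders, not prices or throughput figures from the conversation.

```python
def cost_per_million_tokens(gpu_hour_usd, gpus, tokens_per_second):
    """USD per one million generated tokens for a deployment running flat out."""
    tokens_per_hour = tokens_per_second * 3600
    hourly_cost = gpu_hour_usd * gpus
    return hourly_cost / tokens_per_hour * 1_000_000


# Hypothetical deployment: 8 GPUs at $3.00 per GPU-hour, serving 10,000 tokens/s in aggregate.
baseline = cost_per_million_tokens(gpu_hour_usd=3.0, gpus=8, tokens_per_second=10_000)
# Same hardware, but batching that doubles throughput halves the unit cost.
batched = cost_per_million_tokens(gpu_hour_usd=3.0, gpus=8, tokens_per_second=20_000)
print(round(baseline, 4), round(batched, 4))
```

For a workload where nobody is waiting on the answer, anything that raises aggregate throughput on the same hardware lowers this number, even if each individual request takes longer.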
Speaker 137:58 - 38:24
And so that's been a really interesting change. I think that the old school, traditional legacy cloud business was so latency focused because of some of the applications. But this new fleet of AI applications are far less latency sensitive, so that's one. But there is the caveat, which is this. Governance and data governance is becoming an important thing. Speaker 137:58 - 38:24
所以这是一个非常有意思的变化。我认为,老派的、传统的 legacy cloud(传统云)业务之所以那么强调 latency,是因为某些应用场景确实如此。但这一波新的 AI 应用,对 latency 的敏感度要低得多,这是一点。不过这里也有一个 caveat(但书),就是:governance(治理)以及 data governance(数据治理)正变得越来越重要。
Speaker 138:24 - 38:44
And a lot of countries are wanting to have the AI compute that their citizens are using be run out of their own country so that they can at least have their perception of control or whatever. And that is an element to it. But I'd say that from the latency side, there's no technical reason. Speaker 138:24 - 38:44
很多国家都希望,本国公民所使用的 AI compute(AI 算力)能够在本国境内运行,这样他们至少能保有一种控制感,或者类似的东西。这确实是其中一个因素。但如果单从 latency 来说,我会认为并不存在技术层面的理由。
Speaker 238:44 - 38:52
Let's talk about the financing stack. So presumably, it's a combination of equity and debt. How does it all work? Speaker 238:44 - 38:52
我们来谈谈融资 stack(结构)。所以按理说,它是 equity(股权)和 debt(债务)的组合。整个机制是怎么运作的?
Speaker 138:52 - 39:26
Yeah. So the way that it works is that, you know, you could really fragment it into these two parts, which is like financing your on demand cloud versus financing an offtake agreement, which is like a longer term commitment. And on the on demand cloud, you're kind of looking at Lambda's credit quality. On the offtake agreement, you're kind of looking at the credit quality of the end customer who's paying the bill. And so what you do is you just take your offtake agreement. Speaker 138:52 - 39:26
对,运作方式基本上可以拆成两部分:一部分是为你的 on-demand cloud(按需云)融资,另一部分是为 offtake agreement(承购协议)融资,后者更像是一种长期承诺。对于 on-demand cloud,你看的更多是 Lambda 自身的信用质量;对于 offtake agreement,你看的更多是最终付账客户的信用质量。所以你要做的,就是拿你的 offtake agreement 来操作。
Speaker 139:26 - 40:08
You take this chunk of GPUs that you're deploying. You take a lease or the property, and you put it into a box. And you can go to the private credit markets, and you can come up with an asset based loan. There's a variety of different methodologies for financing it. Most of it is just some sort of, like, special purpose vehicle that's designed to finance this particular deployment with a very known and easy to underwrite risk, and underwriting is basically just a fancy finance term for assessing the risks and the downsides of a particular credit investment. Speaker 139:26 - 40:08
你把这批正在部署的 GPUs,加上 lease(租约)或 property(物业/资产),一起装进一个“盒子”里。然后你就可以去 private credit markets(私募信贷市场),做一笔 asset-based loan(资产支持贷款)。融资方式其实有很多种。大多数情况下,本质上就是设立某种 special purpose vehicle(特殊目的载体,SPV),专门为这次特定部署融资;它的风险已知,也很容易 underwrite(承做风险评估),说白了,这只是金融术语,意思就是评估某一笔 credit investment(信贷投资)的风险和下行空间。
Speaker 140:08 - 40:46
And there's a vibrant credit market for that. On the on demand cloud side of things, it's not quite as mature as when there's, for example, an investment grade offtake agreement. But it's becoming more and more mature. And in general, creditors and lenders are really starting to understand the value of an NVIDIA chip. Because, you know, you actually look at the chips that we deployed in 2023, H100s. Speaker 140:08 - 40:46
而且这方面的信贷市场相当活跃。至于 on-demand cloud 这一边,它还没有成熟到比如 investment grade(投资级)的 offtake agreement(承购协议)那种程度,但也在变得越来越成熟。总体来说,creditors(债权人)和 lenders(放贷方)已经开始真正理解 NVIDIA chip(芯片)的价值。因为如果你去看我们在 2023 年部署的那些 chips,也就是 H100。
Speaker 140:46 - 41:16
We're now leasing those out at a higher rate than we were originally in 2023. So these creditors are starting to look at these assets and say, wow. This is an asset that is very valuable and also easy for us to underwrite. And, of course, while they are underwriting towards the actual cash flows that are coming out of that agreement, just as an asset class overall, people are realizing that this is a really great opportunity, and so creditors are starting to flock to these deals. Speaker 140:46 - 41:16
我们现在把这些卡租出去的价格,比 2023 年刚开始时还更高。所以这些债权人开始看着这类资产说,哇,这是一种非常有价值、而且也很容易做 underwrite(风险评估)的资产。当然,他们在做评估时,仍然会围绕协议实际产生的 cash flows(现金流);但就整个 asset class(资产类别)而言,大家都在意识到,这是一个非常好的机会,因此债权人正开始涌向这类交易。
Speaker 241:16 - 41:32
You're renting an H100 at a higher rate because why? Because the demand for compute is so rapid that people will take anything, or the technical depreciation of the product is slower than people thought? What drives that? Speaker 241:16 - 41:32
你们把 H100 租出更高的价格,原因是什么?是因为 compute(算力)需求增长太快,以至于市场愿意接盘,还是因为产品的 technical depreciation(技术性贬值)比人们原来预期得更慢?到底是什么在驱动这一点?
Speaker 141:32 - 41:58
Well, what's driving it, I mean, it's certainly the demand being high increases the price that you're able to get in the market. There's no question about that fundamental law. Again, going back to what people didn't understand about this market. There were people who were saying, oh, well, there's a, you know, there's a five year lifetime or three year lifetime. I even heard some people say three year lifetime for these GPUs. Speaker 141:32 - 41:58
驱动因素嘛,当然首先是需求旺盛会推高你在市场上能拿到的价格,这一点毫无疑问,属于最基本的规律。还是回到大家之前没有理解这个市场的地方。当时有些人会说,哦,这类 GPUs 的寿命是五年,或者三年。我甚至还听到有人说这些 GPUs 只有三年寿命。
Speaker 141:58 - 42:22
This is completely false. We have GPUs that we've commissioned, and we're one of the earliest NeoClouds, in fact. We're probably the only neo cloud that actually has GPUs in our fleet that are fully depreciated from an accounting perspective, right, which is most people are adopting around a six year accounting depreciation schedule. But that's not the usable life. The usable life is longer than the accounting depreciation schedule. Speaker 141:58 - 42:22
这完全是错误的。我们有一些已经投入使用的 GPUs,而事实上我们是最早的一批 NeoClouds 之一。我们可能也是唯一一家 neo cloud:我们的 fleet(设备机群)里确实有一些 GPUs,从 accounting(会计)角度看已经 fully depreciated(完成折旧)了,对吧;因为大多数公司采用的会计折旧周期大约是六年。但那并不等于可用寿命。可用寿命是长于会计折旧周期的。
Speaker 142:22 - 42:36
And what really matters is the economic usable life. And so what we're starting to see is that the people who are the naysayers, oh, this is gonna be you're gonna throw these GPUs out in five years, are completely wrong. They're completely wrong, and they've been wrong the entire time. Speaker 142:22 - 42:36
真正重要的是它在经济上的可用寿命。所以我们现在开始看到,那些唱衰的人——哦,这些 GPU 五年后就得扔掉——完全错了。他们完全错了,而且一直以来都错了。
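The gap between accounting life and economic life is easy to see with straight-line depreciation. In the sketch below, the six-year schedule is the figure mentioned in the conversation; the $30,000 purchase price and the choice of straight-line depreciation are assumptions for illustration, not details from the discussion.

```python
def straight_line_book_value(purchase_price, schedule_years, age_years):
    """Accounting book value under straight-line depreciation, floored at zero."""
    remaining_fraction = max(0.0, 1 - age_years / schedule_years)
    return purchase_price * remaining_fraction


PRICE = 30_000   # hypothetical purchase price in USD
SCHEDULE = 6     # accounting schedule in years, as quoted in the conversation

for age in (0, 3, 6, 8):
    value = straight_line_book_value(PRICE, SCHEDULE, age)
    print(f"year {age}: book value ${value:,.0f}")
```

Book value reaches zero at the end of the schedule, but nothing in that calculation switches the hardware off; a card that still rents out after year six is generating revenue against an asset the books already treat as fully expensed.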
Speaker 242:36 - 42:49
Do you think there is going to be, or do you already see happening some kind of a financial market for compute units with trading and derivatives. Is that is that happening? Speaker 242:36 - 42:49
你觉得会不会出现,或者你是否已经看到某种针对 compute units(计算单元)的金融市场,带有交易和 derivatives(衍生品)?这件事正在发生吗?
Speaker 142:49 - 43:45
I'm starting to see some people, you know, start to examine what a maybe vibrant spot market, you know, first, you need to have a spot market for something before then you can establish, you know, a derivative like a future or other more exotic things. I'm starting to see that, but fundamentally, I think that the asset class is just starting to mature, and creditors are starting to become very comfortable with investing in the credit side of buying NVIDIA GPUs and deploying them into data centers. And we don't need to get too fancy with it. That's kind of part of my opinion, is that I think that that market is starting to mature, that maybe an eventuality is having more complex securities that surround GPUs. But I think for right now, people are starting to realize it's a great credit investment. Speaker 142:49 - 43:45
我开始看到一些人在研究,也许会先出现一个活跃的 spot market(现货市场);首先,你得先有一个现货市场,之后你才能建立 derivative(衍生品),比如 future(期货)或者其他更复杂、更“异域”的东西。我开始看到这种苗头,但从根本上说,我认为这个 asset class(资产类别)才刚刚开始成熟,而 creditors(债权人)也开始对投资于购买 NVIDIA GPUs 并将其部署到 data centers(数据中心)这一 credit side(信贷端)感到非常放心。我们其实不需要把它搞得太花哨。我的一部分看法是,这个市场正在走向成熟,未来也许会出现围绕 GPU 的更复杂的 securities(证券),但我认为就目前而言,人们开始意识到,这是一项非常好的 credit investment(信贷投资)。
Speaker 143:45 - 43:54
And that's what's changed, I'd say, over the last year, is that people have started to really treat it like a more mature asset class. Speaker 143:45 - 43:54
我会说,这就是过去一年里发生的变化:人们开始真正把它当作一种更成熟的 asset class(资产类别)来看待。
Speaker 243:54 - 44:06
Maybe quickly just go back to the very origin because I think you've been effectively in the AI world the whole time, but are coming from a very different angle, with multiple pivots. What did you start with and when? Speaker 243:54 - 44:06
也许我们可以很快回到最初的起点,因为我觉得你其实一直都身处 AI 这个世界里,只是切入角度非常不同,而且中间经历了多次 pivot(转向)。你最开始做的是什么?又是什么时候开始的?
Speaker 144:06 - 44:54
Well, you know, with the complexity of the business, you can now see, you know, the complexity, the capital intensity, just the sort of not fitting into a box. You can see why we've oftentimes not had a lot of traditional venture investors in Lambda. And our investors have done exceptionally well, but they've kind of come from more often than not outside of traditional, let's say, mainline Silicon Valley VCs. So just going back to the origin story, I started Lambda in 2012, and we were a facial recognition software company. So I was training convolutional neural networks to do face and image recognition, and we eventually hosted that on an API. Speaker 144:06 - 44:54
嗯,你知道,随着这个业务的复杂性提升——你现在也能看出来,它的复杂程度、资本密集度,以及那种很难被归进某个固定框里的特性——你就能明白,为什么 Lambda 往往没有吸引到很多传统意义上的 venture investors(风投投资人)。当然,我们的投资人表现都非常好,但他们更多时候并不是来自传统的、所谓主流的 Silicon Valley VCs。回到最初的故事,我是在 2012 年创办 Lambda 的,当时我们是一家 facial recognition(人脸识别)软件公司。所以我当时在训练 convolutional neural networks(卷积神经网络)来做人脸和图像识别,后来我们把这项能力托管在一个 API 上。
Speaker 144:54 - 45:19
I was training those convnets on a 4x NVIDIA GTX 580 workstation that I had bought from a friend who had built it, actually. And this was really pretty avant garde stuff at the time. Most people didn't really believe in the field that was called deep learning at the time. Speaker 144:54 - 45:19
我当时是在一台 4x NVIDIA GTX 580 workstation(工作站)上训练这些 convnets(卷积网络)的,那台机器其实是我从一个朋友那里买来的,是他自己组装的。放在当时,这真的算是相当前卫的东西。那时候大多数人其实并不相信当时那个被称作 deep learning(深度学习)的领域。
Speaker 245:19 - 45:24
And that was inspired by the ImageNet 2012 moment, or was that even before that?
那是受到 ImageNet 2012 那个时刻的启发吗,还是说甚至比那还要早?
Speaker 145:24 - 45:48
The ImageNet moment, you know, I pulled the cuda-convnet repo off of Google Code. That's how you know how old Lambda is: Google Code was still around. And I pulled the cuda-convnet code base and was, like, playing around with it. I got very lucky that the AlexNet paper had been published the same year that Lambda was founded. It's not a coincidence at all.
你知道,那个 ImageNet 时刻,我从 Google Code 上拉下了 cuda-convnet repo。你就能看出来 Lambda 有多老了,那时候 Google Code 还在。我把那套 cuda-convnet code base 拉下来后,就一直在上手折腾。我也非常幸运,AlexNet 论文正好是在 Lambda 成立的同一年发表的。这完全不是巧合。
Speaker 145:48 - 46:18
It's not a coincidence at all. We launched this face recognition API, got a couple thousand users, but it wasn't really generating a ton of cash. As part of the complex story of startups, in parallel, I found these guys who had just graduated from their PhD programs, gentlemen Zach and Nico. And they had said, Hey, we're gonna start a company. I said, Hey, let me help you guys out. Speaker 145:48 - 46:18
这完全不是巧合。我们上线了这个人脸识别 API,拿到了几千个用户,但它并没有真正带来很多现金流。按照 startup(创业公司)那种复杂故事的一部分,与此同时,我认识了这几个人——刚从 PhD 项目毕业的两位,Zach 和 Nico。他们当时说,嘿,我们要创业了。我说,嘿,让我来帮帮你们。
Speaker 146:18 - 46:37
I'm gonna work with you for a year. I'm gonna learn a little bit more about neural networks. And I helped them out on this company, helped them get a company called Perceptio started. I was the first employee there while I was running Lambda. We were running these convnets locally on the iPhone.
我要和你们一起干一年。我也想多学一点 neural networks(神经网络)。后来我帮他们做这家公司,帮他们把一家叫 Perceptio 的公司启动起来。我在那儿是第一位员工,同时我还在运营 Lambda。我们当时是在 iPhone 本地跑这些 convnets。
Speaker 146:38 - 47:16
And again, this is 2013, so we were using the GPUImage library and just straight OpenGL ES shaders, like those shaders that are used for rendering. We were using those to run the convnets on the iPhone. Eventually, I left to go continue to work on Lambda full time, and then probably about a year or so later, they got acquired by Apple. If you know the feature on your iPhone where you swipe up on an image and you recognize faces and search through your library, that's maybe some of the stuff that eventually got integrated into iOS through that acquisition. And then Lambda, we continued on.
再说一次,那可是 2013 年,所以我们当时用的是 GPUImage library,还有最原始的 OpenGL ES shaders,就是那种拿来做 rendering(渲染)的 shader。我们就是用这些东西在 iPhone 上跑 convnets。后来,我离开了,继续全职做 Lambda;大概又过了一年左右,他们被 Apple 收购了。如果你知道 iPhone 上那个功能,在图片上向上滑,就能识别人脸、搜索你的图库,那其中也许就有一些东西,最后是通过那次收购被整合进 iOS 的。然后 Lambda 这边,我们继续往前做。
Speaker 147:16 - 47:30
We had a variety of different products, everything from Lambda Hat, which was a baseball cap that took a picture every ten seconds with a camera embedded in the tip of the brim, for gathering data sets for image and face recognition.
我们做过各种不同的产品,什么都有,比如 Lambda Hat——那是一顶棒球帽,帽檐前端嵌了一个摄像头,每十秒拍一张照片,用来为图像识别和人脸识别收集数据集。
Speaker 247:30 - 47:37
Which is fascinating because fast forward to today, and that's a whole segment, right? Capturing everyday life to train the AI? Speaker 247:30 - 47:37
这就很有意思了,因为快进到今天,这已经成了一个完整的细分赛道,对吧?捕捉日常生活,用来训练 AI?
Speaker 147:37 - 48:08
It goes to show: one, it's important to be able to see the future. It's also important to get your timing right as well, right? And now it all worked out, right, despite maybe that Lambda Hat product not being great, but it taught me a lot about how to build hardware. I lived in Shenzhen for a little bit, working on the PCB and spinning the PCB and designing the actual hardware product. And it taught me how to make consumer electronics.
这说明了两点:第一,能够看见未来很重要;第二,把握好时机也同样重要,对吧?而现在回头看,一切最终都算是有了结果。虽然 Lambda Hat 这个产品本身可能并不算成功,但它确实让我学到了很多关于如何做硬件的东西。我还在 Shenzhen 住过一阵子,做 PCB,反复迭代 PCB,并设计实际的硬件产品。这些经历教会了我怎么做消费电子。
Speaker 148:08 - 48:36
And that was actually a huge, huge skill because it totally opened my mind to new ways of doing business that aren't just making apps. Right? And eventually, we had this product called Dreamscope, which became really popular in 2015 and 2016. And it was basically using the Google Deep Dream methodology of using a ConvNet to generate images. It's like an early version of Midjourney or whatever.
而那其实是一项非常非常重要的技能,因为它彻底打开了我的思路,让我看到了不只是做 app 的新商业方式。对吧?后来,我们有了一个叫 Dreamscope 的产品,它在 2015 和 2016 年变得非常火。它本质上就是使用 Google Deep Dream 那套方法,用 ConvNet 来生成图像。某种意义上,它有点像 Midjourney 之类产品的早期版本。
Speaker 148:37 - 49:07
And Deep Dream and the Leon Gatys style transfer algorithm allowed you to turn a photo into a painting, basically. And we got a million users on that, processed tens of millions of images, maybe 15,000,000 images or something like this. And that caused us to have a huge AWS bill. It was like $40,000 a month or something. And so to replace that, we ended up building a little cluster out of workstations.
然后,Deep Dream 和 Leon Gatys 的 style transfer algorithm(风格迁移算法)基本上让你可以把一张照片变成一幅画。那个产品拿到了 100 万用户,处理了数千万张图片,大概有 15,000,000 张之类的。结果就是我们收到了巨额 AWS 账单,大概每个月 $40,000 左右。为了替代这个,我们最后用一堆工作站搭了一个小型 cluster(集群)。
Speaker 149:07 - 49:29
And then there was a $60,000 CapEx that we were terrified to make, by the way. We were so, so scared that doing this CapEx was gonna put us out of business. We made it out of workstations because we thought, oh, well, worst case scenario, we can just sell them. And so lo and behold, we did end up, you know, turning it online, and it brought the bill down to zero. So it paid itself back in a month and a half.
顺便说一句,当时还有一笔 $60,000 的 CapEx(资本性支出)我们根本不敢花。我们当时非常非常害怕,觉得这笔 CapEx 可能会直接把公司搞垮。我们之所以用工作站来搭,是因为我们想,最坏情况下,还可以把它们卖掉。结果不出所料,我们真的把它上线了,然后账单就降到了零。所以它在一个半月内就回本了。
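The payback figure follows from simple arithmetic on the numbers quoted in the conversation. The sketch below is illustrative only: it assumes the roughly $40,000 monthly AWS bill dropped fully to zero and ignores power, hosting, and labor costs, none of which are stated in the interview. The function name `payback_months` is a made-up helper.

```python
def payback_months(capex: float, monthly_savings: float) -> float:
    """Months needed for avoided monthly spend to cover an upfront purchase."""
    if monthly_savings <= 0:
        raise ValueError("monthly_savings must be positive")
    return capex / monthly_savings


# Figures as quoted by the speaker; both are approximate.
capex = 60_000            # one-time cost of the workstation cluster, USD
monthly_savings = 40_000  # AWS bill eliminated per month, USD

print(payback_months(capex, monthly_savings))  # 1.5
```

The result of 1.5 months matches the "month and a half" the speaker recalls.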
Speaker 149:29 - 50:00
And we thought, oh, this is like, we're saving more money than we're making. Maybe we should be in the business of providing compute to other AI researchers. And thus, we started selling workstations and servers and started developing our cloud platform, maybe did $3,000,000 of revenue in 2017 that first year selling workstations, then 10,000,000 in 2018, then 30,000,000 in 2019. We grew the hardware business over the next couple of years to probably about $200,000,000 run rate. And then the cloud business, we really started in 2019. Speaker 149:29 - 50:00
然后我们就想,哦,这个事情有点意思——我们省下的钱比赚到的钱还多。也许我们应该去做给其他 AI 研究者提供 compute(算力)的生意。于是我们开始卖工作站和服务器,也开始开发我们的 cloud platform(云平台)。2017 年,也就是第一年卖工作站时,收入大概做到了 $3,000,000;2018 年到了 $10,000,000;2019 年到了 $30,000,000。接下来的几年里,我们把 hardware business(硬件业务)做到了大概 $200,000,000 的 revenue run rate(收入年化运行率)。而 cloud business(云业务)则是我们在 2019 年真正开始做起来的。
Speaker 150:00 - 50:27
And we started developing before then, but we started really marketing it. And it kind of was slow to grow, to be honest, because not a lot of people in 2018 and 2019 and 2020 wanted a bunch of AI compute. There was a pretty niche market for it. But eventually, our cloud business continued to grow, and now it's at, you know, a little bit under a billion dollar revenue run rate. We've fully exited the hardware business.
在那之前我们其实就已经开始开发了,但那时才真正开始 marketing(市场推广)。老实说,它一开始增长得有点慢,因为在 2018、2019 和 2020 年,并没有很多人想要大量 AI compute(AI 算力)。当时这是个相当 niche(小众)的市场。不过最终,我们的云业务还是持续增长,现在它的 revenue run rate(收入年化运行率)已经接近 10 亿美元。我们也已经完全退出了硬件业务。
Speaker 150:27 - 50:34
And so, yeah, Lambda's got an absolutely wild founding story, to summarize.
所以,嗯,总结一下,Lambda 的创始故事确实非常传奇。
Speaker 250:34 - 50:42
Are some of the people that were there at the beginning still around? I think you started the company with your brother. Is that right? And your brother is is still at the company? Speaker 250:34 - 50:42
当初最早那批人在现在还在吗?我记得你是和你 brother 一起创办这家公司的,对吗?而且你 brother 现在也还在公司里?
Speaker 150:42 - 51:22
Yeah. And so in terms of, like, the early people, basically, it's not I mean, not even basically. Of the four people who are making Dreamscope, me, Michael Balaban, my cofounder and fraternal twin brother, Chuang Li, who's our chief scientific officer, and then Steve Clarkson, who's an engineering leader at the company and has a bunch of folks reporting into him. Now, they're all still at the company. The next hire, one of those gentlemen named Mitesh Agrawal, who was one of the next hires in that team, he was with the company for maybe eight years or something like this. Speaker 150:42 - 51:22
对。说到最早那批人,基本上——其实也不用说“基本上”——当年做 Dreamscope 的四个人,也就是我、我的 cofounder(联合创始人)兼异卵双胞胎 brother Michael Balaban、我们的 chief scientific officer(首席科学官)Chuang Li,以及 Steve Clarkson——他现在是公司里的 engineering leader(工程负责人),下面带着不少人——他们现在都还在公司。再往后的下一位招聘进来的人里,有一位叫 Mitesh Agrawal 的先生,他是那个团队后续较早加入的人之一,在公司大概待了八年左右。
Speaker 151:23 - 51:58
Five, six. Yeah, something like eight years. And then he eventually left and joined another former Lambda team member, Thomas Sohmers, to start Positron, which is an accelerator company. And they're now valued at over $1,000,000,000. So not only has the original team stuck around, but we've already started to kind of see what a Lambda alumni, a Lambda mafia network looks like in the world.
五六年。对,差不多八年吧。后来他最终离开了,并和另一位前 Lambda 团队成员 Thomas Sohmers 一起创办了 Positron,这是一家 accelerator(加速器)公司。现在他们的估值已经超过 $1,000,000,000。所以,不仅最初的团队留了下来,我们甚至已经开始有点看到所谓 Lambda alumni(前员工)网络,或者说 Lambda mafia network,在这个世界里逐渐成形。
Speaker 151:58 - 52:00
Lambda lab member alumni. Speaker 151:58 - 52:00
Lambda lab 的成员校友。
Speaker 252:00 - 52:05
How did you keep the band together during the difficult times?
在那些艰难时期,你是怎么让团队继续凝聚在一起的?
Speaker 152:05 - 52:37
Just when you're running a startup company that's this capital intensive, working capital intensive as well, you just get a lot of shocks to the system as you're growing. And then COVID: in April of COVID, software companies were feeling great because they could ship software and there was so much more demand. Hardware companies, the docks were closed, you couldn't ship revenue in April and March. And so, I mean, I remember all these things really distinctly.
当你经营一家像这样资本密集、同时也高度依赖营运资金的 startup company(初创公司)时,随着公司的成长,整个系统会不断受到各种冲击。然后又遇上了 COVID,在 COVID 暴发后的 4 月,software companies(软件公司)感觉都很好,因为他们还能交付软件,而且需求还更高了。可对硬件公司来说,码头都关闭了,你在 3 月和 4 月根本没法把收入发出去。所以,我是说,我对这些事都记得特别清楚。
Speaker 152:37 - 53:24
I think I remember just getting in front of the team, like, hey, look, it's really tough right now. And, there's certainly a feeling that we're not sure if we're going to make it through this. But the only thing to do is just to suck it up and enjoy the pain, run through it, and come up with the solutions to the problems that you're presented with, all in the service of delighting customers. Because fundamentally, I mean, the big thing is just aligning people towards the only reason we're all here is to build something that people want and they love so much that they tell their friends about it and they give you money. And then everything else, it just follows from that customer experience of delighting customers with what you do. Speaker 152:37 - 53:24
我记得我当时就是站到团队面前,说,大家看,现在真的很难。而且,我们确实有一种感觉,就是不确定自己能不能挺过这一关。但唯一能做的,就是咬牙扛下来,甚至享受这种痛苦,冲过去,把摆在面前的问题一个个想办法解决,而这一切都是为了让客户感到惊喜。因为从根本上说,最重要的事情就是让所有人都对齐到一点上:我们之所以都在这里,唯一的原因,就是去打造人们真正想要的东西,打造他们喜欢到会主动推荐给朋友、并愿意为之付钱的东西。其他一切,其实都源自这种客户体验——也就是你通过自己所做的事让客户感到惊喜。
Speaker 153:24 - 53:56
When we do onboarding, for example, I used to do this thing called Lambda 101, and we would show a picture of, like, a Linux penguin. And he was, like, on a Lambda workstation, and he was reading the GPT-2 paper. And the training had a loss curve, which is, like, what you look at if you're doing machine learning research. I was like, look. Just put yourselves in the shoes of the penguin who's using this workstation or cloud service to train a neural network and just think about what's gonna delight them.
比如我们做 onboarding(入职培训)的时候,我以前会做一个叫 Lambda 101 的环节,我们会展示一张图片,里面有一只 Linux 企鹅。它坐在一台 Lambda workstation(工作站)前,正在读 GPT-2 paper(论文)。训练界面上还有一条 loss curve(损失曲线),这基本就是你做 machine learning research(机器学习研究)时会看到、会盯着看的东西。我当时想表达的是:设身处地把自己放到这只企鹅的位置上,想象它正用这台 workstation(工作站)或 cloud service(云服务)训练一个 neural network(神经网络),然后去思考,什么会真正让它感到惊喜。
Speaker 153:56 - 54:30
You know, whether it's people on our shipping team who said, hey, let's put some T-shirts inside of the boxes. And so every workstation came with a Lambda T-shirt. Or members of the data center operations team said, hey, you know what? We should do a white rack because that'll set us apart and make everything look good, and we'll be really proud to showcase that. And those are the types of things that, as you imbue your company with that kind of delight-the-customer-first mentality, I think help you get through the hard times.
比如说,我们 shipping team(发货团队)里有人提议:要不我们在箱子里放几件 T 恤吧。于是每台 workstation(工作站)都会附带一件 Lambda T-shirt。又比如 data center operations team(数据中心运维团队)里有人说,你知道吗?我们应该用 white rack(白色机架),这样会让我们显得与众不同,也会让整体看起来更漂亮,我们自己展示出来也会特别自豪。像这样的事情,就是当你把那种“以客户惊喜为先”的 mentality(思维方式)注入公司之后,会帮助你熬过困难时期的那类东西。
Speaker 254:30 - 54:46
A recent evolution in that journey is that you just brought on a new CEO, my fellow French countryman Michel Combes, to run the business. Walk us through the thinking and what led you to make the decision and how that equips the company for the next chapter.
这段历程最近的一个新变化是,你刚刚请来了一位新的 CEO,也就是和我一样来自法国的 Michel Combes,来负责公司的业务。跟我们讲讲你背后的思考吧:是什么促使你做出这个决定,以及这个决定会如何让公司为下一个阶段做好准备。
Speaker 154:46 - 55:23
It's a huge honor as a founder to get to the point where the company can afford to bring on amazing talent like Michel in that seat. Just because if you think about it, I'd say at most companies, it's not uncommon that, for a lot of people, sometimes maybe there's a component of ego involved where they have to be the founder CEO. I've never really personally had that. I care about the technology, as you can tell. I care about building a great generational company.
对创始人来说,能走到公司有能力请来像 Michel 这样优秀的人才坐上这个位置,是一种巨大的荣誉。因为如果你仔细想想,我会说大多数公司里,别人这么讲其实并不少见:很多人有时候——也许其中确实有 ego(自我)的成分——会觉得自己必须同时是 founder(创始人)和 CEO。我个人其实从来没有那种执念。你也看得出来,我在乎的是技术。我在乎的是打造一家真正伟大的、能够跨时代传承的公司。
Speaker 155:23 - 55:40
And I think there's so many different seats to do that from. And so getting to the point of maturity where we could afford to bring on a CEO like Michel, who has experience. Like, he was, obviously, previously SoftBank International CEO, Sprint CEO.
我觉得可以从很多不同的位置来做这件事。所以当公司发展到足够成熟、能够请来像 Michel 这样有经验的 CEO 时,这就是一个重要节点。比如他之前显然做过 SoftBank International CEO、Sprint CEO。
Speaker 255:40 - 55:41
Alcatel.
Alcatel。
Speaker 155:42 - 56:22
Alcatel. He's on the board of some really amazing companies, including McLaren, which is kind of a fun one. I always did the fundraising and capital formation and day-to-day business management as a necessity and not because that's what I really love doing, for example. And I think there's plenty of founder CEOs who absolutely love every aspect of their CEO job. I think that privately, and it'll be very hard for you to get this out of any founder CEO oftentimes. But, like, secretly, when I talk with founder CEOs, I'm always like, yeah.
Alcatel。他还是一些非常了不起的公司的董事会成员,包括 McLaren,这个就挺有意思的。我一直以来做 fundraising(融资)、capital formation(资本形成)以及日常业务管理,更多是出于现实需要,而不是因为我真的热爱这些事。比如说,我觉得确实有不少 founder CEO(创始人 CEO)是真心热爱 CEO 工作的方方面面。我觉得私下里,而且很多时候你很难从任何 founder CEO 嘴里听到这个,但是,像我和 founder CEO 私下聊天时,我总会说,嗯,对。
Speaker 156:22 - 56:26
So, like, how much do you hate it?
所以,比如说,你到底有多讨厌这件事?
Speaker 256:27 - 56:33
I find it shocking that people don't find speaking to VCs all day exciting, but I will definitely work for it.
我觉得很震惊,居然有人不会觉得整天和 VC(风险投资)聊天很令人兴奋,不过如果需要的话,我当然还是会去做。
Speaker 156:33 - 57:22
It's been an amazing experience for me to be able to form a team around the company and to just see everybody flourishing in the things that they love to focus on. So for example, now that I'm the CTO, one of the main things I'm focused on is what does rapid data center deployment look like at the company? And kind of working to say, hey, I want Lambda to be this sort of vertically integrated, high-velocity powerhouse, so that when you look at the world, you say, alright, there's two companies in the world that can do high-velocity deployments: SpaceX AI and Lambda, where we're just extremely focused on how do you cut every little piece out of the process to stand up compute faster.
对我来说,这一直是一段非常棒的经历:能够围绕公司组建一支团队,也能看到每个人都在自己真正喜欢专注的事情上蓬勃发展。比如说,现在我是 CTO,我主要关注的一件事就是,公司里的 rapid data center deployment(数据中心快速部署)应该是什么样子。也就是努力去推动这样一个方向:我希望 Lambda 成为那种 vertically integrated(一体化)、high velocity(高速度)的强力机器。这样当你放眼世界时,你会说,好,世界上只有两个人能做到,或者说只有两家公司能做到 high velocity deployments(高速部署):SpaceX AI 和 Lambda。我们的核心关注点就是,如何把流程中每一个细小环节都尽可能砍掉,以更快地把 compute(算力)部署起来。
Speaker 157:22 - 57:32
And that's, like, something I've just been diving into and really enjoying with my new time as a CTO. Speaker 157:22 - 57:32
而这正是我最近作为 CTO 一直在深入投入、并且非常享受的一件事。
Speaker 257:33 - 57:37
What was xAI's record, like, when they launched?
xAI 当时发布的时候,创下的纪录是什么样的?
Speaker 157:37 - 57:39
I think it was, like, 200. Speaker 157:37 - 57:39
我记得大概是 200。
Speaker 257:39 - 57:44
Yes. And you think that can be matched or exceeded at a repeatable pace? Speaker 257:39 - 57:44
对。所以你觉得这个数字可以在可重复的节奏下被追平,甚至超过吗?
Speaker 157:44 - 57:46
I think it can be matched or beat.
我觉得可以追平,或者超过。
Speaker 257:46 - 57:49
Yeah. And that's process mostly?
对。那这主要是流程方面的优化吗?
Speaker 157:49 - 58:25
I think it's everything from, like, the site selection process, the set of constraints that you use in a site selection process, the MEP pipeline, the way that you construct the data center, how do you make it so that the end customer will consume that compute, you know? And how do you cut a lot of stuff out of the process? Because oftentimes, you know, the people who've been designing these data centers have really kind of been real estate people, as I've mentioned, who've been kind of grabbed by the scruff of their neck by a hyperscaler. They're like, go and build this design. Here, go.
我觉得这涵盖了一切,比如 site selection(选址)流程、你在 site selection 过程中使用的那套约束条件、MEP pipeline、你建设 data center 的方式,以及你怎么让最终客户去消耗这些 compute。你知道吗?还有,你怎么把流程里很多多余的东西砍掉。因为很多时候,设计这些 data center 的人,正如我提到的,实际上更像是做房地产的人,他们基本上是被某个 hyperscaler 揪着脖领子拽过来,说,去把这个设计建出来。给你,去吧。
Speaker 158:25 - 58:43
Go get a GC. Run off. And they don't know anything about what goes inside of it. And the hyperscalers, on the other hand, have been really building towards traditional cloud services. I mean, if you look at a modern region in any of the clouds, they have hundreds of services.
去找个 GC。快去干吧。而他们对里面实际要装什么根本不了解。另一方面,hyperscalers 一直以来真正建设的,其实是面向传统 cloud services 的体系。我的意思是,如果你看看任何一家 cloud 里一个现代 region,它们都有几百种服务。
Speaker 158:43 - 59:29
I mean, everything from satellite base stations to tape storage to spinning disk to face recognition APIs. I mean, these are all the services, and each of those services requires a different SKU and has different parameters about what you're kind of servicing. And in fact, you might have somebody who's trying to run an ATM back end on one of these things. That's a pretty different design space and design constraint than an AI data center that could maybe have a lower availability and uptime. And so that's kind of where I think Lambda is able to build a lot of really unique value through this kind of targeted AI first approach. Speaker 158:43 - 59:29
我的意思是,什么都有:从 satellite base stations,到 tape storage,到 spinning disk,再到 face recognition APIs。这些全都是服务,而其中每一种服务都需要不同的 SKU,并且在你所服务的对象上有不同的参数。事实上,你甚至可能会遇到有人想在这种东西上跑一个 ATM back end。那和一个 AI data center 的设计空间、设计约束就非常不一样了;后者也许可以接受更低一些的 availability 和 uptime。所以我认为,Lambda 正是通过这种有针对性的、AI first 的方法,得以构建出很多非常独特的价值。
Speaker 259:29 - 59:37
You had a quote where you said that AI won't write software. It will become the software. What do you mean by that?
你有一句话,说 AI 不会去写软件,它会成为软件。这句话是什么意思?
Speaker 1 | 59:37 - 1:00:10 So that's in my sort of, like, idea around what I call neural software or a neural computer, neural operating systems. And the best way to kind of get this experience is to go to your ChatGPT or your Claude and say, Hey, just render for me an ASCII art desktop interface. Okay? So you're working purely in the domain of text. And I want you to just pretend to be an operating system for me.
所以,这大致就是我关于所谓 neural software(神经软件)、或者 neural computer(神经计算机)、neural operating systems(神经操作系统)这一类东西的想法。想体验这种感觉,最好的办法就是去你的 ChatGPT 或 Claude,然后说:嘿,帮我渲染一个 ASCII art 桌面界面。好吧?也就是说,你完全在文本这个域里操作。我希望你就假装自己是一个操作系统。
Speaker 1 | 1:00:10 - 1:00:48 I'm going to say, click on this, open up this, and I want you to just behave like a computer. So give it that prompt. Okay? And what you're going to see, I think, is that sort of future of the large language model becoming the software and not generating the software. And this results in an extremely sort of squishy and flexible way of interfacing with a computer where it's not possible to have a bug, only a misunderstanding about the prompt and what you've asked for.
我会说,点这个,打开那个,我要你就像一台电脑那样来响应。就给它这样一个 prompt(提示词)。好吧?然后我觉得你会看到的是:未来那种 large language model(大语言模型)会变成软件本身,而不是去生成软件。而这会带来一种非常“软”、非常灵活的人机交互方式,在这种方式里,严格来说不存在 bug(漏洞/程序错误),只有你给的 prompt 和你想表达的需求之间出现了误解。
Speaker 1 | 1:00:48 - 1:01:30 And I think that for a lot of the pieces of software on your computer, you might see that taking over. You know, you can get a glimpse of the future with this ASCII art, and then eventually, it'll also have a multimodal network that's generating every pixel on your screen, as well as every audio waveform that comes out of your speakers. The advantage to this is that you can really sort of dream up software. Only the part that is being experienced by you is actually implemented, right, if that makes sense. It could have whatever feature you ask it for. And that's a really powerful way to interact with a computer, I think.
而且我觉得,你电脑上的很多软件部件,可能都会被这种方式接管。你现在可以先通过这种 ASCII art 窥见未来;再往后,它还会有一个 multimodal network(多模态网络),来生成你屏幕上的每一个像素,以及你扬声器里输出的每一段 audio waveform(音频波形)。这样做的好处是,你基本上可以把软件“想”出来。真正被实现的,只有你正在体验的那一部分,对吧,如果这样说你能理解的话。只要你提出要求,它就可以具备任何你想要的功能。我觉得这会是一种非常强大的人机交互方式。
Speaker 2 | 1:01:30 - 1:01:36 So it's not like you give simple instructions to the LLM, suddenly the LLM is the software.
所以并不是说你给 LLM 一些简单指令,LLM 突然就成了软件。
Speaker 1 | 1:01:36 - 1:01:58 I guess, to make the analogy, like, you know, vibe coding takes in a prompt and then outputs human-readable, writable, compilable code that runs on a normal human software programming language substrate. Right? You know? It outputs C code, which gets put through a compiler. It outputs Python code, which gets put through a Python interpreter.
我想,类比一下吧。比如说,vibe coding 会接收一个 prompt,然后输出人类可读、可写、可编译的代码,而这些代码运行在普通的人类软件编程语言 substrate(底层基础)之上。对吧?你知道的。它输出的是 C code,然后交给 compiler(编译器)处理;它输出的是 Python code,然后交给 Python interpreter(解释器)处理。
Speaker 1 | 1:01:59 - 1:02:16 That software is static. Once it's been generated, it can't change. Right? You could vibe code it again and maybe, you know, vibe code on the fly. There's, like, a couple different stages of the gradient between traditional human written software, and then you go to, like, maybe vibe coded software.
那种软件是静态的。一旦生成出来,它就不能再变了。对吧?你当然可以再 vibe code 一次,也许还能做到某种 on the fly(动态进行的)vibe code。这里面其实有好几个不同的阶段,构成了从传统的人类手写软件,到比如说 vibe coded software 之间的一条渐变谱系。
Speaker 1 | 1:02:16 - 1:02:21 Then you go to just in time vibe coded where it's a live creation of the software application.
再往后,就是 just in time vibe coded,也就是软件应用是在实时地被创建出来。
Speaker 2 | 1:02:21 - 1:02:22 But it's still software.
但它依然还是软件。
Speaker 1 | 1:02:22 - 1:02:53 But still software. But then you go to the next step, which is just you're interacting with the LLM, and it is emulating how software might behave. And that's the difference between Vibe coding and a neural operating system or neural software. Neural software, there is no code that's running. It's just modifications of the feature activation space and the context in the mind of the neural network.
但那仍然是 software。然后再往下一步走,就是你在和 LLM 交互,而它是在模拟 software 可能会如何运行。这就是 Vibe coding 和 neural operating system 或 neural software 之间的区别。对于 neural software,并没有正在运行的 code;有的只是特征激活空间的变化,以及 neural network“心智”中的上下文变化。
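The distinction drawn here can be sketched as a loop in which the only "program state" is the model's growing context, with no generated source code in between. This is a minimal illustration, not Lambda's prototype: the `Model` callable signature, the `SYSTEM_PROMPT` wording, and `stub_model` are all hypothetical stand-ins so the sketch runs offline without any real LLM API.

```python
from typing import Callable, List

# Hypothetical signature: takes the full context so far, returns the next "screen".
Model = Callable[[List[str]], str]

SYSTEM_PROMPT = (
    "Pretend to be an operating system. Render an ASCII art desktop "
    "and update it in response to each user action."
)


def neural_session(model: Model, actions: List[str]) -> List[str]:
    """Feed user actions to the model one at a time.

    Nothing is compiled or interpreted: behavior comes only from the
    model's reply to the accumulated context.
    """
    context = [SYSTEM_PROMPT]
    screens = []
    for action in actions:
        context.append(f"USER: {action}")
        screen = model(context)  # the reply itself is the software's behavior
        context.append(f"SCREEN: {screen}")
        screens.append(screen)
    return screens


def stub_model(context: List[str]) -> str:
    """Stand-in for a real LLM call; echoes the last action inside a box."""
    last_action = context[-1][len("USER: "):]
    return f"+--------+\n| {last_action} |\n+--------+"


for screen in neural_session(stub_model, ["open files", "click trash"]):
    print(screen)
```

In a vibe-coding workflow the model's output would instead be source code handed to a compiler or interpreter; here the reply is consumed directly as the interface.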
Speaker 2 | 1:02:53 - 1:02:57 And how far do you think we are from that? Is that something
你觉得我们距离那种状态还有多远?那是不是某种——
Speaker 1 | 1:02:57 - 1:03:01 I mean, we have prototypes of it today. So we have prototypes of it today.
我的意思是,我们今天已经有它的 prototype 了。所以我们今天已经有 prototype 了。
Speaker 2 | 1:03:01 - 1:03:04 And when you say you, is that Lambda, or is that others?
你说“你们”的时候,指的是 Lambda,还是也包括其他人?
Speaker 1 | 1:03:04 - 1:03:31 Lambda's developed a prototype. There are multiple other companies that have developed prototypes of this. There's academic research that has, you know, outlined what this might look like. And how far are we from mass adoption? I would say that, generally speaking, when I'm early on something, I tend to be about a decade to a decade and a half early.
Lambda 已经开发出了一个 prototype。还有多家公司也开发了这方面的 prototype。学术研究里也已经勾勒出了它可能会是什么样子。至于我们距离大规模采用还有多远?我会说,一般来讲,如果我在某件事上看得比较早,我通常会早大约十年到十五年。
Speaker 1 | 1:03:31 - 1:04:08 So I would say that between a decade and fifteen years, we will see mass adoption beginning or otherwise happening for neural software. I mean, we already have it. So here's another example, by the way. You can think of a Tesla self-driving car or any type of end-to-end neural network and large model that's doing autonomy as a form of neural software. And people understand that aspect, which is it's seeing video, it's making decisions about what to output.
所以我会说,在未来十年到十五年之间,我们会看到 neural software 开始进入大规模采用,或者已经发生大规模采用。我的意思是,其实我们已经有了。顺便再举个例子。你已经有了——比如说,你可以把 Tesla 的自动驾驶汽车,或者任何一种执行 autonomy(自主性)的端到端 neural network 和大型模型,看作是一种 neural software。人们已经理解了那一面:它在看视频,在决定要输出什么。
Speaker 1 | 1:04:09 - 1:04:25 Now, the user experience is the driving experience. That said, that is an example of neural software, I would argue. And so we already see that today. Now the question is, when are everyone's computers gonna adopt that? I'd say a decade.
现在,user experience 就是驾驶体验。话虽如此,我认为那就是一个 neural software 的例子。所以我们今天其实已经看到了。现在的问题是,什么时候你、什么时候每个人的电脑都会采用它?我会说,大约十年。
Speaker 2 | 1:04:25 - 1:04:32 Do agents change anything from your perspective as a compute provider? And if so, in what way?
从你作为 compute provider 的角度看,agents 会改变什么吗?如果会的话,是以什么方式?
Speaker 1 | 1:04:32 - 1:05:09 To understand what needs to change on the computer, we need to understand what's changing with the user. So when you're doing vibe coding with agents, one of the things you'll notice is that your wall clock time, you know, in the world is mostly spent on running tests, gathering data, searching through the code base. A lot of the time is spent not just inferencing a neural network, but actually doing other things. And it's very much similar to how software engineers spend some of their time. Right?
要理解计算机需要发生什么变化,我们就得先理解用户正在发生什么变化。所以,当你在用 agent 做 vibe coding 时,你会注意到一件事:你的 wall clock time,也就是现实世界里的实际耗时,主要花在跑测试、收集数据、搜索 code base 上。很多时间花费并不只是神经网络做 inferencing(推理),而是实际上花在其他事情上。而这其实和 software engineer 平时花时间的方式非常相似,对吧?
Speaker 1 | 1:05:09 - 1:05:24 You know the old xkcd cartoon about compiling, where they're sword fighting on the office chairs. And someone says, what are you guys doing? Compiling. And so now there's a bunch of time spent compiling. There's a bunch of time spent running tests.
你知道那个老的 xkcd 漫画吧,说的是 compiling 的时候,大家坐在办公室椅子上像在斗剑。有人问,你们在干嘛?回答是:compiling。所以现在也有大量时间花在 compiling 上,也有大量时间花在跑测试上。
Speaker 1 | 1:05:24 - 1:06:13 Because part of the way that the 24/7 agent loops really work well is when you are constantly banging against a nice suite of automated tests to make sure that the code you're writing is good. And so, well, what does that mean? It means that every single cloud service needs to start doing a lot more traditional CPU workloads. They need to focus on a great environment, a secure environment to host your Claude Code instance on. And then you need to think about security, about how this massive influx of new applications is gonna be secured.
因为 agent 这种 24/7 循环之所以能真正高效运作,其中一个关键,就是它会持续不断地和一套完善的 automated tests(自动化测试)对撞,以确保你写出来的代码是好的。那么,这意味着什么?这意味着每一个 cloud service 都需要开始承载更多传统的 CPU workloads(CPU 工作负载)。它们需要重点提供一个优秀的环境、一个安全的环境,来托管你的 Claude Code instance。然后你还得从安全的角度去思考:这股海量新应用涌入之后,要如何把它们安全地保护起来。
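The loop described above, where an agent alternates between model inference and a test suite, can be sketched with a timer around each phase to show where wall-clock time goes. This is an illustrative sketch only: `propose_patch` and `run_tests` are hypothetical callables, and the stubs below simulate a slow test suite with a short sleep rather than measuring any real agent product.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class LoopStats:
    inference_s: float = 0.0  # wall-clock seconds spent in the model call
    tests_s: float = 0.0      # wall-clock seconds spent running tests (CPU work)
    iterations: int = 0
    passed: bool = False


def agent_loop(
    propose_patch: Callable[[str], str],  # hypothetical model inference step
    run_tests: Callable[[str], bool],     # hypothetical automated test suite
    task: str,
    max_iters: int = 5,
) -> LoopStats:
    """Alternate inference and test runs until the tests pass or we give up."""
    stats = LoopStats()
    feedback = task
    for _ in range(max_iters):
        stats.iterations += 1

        start = time.perf_counter()
        patch = propose_patch(feedback)
        stats.inference_s += time.perf_counter() - start

        start = time.perf_counter()
        ok = run_tests(patch)
        stats.tests_s += time.perf_counter() - start

        if ok:
            stats.passed = True
            break
        feedback = f"{task}\nPrevious attempt failed tests: {patch}"
    return stats


# Stubs so the sketch runs offline; the third proposed patch "passes".
attempts = []


def fake_propose(feedback: str) -> str:
    attempts.append(feedback)
    return f"patch-v{len(attempts)}"


def fake_tests(patch: str) -> bool:
    time.sleep(0.01)  # stand-in for a slow, CPU-bound test suite
    return patch == "patch-v3"


stats = agent_loop(fake_propose, fake_tests, "fix the login bug")
print(stats.iterations, stats.passed)  # 3 True
```

With these stubs nearly all measured time lands in `tests_s`, which mirrors the speaker's point that agent workloads push a lot of traditional CPU work onto the cloud provider.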
Speaker 2 | 1:06:13 - 1:06:17 How do you use AI agents internally?
你们内部是怎么使用 AI agents 的?
Speaker 1 | 1:06:17 - 1:06:35 Well, I mean, a lot of the engineers at Lambda are already, you know, doing a fully agent driven workflow. I mean, if you just go to Claude Code and say, hey. Use advanced workflows or, you know, spin up agents. You can do that. So that's like step one.
嗯,我的意思是,Lambda 的很多工程师已经在采用一种完全由 agent 驱动的 workflow(工作流)了。我的意思是,如果你直接去 Claude Code 里说,嘿,使用 advanced workflows,或者说,启动一些 agents,你就可以这么做。所以这算是第一步。
Speaker 1 | 1:06:35 - 1:07:27 I've demoed internally, and some folks have adopted, what I kind of call self-assembling software. And so self-assembling software is this idea where you, you know, kind of tie product requirements and the constant user feedback that's coming off of the system into a 24/7 running agent fleet. So you have a very clear and tight loop to go from submitting, hey, this is a bug or this is a feature request, to a fleet of agents who are implementing that live for you. Okay? And that sort of cycle I call self-assembling software because you say, hey, this is what this software is for, but most of the development for it is going to happen after the software is launched and the users start to interact with it and customize it for themselves collectively.
我在内部做过演示,也有一些同事已经采用了我称之为 self assembling software(自组装软件)的方法。所谓 self assembling software,就是这样一个思路:你把产品需求,以及系统里持续产生的用户反馈,接到一个 24/7 持续运行的 agent fleet(agent 集群)上。这样你就有了一个非常清晰、非常紧密的闭环:从你提交“嘿,这是个 bug”或者“这是个 feature request(功能请求)”,到有一组 agents 实时为你实现它。明白吗?我把这个循环叫作 self assembling software,是因为你先定义“这个软件是用来做什么的”,但它的大部分开发其实会发生在软件上线之后,当用户开始和它交互,并且以集体方式为自己定制它的时候。
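The feedback loop described here can be sketched as an inbox of user feedback that is drained by an agent callable, with the stated purpose of the software passed along on every request. This is a rough sketch under assumptions, not Lambda's internal system: `SelfAssemblingApp`, `Feedback`, and `stub_agent` are hypothetical names, and a real agent fleet would run concurrently and commit code rather than return strings.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List


@dataclass
class Feedback:
    kind: str  # "bug" or "feature"
    text: str


@dataclass
class SelfAssemblingApp:
    purpose: str                               # "this is what the software is for"
    implement: Callable[[str, Feedback], str]  # hypothetical agent call
    inbox: Deque[Feedback] = field(default_factory=deque)
    changelog: List[str] = field(default_factory=list)

    def submit(self, item: Feedback) -> None:
        """Users file bugs and feature requests after launch."""
        self.inbox.append(item)

    def run_once(self) -> int:
        """Drain the inbox, handing each item to the agent; return items handled."""
        handled = 0
        while self.inbox:
            item = self.inbox.popleft()
            self.changelog.append(self.implement(self.purpose, item))
            handled += 1
        return handled


def stub_agent(purpose: str, item: Feedback) -> str:
    """Stand-in for an agent that would implement the change."""
    return f"[{item.kind}] {item.text} (for: {purpose})"


app = SelfAssemblingApp(purpose="photo sharing", implement=stub_agent)
app.submit(Feedback("bug", "upload button does nothing"))
app.submit(Feedback("feature", "add dark mode"))
print(app.run_once())  # 2
print(app.changelog)
```

Calling `run_once` on a schedule, or in a long-running worker, would approximate the always-on loop the speaker describes.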
Speaker 1 | 1:07:27 - 1:07:49 And I think that that is maybe the future paradigm of where a lot of the agent-driven development is gonna go. The other side of that, eventually, once the models get smarter (I think they're not quite there yet), is, you know, tying that back into, hey, I need help. And I'm not talking about the human.
我认为,这也许就是未来很多 agent 驱动开发会走向的一种范式。另一个方向是,最终等模型变得更聪明之后——我觉得它们现在还没有完全到那个程度——不过,把这个再接回到“嘿,我需要帮助”这件事上。而且我说的不是人类。
Speaker 1 | 1:07:49 - 1:08:04 I'm saying the agent's going, hey, I need a human to help me. Like, I need you to plug in a thousand GPUs for me, or I need you to give me an API key to a particular service. I need you to go sign up for something for me. Can you please go negotiate this?
我的意思是 agent 会说:“嘿,我需要一个人类来帮我。”比如,“我需要你替我接上一千块 GPU”,或者“我需要你给我某个特定服务的 API key”,或者“我需要你替我去注册某个东西。你能不能帮我去谈一下这个?”
Speaker 1 | 1:08:04 - 1:08:18 And I think that that's actually how you're gonna start to see it happen, which is product user feedback gets influenced by the agents. The agents then also ask the people at the company to go and do things for them, all in the service of, you know, delighting customers and making money.
我觉得这其实会是你开始看到它发生的方式:产品用户反馈会受到 agents 的影响。然后 agents 也会去要求公司里的人替它们做事,这一切都是为了,怎么说呢,让客户更满意,同时赚到钱。
Speaker 2 | 1:08:18 - 1:08:32 You've talked about gigawatt-scale factories. Is that what you were describing earlier around, like, being super good at creating data centers very quickly, but also making them bigger? What is that concept?
你提到过 gigawatt scale factories。那是不是你刚才说的那个意思,比如说,一开始先非常擅长极快地建设 data centers,但同时也把它们做得更大。这个概念到底是什么?
Speaker 1 | 1:08:32 - 1:08:57 It's an AI factory, which is basically land, a data center, and servers inside, that is generating tokens. And gigawatt scale means that it's consuming a thousand megawatts, or a billion watts, which is a lot of power. Maybe you can think of it this way for context: New York City is something like five gigawatts.
那是一种 AI factory,基本上就是一块地、一座 data center,里面放着 servers,而它在生成 tokens。所谓 gigawatt scale,意思是它消耗一千 megawatts,也就是十亿 watts,这个电力量是非常大的。你也许可以这样理解作参考:New York City 的用电规模大概是五个 gigawatts。
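The unit arithmetic in that answer can be checked with a few lines. This is only a sketch of the conversion; the five-gigawatt figure for New York City is the speaker's rough estimate, not a measured value, and the variable names are chosen here for illustration.

```python
WATTS_PER_MEGAWATT = 1_000_000
WATTS_PER_GIGAWATT = 1_000_000_000

factory_gw = 1.0  # one gigawatt-scale AI factory

# 1 GW expressed in megawatts and in watts.
factory_mw = factory_gw * WATTS_PER_GIGAWATT / WATTS_PER_MEGAWATT
factory_w = factory_gw * WATTS_PER_GIGAWATT

# Speaker's rough figure for New York City, used only for scale.
nyc_gw = 5.0
share_of_nyc = factory_gw / nyc_gw

print(factory_mw, factory_w, share_of_nyc)
```

On those numbers, a single gigawatt-scale site draws about a fifth of the power the speaker attributes to New York City.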
Speaker 2 | 1:08:57 - 1:09:04 You also talked about one person, one GPU. Is that your vision for the future? Unpack that for us.
你还提到过 one person, one GPU。这是你对未来的愿景吗?给我们展开讲讲。
Speaker 1 | 1:09:05 - 1:09:47 So, you know, before people really believed in the AI thesis, when I was pitching our Series B and C, I would kind of talk a lot about the similarities between, let's say, the computer industry and the AI industry. I really felt like AI was forming a set of generational companies, and that there was gonna be a set of generational companies that got minted with the changes that were coming with AI. And this is, like, in 2020, 2021. And if you read about the history of Apple, for example, in the early days, the motto, sort of the credo, at Apple was one person, one computer. One person, one computer.
所以,在人们真正相信 AI thesis 之前,我在做我们 series B 和 C 融资路演时,经常会讲计算机产业和 AI 产业之间的一些相似之处。我当时真的觉得,AI 正在孕育一批跨世代的公司,而且随着 AI 带来的变化,会有一批这样的跨世代公司被造就出来。那还是在 2020、2021 年。如果你去读 Apple 的历史,比如在早期,Apple 的口号、或者说它的信条,就是 one person, one computer。One person, one computer。
Speaker 1 | 1:09:48 - 1:10:14 And, you know, there's a sense of humility that's embedded in this one person, one GPU, which is the one person, one computer. You think about how visionary Steve Jobs was. Apple was founded in 1976 or something like this. The Macintosh came out in 1985, or 1984, excuse me. 1984, you know, whatever. Eight years or so after founding.
而且你知道,one person, one GPU 这个说法里其实带着一种谦逊,因为它对应的是 one person, one computer。你想想 Steve Jobs 当年多有远见。Apple 大概是在 1976 年创立的。Macintosh 是在 1985 年推出的,哦不对,是 1984 年。1984 年,怎么说呢,离创立大概八年左右。
Speaker 1 | 1:10:14 - 1:10:18 Is that one person, one computer yet? No. Not even close. Alright. So 1984 to 1994.
那时候做到 one person, one computer 了吗?没有。差得还远。好,我们接着看。1984 到 1994。
Speaker 1 | 1:10:19 - 1:10:30 Alright. Well, is it one person, one computer? Well, we're just starting to have the Internet boom. So we're, I mean, not quite there yet. 2004, we finally have broadband Internet access.
好,那么到那时是 one person, one computer 了吗?嗯,Internet boom 才刚刚开始。所以我们,我们,我的意思是,还不算真正做到。到了 2004 年,我们终于有了 broadband Internet access。
Speaker 1 | 1:10:30 - 1:10:57 And maybe for the first time in the United States, there's not quite one person, one computer, but there's certainly, like, one family, one computer, you know, or something like this. It's, like, getting close to it. You don't have it until 2014. So '74, '84, '94, 2004, 2014. Only forty years after one person, one computer do you probably truly have one person, one computer.
也许这是 The United States 第一次还不完全做到 one person, one computer(每人一台电脑),但至少已经很接近了,比如 one person, one family, one computer,或者类似这种状态。就是说,已经开始逼近了。而这并不是到 2014 年之前就实现的。1974、1984、1994、2004、2014。大概要到 “one person, one computer” 这个愿景提出 40 年之后,你才可能真正实现每人一台电脑。
Speaker 1 | 1:10:57 - 1:12:01 And you actually get beyond one person, one computer because people have laptops and cell phones, and I would consider a cell phone a computer. And then finally, you don't even have ecommerce penetration until 2024, fifty years or so after the founding of Apple Computer, when ecommerce starts to actually penetrate because of COVID. I think that the reason I really wanted to choose that one person, one GPU is because, one, I believe that in the future, everybody in the United States will need the computational power of one GPU or more to just do their daily work and enjoy life, whether it's getting access to... whether it's getting entertained, whether it's being productive, whether it's being creative. And I also recognize that it took Steve Jobs and Apple, one of the best companies in the history of capitalism, half a century to accomplish their goal. And so I think it's not just like an overnight, let's quickly get to one person, one GPU.
而且实际情况甚至已经超越了 one person, one computer,因为人们既有 laptop,也有 cell phone,而我会把 cell phone 也算作 computer。再往后,甚至连 ecommerce penetration(电商渗透)都要到 2024 年才真正出现,也就是大约在 Apple Computer 创立 50 年之后,ecommerce 才因为 COVID 开始真正渗透开来。我之所以特别想选用 one person, one GPU(每人一个 GPU)这个说法,第一,是因为我相信未来 The United States 的每个人都需要一个 GPU 或更多的计算能力,来完成日常工作、享受生活,不管是获取信息、获得娱乐、提升生产力,还是进行创造。我也意识到,Steve Jobs 和 Apple——资本主义历史上最优秀的公司之一——用了半个世纪才实现他们的目标。所以我认为,这件事不是那种一夜之间、很快就能做到让每个人都拥有 GPU 的事情。
Speaker 1 | 1:12:01 - 1:12:04 So that's what that means to me.
所以,这就是它对我来说的含义。
Speaker 2 | 1:12:04 - 1:12:07 To close, are you ready for a couple of quick hot takes?
最后收个尾,你准备好来几个简短的 hot takes(犀利观点)了吗?
Speaker 1 | 1:12:08 - 1:12:08 Sure.
当然。
Speaker 2 | 1:12:08 - 1:12:13 What is one idea in AI that is overhyped?
在 AI 里,哪一个想法被过度炒作了?
Speaker 1 | 1:12:13 - 1:12:47 I think a lot of the sort of agentic workflows for things that are not software engineering tend to be overhyped. And I'll tell you that the reason for that is because one of the ways that you get an agentic workflow working really well is that it needs to have very concrete feedback mechanisms, which are done brilliantly through automated testing. It's not at all done brilliantly for going by a site. There's no traction to give a model to go and iterate on over a long period of time. So I think agentic workflows for things that aren't readily verifiable.
我觉得,很多那种用于非 software engineering(软件工程)场景的 agentic workflows(agent 式工作流),往往都被过度炒作了。原因是这样的:让一个 agentic workflow 真正运转得很好的方法之一,是它必须有非常具体的反馈机制,而这一点可以通过 automated testing(自动化测试)非常出色地实现。但如果是去浏览一个网站之类的任务,这种机制就完全谈不上做得很好。没有足够的 traction(可供推进和反馈的抓手)让模型长期迭代下去。所以我认为,那些不容易被直接验证的任务上的 agentic workflows——
Speaker 1 | 1:12:47 - 1:13:13 Now, I wouldn't go as far as saying everything that's not software engineering, because there's plenty of readily verifiable fields. CAD, computer-aided manufacturing, finite element analysis, computational fluid dynamics. There's a bunch of fields where you can really do a great agentic workflow, and simulate it, and then go and iterate. It's not the case for, hey, Claude, make me a billion dollars. Make no mistakes.
当然,我也不是说凡是不是 software engineering 的都不行,因为有很多领域本身就是很容易验证的。CAD、computer aided manufacturing(计算机辅助制造)、finite element analysis(有限元分析)、computational fluid dynamics(计算流体力学)。有一大批领域里,你真的可以做出非常优秀的 agentic workflow,先模拟,再不断迭代。但那和“嘿,Claude,帮我赚 10 亿美元,而且别出任何错”不是一回事。
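The point about concrete feedback can be illustrated with a toy loop in which an automated test is the only signal deciding whether to keep iterating. This is a sketch under stated assumptions, not a real agent framework: `run_tests`, `iterate_until_verified`, and the list of candidate functions are all hypothetical, and the "agent" is just a fixed list of proposals standing in for model output.

```python
from typing import Callable, Iterable, Optional, Tuple

Candidate = Callable[[int], int]


def run_tests(candidate: Candidate) -> bool:
    """Automated check: the concrete, machine-verifiable feedback signal."""
    cases = {0: 0, 2: 4, -3: 9}  # expected behaviour: square the input
    return all(candidate(x) == expected for x, expected in cases.items())


def iterate_until_verified(
    candidates: Iterable[Candidate],
    verify: Callable[[Candidate], bool],
) -> Optional[Tuple[int, Candidate]]:
    """Try proposals in order; stop at the first one the verifier accepts."""
    for attempt, candidate in enumerate(candidates, start=1):
        if verify(candidate):
            return attempt, candidate
    return None  # nothing passed; without a verifier we could not even tell


# Stand-ins for successive agent proposals (assumed for illustration).
proposals = [
    lambda x: x * 2,  # passes for 0 and 2, fails for -3
    lambda x: x + 2,  # fails for 0
    lambda x: x * x,  # passes every case
]

result = iterate_until_verified(proposals, run_tests)
print(result[0] if result else "no candidate verified")
```

For a task like "make me a billion dollars" there is no equivalent of `run_tests` to call, which is the gap the speaker is pointing at.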
Speaker 1 | 1:13:13 - 1:13:13 You know?
你知道吧?
Speaker 2 | 1:13:14 - 1:13:17 Sadly. Or maybe not. Okay. Fascinating.
可惜的是。或者也未必。好吧。真有意思。
Speaker 1 | 1:13:17 - 1:13:24 Inflation would be... well, it wouldn't be inflation. It would actually be just value creation of the economy. Deflationary even.
通胀会是什么样?嗯,那其实不算是通胀。那实际上只是经济的价值创造。甚至会是通缩性的。
Speaker 2 | 1:13:24 - 1:13:27 What is one idea in AI that is underrated?
AI 里有什么被低估的想法?
Speaker 1 | 1:13:27 - 1:13:45 Yeah. I really think that the neural OS thing and, you know, also some of the aspects of self-assembling software. Like, I still do think people... you know, the funny thing is I'll give the same answer. Agent-based workflows for software development. I think that most people don't understand.
对。我真的认为 neural OS 这个东西,以及,你知道,还有 self assembling software(自组装软件)的一些方面。比如,我还是觉得大家知道,好笑的是我会给出同样的答案。基于 agent 的软件开发工作流。我觉得大多数人并不理解。
Speaker 1 | 1:13:45 - 1:13:59 They literally don't understand because they've never tried it. They've never gone to Claude. Go to Claude, go say maximum effort, use the latest model, then go and build whatever you wanted to build and say, you know, spin up 10 agents to go and do it. I think a lot of people still haven't done it yet.
他们是真的不理解,因为他们从来没试过。他们从来没有去过 Claude。去 Claude,进去之后说 maximum effort,使用最新的 model,然后去构建任何你想构建的东西,再说,你知道,启动 10 个 agents 去做。我觉得很多人到现在都还没这么做过。
Speaker 2 | 1:13:59 - 1:14:04 Well, Stephen, it's been wonderful. Thank you so much for spending time with us.
好吧,Stephen,今天非常愉快。非常感谢你花时间和我们交流。
Speaker 1 | 1:14:04 - 1:14:05 Matt, thank you so much for having me.
Matt,也非常感谢你邀请我。
Speaker 2 | 1:14:05 - 1:14:25 Appreciate it. Hi, it's Matt Turk again. Thanks for listening to this episode of the MAD Podcast. If you enjoyed it, we'd be very grateful if you would consider subscribing if you haven't already or leaving a positive review or comment on whichever platform you're watching this or listening to this episode from. This really helps us build a podcast and get great guests.
感谢支持。大家好,我是 Matt Turk。感谢收听这一期 MAD Podcast。如果你喜欢这一期内容,而你还没有订阅的话,我们会非常感激你考虑订阅;或者也欢迎你在你观看或收听这一期节目的平台上留下积极的评价或评论。这真的很有助于我们把这个 podcast 做起来,并邀请到很棒的嘉宾。
Speaker 2 | 1:14:25 - 1:14:27 Thanks and see you on the next episode!
谢谢,我们下期节目见!