Token costs are becoming one of the hottest topics with nearly every enterprise I talk to right now. It’s very bullish for AI in general because it means these systems are being used at a scale that wasn’t contemplated before. It also gives rise to another form of differentiation that will emerge in the applied AI layer: model routing. As tokens take on a significant share of the cost of any given workflow, companies will inevitably want to ensure that their dollars go toward the most efficient use of tokens for the particular job at hand. Frontier intelligence will always be relevant at the high end of tasks, like coding, legal and financial analysis, healthcare, and more. And dollars spent here will only go up over time. But, equally, you can peel off individual tasks to lower-cost models (whether they’re from open-weights vendors or the major labs) and deliver a more efficient end outcome. To do this effectively, companies in the applied AI layer need to understand the workflows in their domain better than anyone else, and be able to mix and match models to different jobs. If you’re doing document extraction, you need to know which models perform better or worse for any given document type. If you’re doing legal analysis, you want to know which models perform each type of task best. And so on. This will become one of the bigger differentiation points over time. The companies with the best evals, the best ability to route workloads, and business models directly aligned with their customers’ financial goals will be in a great position.
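To make the routing idea concrete, here is a minimal sketch of what "cheapest model that is good enough for the job" could look like. Everything in it is an assumption for illustration: the model names, prices, task types, and eval scores are made up, and `route` and `estimated_cost` are hypothetical helpers, not any vendor's API. A real router would be driven by your own evals and real pricing.

```python
from dataclasses import dataclass


@dataclass
class ModelOption:
    name: str                      # hypothetical label, not a real product
    usd_per_million_tokens: float  # illustrative blended price
    eval_scores: dict              # task type -> score in [0, 1] from your own evals


def route(task_type, min_score, models):
    """Return the cheapest model whose eval score meets min_score.

    If no model clears the bar, fall back to the highest-scoring one.
    """
    scored = [(m, m.eval_scores.get(task_type, 0.0)) for m in models]
    qualified = [m for m, score in scored if score >= min_score]
    if qualified:
        return min(qualified, key=lambda m: m.usd_per_million_tokens)
    return max(scored, key=lambda pair: pair[1])[0]


def estimated_cost(model, tokens):
    """Estimated spend in USD for a given token volume."""
    return model.usd_per_million_tokens * tokens / 1_000_000


# Made-up numbers purely for illustration.
MODELS = [
    ModelOption("frontier-large", 15.0,
                {"invoice_extraction": 0.97, "contract_review": 0.93}),
    ModelOption("mid-tier", 3.0,
                {"invoice_extraction": 0.95, "contract_review": 0.81}),
    ModelOption("small-open-weights", 0.5,
                {"invoice_extraction": 0.90, "contract_review": 0.62}),
]

if __name__ == "__main__":
    for task, bar in [("invoice_extraction", 0.94), ("contract_review", 0.90)]:
        choice = route(task, bar, MODELS)
        print(task, "->", choice.name,
              f"(${estimated_cost(choice, 2_000_000):.2f} per 2M tokens)")
```

With these made-up scores, the extraction job goes to the cheaper model while the harder legal task stays on the frontier one. The threshold per task type is the part that depends on knowing the domain: it is only as good as the evals behind it.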