This is a super exciting release - Claude Fable 5 is the same underlying model as Mythos but with added safeguards. The benchmarks are great and it's SOTA on everything by a margin, but I'll add that *qualitatively* too, this is a step change that deserves a major version bump (imo of the same order as Claude 4.5 was in November), peaking especially in long problem-solving sessions on very difficult problems. You can give it much more ambitious tasks than you're used to, the model "gets it" and it will just go, and it's never felt this tempting to stop looking at the code at all (but don't do this in prod!). The model still has quirks that people will run into, and the safeguards are configured to be a little too trigger-happy for launch, which can hopefully be tuned over time. I feel a lot of things changing as working software increasingly comes out on tap. Jevons paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific to just your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). Really looking forward to all the things people build!