why does the algo hate this post
> be me
> "the internet is polluted by ai slop, we need low-background tokens"
> "wouldnt it be cool if we could time travel and see what our ancestors 100 years ago would say to us"
> all the existing vintage models are like <4B
> we need a chat tuned 13B vintage model
> assemble avengers of ML incl the GPT-1/2 guy
> need vintage tokens
> train new vintage OCR model for old books, newspapers, periodicals, scientific journals, patents, and case law
> need vintage RLHF but cant use chat
> synthesize RLHF pairs from historical texts with regular structure eg etiquette manuals, letter-writing manuals, cookbooks, dictionaries, encyclopedias, and poetry and fable collections, shove it into ChatML
> train it
> future knowledge still got in somehow
> dammit.jpg
> train new SOTA document-level n-gram-based anachronism classifier
> meticulously curate hundreds of billions of pre-1931 tokens (public domain)
> train it
> ok! it checks out vs our FineWeb baseline!
> release it
> it's the most confidently racist model ever released by humankind
> mfw
i haven't done the work to compare it to peers, but i'm just excited that we have a base model. honestly, for all the people that complained about the death of the completions API (@deepfates? or deepfates-adjacent), not enough people are experimenting with weird usages and finetunes of the base models we DO get