挑灯看剑 (@chillring) posted in "Has anyone seen this recent blog post? It questions Scaling Laws, saying the curve has a bug"
[Diogo Almeida]
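He worked on large-model optimization at OpenAI back in the day.
Yesterday he published a blog post, "Scaling Laws, Honestly",
which says outright that the original version of the Scaling Laws was wrong and had a bug.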
[image]
That in itself is not a big deal, just a minor academic issue.
But the inference another researcher drew from it is really interesting:
the comment on the original blog post, was even more interesting. didn’t realize that the nature of language can have effects on scaling laws as well (for ex. the comment mentioned a model with the same arch. but trained on french got ...