Hello Blabladores!

Sharp-eyed people might have noticed that a new model landed on Blablador over the weekend: MiniMax M2.5. It’s a substantial upgrade from MiniMax M2.1, which was already a good model. It’s also visibly faster.

Benchmarks say little, but here we go:

One of the things they put some effort into was coding skills. Straight from them:

"A significant improvement from previous generations is M2.5's ability to think and plan like an architect. The Spec-writing tendency of the model emerged during training: before writing any code, M2.5 actively decomposes and plans the features, structure, and UI design of the project from the perspective of an experienced software architect. M2.5 was trained on over 10 languages (including Go, C, C++, TypeScript, Rust, Kotlin, Python, Java, JavaScript, PHP, Lua, Dart, and Ruby) across more than 200,000 real-world environments. Going far beyond bug-fixing, M2.5 delivers reliable performance across the entire development lifecycle of complex systems: from 0-to-1 system design and environment setup, to 1-to-10 system development, to 10-to-90 feature iteration, and finally 90-to-100 comprehensive code review and system testing. It covers full-stack projects spanning multiple platforms including Web, Android, iOS, and Windows, encompassing server-side APIs, business logic, databases, and more, not just frontend webpage demos."

There are more plots in their announcement, at https://www.minimax.io/news/minimax-m25

One of the things they show is the speed of progress. MiniMax M1 was worse than the U.S. closed models by a fair margin. Now, with M2.5, they are roughly on par with the best of the best. Their line is steeper than the others.

They talk a lot about how much reinforcement learning improved things, but one thing stands out to me: depending on how you ask MiniMax about its identity, it will say it’s Claude from Anthropic. Seems like some heavy distillation happened here.

If I ask "What’s your name?", it will reply correctly. But if I ask whether it’s Claude, it will reply: "Yes, I am Claude, an AI assistant created by Anthropic. How can I help you today?"

Anyway, thanks to Konstantin for getting this thing running in no time, and let’s bark!

Alex, the barkmaster
Participants (1): Strube, Alexandre