Table 5: comparison of DeepSeek distilled models
https://assets.chaos.social/media_attachments/files/113/900/526/392/664/519/original/b0af5ea8b0f56df9.png
@obrhoff It's all open research
https://arxiv.org/search/cs?searchtype=author&query=DeepSeek-AI
For details on DeepSeek-R1 and the Qwen / Llama distilled models, see
https://arxiv.org/pdf/2501.12948
For the distilled-model benchmarks, see Table 5.
The distilled models use Qwen / Llama architectures, so they differ from DeepSeek's main contribution.