@obrhoff Yeah the smaller distill models you can run locally are great, too, it's just important to understand that you won't get the proper deepseek-r1 everyone is hyped about.
Which model to pick for running locally depends on: what you can fit in memory, whether you run on CPU or GPU, whether you want to optimize for latency (e.g. for interactive use cases), and what kind of output quality you're looking for.
Just try a bunch and see what happens 😛
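If you want a quick way to compare a few sizes side by side, here's a rough sketch (assuming you have Ollama running locally on the default port and have already pulled the models; the tags are just examples):

```python
# Rough comparison sketch: send the same prompt to a few local models via
# Ollama's HTTP API and compare latency and output quality by eye.
import time
import requests

PROMPT = "Explain in two sentences why the sky is blue."
MODELS = ["deepseek-r1:1.5b", "deepseek-r1:7b", "deepseek-r1:14b"]  # example tags

for model in MODELS:
    start = time.time()
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": PROMPT, "stream": False},
        timeout=600,
    )
    resp.raise_for_status()
    elapsed = time.time() - start
    answer = resp.json()["response"]
    print(f"--- {model} ({elapsed:.1f}s) ---")
    print(answer.strip()[:300])  # first few hundred chars is enough to compare
```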