I'm seriously impressed by the quality of the MoE and even 7B models I've run on my 4-year-old M1 over the weekend. This is absolutely solid for most of the ways I actually use LLMs day to day.
Most of these tools already let me use local models or cheaply hosted versions of them. The total power used for my daily inference works out to about 10 minutes of running 4 cores.
These models and community finetunes are already much more fun than GPT or Claude.
1/