Conversation

Blaise Pabón - controlpl4n3 (blaise@hachyderm.io)'s status on Wednesday, 07-May-2025 01:48:30 JST
@chikim @ZBennoui I have 256 GB of RAM, but only 20 cores and no GPU. Does MoE still help in that case?
Chi Kim (chikim@mastodon.social)'s status on Wednesday, 07-May-2025 01:48:31 JST
@ZBennoui Yea, try them on a local machine. Especially if you have enough RAM, qwen3:30b is really fast because of the MoE architecture.
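For anyone following along, here is a minimal sketch of what "try it on a local machine" can look like in Python, assuming an Ollama-style server on its default port (11434). Only the qwen3:30b model tag comes from the post above; the endpoint and payload shape are assumptions, not something stated in the thread.

```python
# Minimal sketch: query a locally served qwen3:30b (MoE: ~30B total parameters,
# ~3B active per token) through an Ollama-style HTTP API. The endpoint and
# payload shape are assumptions for illustration.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "qwen3:30b",
        "prompt": "In two sentences, why can a mixture-of-experts model run fast on CPU?",
        "stream": False,  # ask for a single JSON response instead of a stream
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])
```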
Zach Bennoui (zbennoui@dragonscave.space)'s status on Wednesday, 07-May-2025 01:48:32 JST
@chikim I tried the really big model on the HF demo; too scared to make an account on the Qwen Chat site lol. Haven't tried the smaller local versions yet, but the big one is really quite good.
Chi Kim (chikim@mastodon.social)'s status on Wednesday, 07-May-2025 01:48:33 JST
Qwen3 was released right before LlamaCon tomorrow! lol 32K context length, tool calling, a way to turn reasoning on/off with /think and /no_think in the prompt, and support for 119 languages. 6 dense models (0.6b, 1.7b, 4b, 8b, 14b, 32b) and 2 MoE models (30b-a3b, 235b-a22b). #LLM #AI #ML https://qwenlm.github.io/blog/qwen3/
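A minimal sketch of the /think and /no_think soft switch mentioned in this post, again assuming a local Ollama-style server and the qwen3:30b tag from the reply further up; the ask() helper and the payload shape are hypothetical, added only for illustration.

```python
# Sketch of the /think vs. /no_think toggle described in the post: the switch
# is appended to the user prompt to turn the model's reasoning mode on or off.
# The server URL, payload shape, and ask() helper are assumptions.
import requests

def ask(prompt: str, thinking: bool = True) -> str:
    switch = "/think" if thinking else "/no_think"
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "qwen3:30b", "prompt": f"{prompt} {switch}", "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]

# Reasoning on: the model works through its chain of thought before answering.
print(ask("Which is larger, 9.11 or 9.9?", thinking=True))
# Reasoning off: a direct answer, which is usually faster.
print(ask("Which is larger, 9.11 or 9.9?", thinking=False))
```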