Prediction: LLMs, or Large Language Models, are the brute-force approach to language generation models. They will be replaced by SLMs, or Structured Language Models, which will use human-tuned seed datasets to build models from much less source material than LLMs need.
-
Louis Ingenthron (louis@ingenthron.social)'s status on Tuesday, 28-Jan-2025 08:31:38 JST Louis Ingenthron
-
🎓 Doc Freemo :jpf: 🇳🇱 (freemo@qoto.org)'s status on Tuesday, 28-Jan-2025 08:31:36 JST 🎓 Doc Freemo :jpf: 🇳🇱
@louis SLM already exists, it stands for Small Language Model.
But you aren't too far off. LLMs are just the language part of the brain. If you could somehow isolate someone's language centers and let them run on their own, I bet they would sound like an LLM: no reasoning, no logic, the only skill is forming nice-looking sentences. What you have yet to see is the rest of the brain hooked up.
-
🎓 Doc Freemo :jpf: 🇳🇱 (freemo@qoto.org)'s status on Tuesday, 28-Jan-2025 08:49:45 JST 🎓 Doc Freemo :jpf: 🇳🇱
@louis You don't feed it syllables, you feed it stemmed words.
so: hyperaldosteronism -> hyper- aldosterone -ism
Basically you separate the prefix and suffix and treat those as separate words, and keep the base word as one word.
This is for English. There may be languages where one syllable per idea works, but in English the idea of a word usually comes from just the break I mentioned.
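To make the split above concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: the prefix list, suffix list, and base-word lexicon are tiny hypothetical stand-ins, and the function name `split_affixes` is invented for this example. It is not how any particular tokenizer actually works.

```python
# Hypothetical, hand-picked affix lists and base-word lexicon (assumptions,
# not taken from any real tokenizer or linguistic resource).
PREFIXES = ("hyper", "hypo", "un", "re")
SUFFIXES = ("ism", "ness", "s")
BASE_WORDS = {"aldosterone", "human", "kind"}


def split_affixes(word):
    """Split a word into prefix, base word, and suffix tokens.

    Tries every combination of a known prefix (or none) and a known suffix
    (or none) and accepts the first one whose remaining stem is a known base
    word. A dropped final "e" is restored (aldosteron -> aldosterone).
    Unknown words are returned unsplit.
    """
    word = word.lower()
    prefix_options = [""] + [p for p in PREFIXES if word.startswith(p)]
    suffix_options = [""] + [s for s in SUFFIXES if word.endswith(s)]

    for prefix in prefix_options:
        for suffix in suffix_options:
            stem = word[len(prefix):len(word) - len(suffix)]
            if not stem:
                continue
            # Check the stem as-is, then with a restored final "e".
            for base in (stem, stem + "e"):
                if base in BASE_WORDS:
                    tokens = []
                    if prefix:
                        tokens.append(prefix + "-")
                    tokens.append(base)
                    if suffix:
                        tokens.append("-" + suffix)
                    return tokens
    return [word]


print(split_affixes("hyperaldosteronism"))
```

The lexicon lookup is the design choice that keeps the base word whole: stripping "hyper" and "ism" alone would leave "aldosteron", so the sketch only accepts a split when the remainder maps back to a known word.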
-
Louis Ingenthron (louis@ingenthron.social)'s status on Tuesday, 28-Jan-2025 08:49:46 JST Louis Ingenthron
@freemo I'm not even saying that. Something simpler.
LLMs typically work in syllable-like sub-word tokens, not whole words. That's good for LLMs, because they then learn patterns of conjugation and pluralization naturally from seeing those patterns used in training.
But what if you ran it on words, not syllables, and fed it a well-structured initial dataset (i.e. a dictionary) to seed its initial tokens? It would understand that "humans" and "human" are related words, and how.
1/2
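A rough sketch of the dictionary-seeding idea, in Python. Everything here is an assumption made for illustration: the seed entries, the feature labels, and the names `SEED_DICTIONARY`, `build_vocab`, and `encode` are hypothetical, and no existing model is claimed to work this way.

```python
# Hypothetical seed dictionary: surface form -> (base word, grammatical feature).
# In the idea above this would come from a curated, human-tuned dictionary.
SEED_DICTIONARY = {
    "human": ("human", "singular"),
    "humans": ("human", "plural"),
    "walk": ("walk", "base"),
    "walked": ("walk", "past"),
}


def build_vocab(seed):
    """Give every base word one shared id and record each form's feature."""
    lemma_ids = {}
    vocab = {}
    for form, (lemma, feature) in seed.items():
        # The first time a base word is seen it gets the next free id.
        lemma_id = lemma_ids.setdefault(lemma, len(lemma_ids))
        vocab[form] = (lemma_id, feature)
    return vocab


def encode(text, vocab):
    """Encode whitespace-separated words as (base id, feature) pairs."""
    return [vocab.get(word, (-1, "unknown")) for word in text.lower().split()]


vocab = build_vocab(SEED_DICTIONARY)
print(vocab["human"], vocab["humans"])
print(encode("Humans walked", vocab))
```

Because "human" and "humans" share the same base id and differ only in the feature label, the relationship between them is present in the tokens before any training happens, rather than being something the model has to infer from usage.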
-