“But LLMs are great for summarizing”
(No, they fundamentally fail.)
https://ea.rna.nl/2024/05/27/when-chatgpt-summarises-it-actually-does-nothing-of-the-kind/
@Gergovie @clive @thomasfuchs I think that's way too reductive. LLMs absolutely do something that *looks* like understanding and reasoning.
The problem is that we don't have great ways to characterize what it is they *do*, so it's really hard to know when their output is good enough to use in place of actual logic and interpretation.
@acjay @Gergovie @clive mimicking understanding and reasoning isn't understanding and reasoning
They have NO understanding NOR reasoning.
Only a text generator.
Yeah they really are not trustworthy in this regard
It was one of the things I actually hoped they would do well!
But I haven’t had much luck with it and research metrics like this haven’t either
@clive @acjay @Gergovie In other words, the process of thinking, reasoning and creativity is inherent to producing sensical writing and can't just be replaced by statistical analysis.
Yeah, that’s it
@clive @acjay @Gergovie This is demonstrably shown by the fact that LLM output (as in the referenced article), in all cases, produces bullshit (in Harry Frankfurt's sense[1]).
@thomasfuchs @clive @Gergovie A similar argument could be made to debunk the notion that the human brain is capable of actual thinking. After all, it's just a bunch of neurons, preconfigured by genetics, trained on sensory data.
To be clear, I don't think that LLMs "think" in exactly the way humans do, but I do believe there's a very fuzzy boundary.
@clive @Gergovie @acjay Bait and switch only works when the product actually somewhat does what your marketing says it does. But generative AI doesn’t.
It can definitely be useful in a bunch of areas for sure
I do wonder what’ll happen in a year or so from now — the enormous expense of training and inferencing on the foundation models doesn’t seem likely to produce profits anywhere close to recouping, to say nothing of 10xing
I suspect there’ll be some hard conversations
@clive @thomasfuchs @Gergovie It reminds me of Prolog a bit. When I first learned it, I was like "holy shit, this is incredible". But then you learn the fundamental limitations, and how the workarounds to those limitations undermine all the good parts. Then you understand why it remains a niche technology.
It's possible we're already pretty close to the local maximum of LLMs as a technology. If so, I still do think it's pretty impressive.
@clive @thomasfuchs @Gergovie I think we pretty much agree. It's mimicry of those things. It's extremely unclear that you can even compose LLMs with other subsystems in a rigorous way to address those shortcomings.
@Gergovie @clive @thomasfuchs But because LLMs are so internally complex, we're reduced to discussing them by analogy, and I think that chronically leads to over- and underestimating their utility.
I think over and underestimating is a good way of putting it
I’m not as confident as you that the statistical approach that underpins LLMs produces anything like what we could reasonably call understanding, though
It may well be a *component* of understanding — making associations is key — but it's not at all clear that it can produce other elements of reasoning: logic, math, semantics, etc.
@Gergovie @clive @thomasfuchs The text that LLMs are trained on is an artifact of understanding and reasoning processes. And to the extent that the text outputs can capture the essence of those processes, LLMs mimic the processes themselves.
@clive @Gergovie @acjay Scams of this type all eventually collapse.
though to be fair “bacon-topped ice cream” is something McDonald’s probably should in reality have on the menu