While I hesitate to encourage folk to run unnecessary queries, I do think the "made up phrase" + "meaning" Google search is a great exercise for helping people understand what generative AI actually does. I've found it hard to explain to people outside tech that its purpose is literally just to generate text, but this seems like a more intuitive demo that might make them question the output.
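If you want to hand someone the demo ready-made, here's a minimal sketch in Python (the phrase is one I just invented; any nonsense will do):

```python
# Build a Google search for an invented phrase plus "meaning", open it in
# the default browser, and see whether the AI overview invents a definition.
import urllib.parse
import webbrowser

phrase = "never butter a clockwise ferret"  # made up on the spot
query = urllib.parse.quote(f"{phrase} meaning")
webbrowser.open(f"https://www.google.com/search?q={query}")
```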
@tomknuf @jn @sue “Worked” is really a stretch here. The fundamental issue isn’t just the quality of the answer (yes, no wolf is better); it’s that the thing doesn’t understand what a boat is, or what crossing is, or what a river is, because it just doesn’t •understand• anything, period. That it does a better job than before of coherently repeating textual patterns doesn’t change the fact that it can’t think about the problem; it can only copy the answer to a seemingly related question.
@jn @inthehands @sue Yeah, I just tested this on Gemini 2.0 Flash, 2.5 Flash, and 2.5 Pro, and none of them really suffer from this, though 2.0 Flash wanted to take a guess. I suspect the one on Google Search is a very low-cost model.
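In case anyone wants to rerun the comparison, a minimal sketch using the google-generativeai Python package (the nonsense phrase is the same hypothetical one as above; the model IDs match Google's published naming at the time of writing, but check their current list):

```python
# Ask each Gemini model to define a phrase that doesn't exist, and see
# whether it admits ignorance or confidently makes something up.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # key from Google AI Studio

PROMPT = 'What does the phrase "never butter a clockwise ferret" mean?'

for name in ("gemini-2.0-flash", "gemini-2.5-flash", "gemini-2.5-pro"):
    model = genai.GenerativeModel(name)
    reply = model.generate_content(PROMPT)
    print(f"--- {name} ---\n{reply.text}\n")
```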
@elCelio @tomknuf @jn @sue I mean, yes: I think a lot of people have got through life by bullshitting everything, and that’s why they truly don’t understand why a bullshitting LLM can’t just do everyone’s job now.
@steve @chocobo13 @FrChazzz @sue Ironically, I have some kind of weird sensory disorder that leaves me averse to the smell of all fruit — including fresh tomatoes, eggplant, cucumber, and peppers. So this discussion really hits close to home for me!
(P.S. No worries about discussing this all in my mentions, just as long as I don’t have to smell any of it.)
@steve @inthehands @sue This inspires me to share my favorite definition of wisdom I've ever received (thanks to a sixth grader from when I was a teacher):
Knowledge is knowing that a tomato is a fruit. Wisdom is knowing that a tomato doesn't belong in a fruit salad.
@FrChazzz I've been thinking that there probably exists a really good savory technically-fruit salad involving tomatoes. Maybe eggplant as well? @steve @inthehands @sue
@thewhite969 @inthehands @paninid I mean, it may be both. I wouldn't be surprised if their "manual intervention" consists of having someone write a bunch of explainers and feeding them into the corpus with high weights.
What's telling is that they'll do this when the model embarrasses them with how obviously fake the impression of "intelligence" is, but not when the bot is vomiting up hate speech.
@inthehands @paninid It may be manual intervention, because it's a common fault. Or it might be that, because it's a common fault, there's a lot of content on the internet laughing about it, the LLM's training data is newer and includes that content, and in some magical way this leads the LLM to score this combination of words as more relevant to the question. The important thing in the end is: the LLM doesn't _understand_ anything.
@inthehands @jn @sue I think you misunderstood me. I meant "worked" in the sense that it still isn't able to answer this adequately, as opposed to the other replier's claim that LLMs nowadays don't make such mistakes anymore.
My suspicion is that they do actually attempt the whack-a-mole with hate speech, which keeps generating bad press, but there’s just •so• much of it in the training data.