Figure 3. Comparisons between the raw proportions of scientific articles and human-authored as well as LLM-generated article summaries that contain generalized conclusions, overall algorithmic overgeneralizations, and specific algorithmic overgeneralizations, presented by text source and test condition. Error bars represent standard errors.

https://nerdculture.de/system/media_attachments/files/114/354/538/766/411/596/original/fe1aa31155bac050.png

Notices where this attachment appears

Nick Byrd (byrdnick@nerdculture.de)'s status on Friday, 18-Apr-2025 09:54:52 JST

Most #LLMs over-generalized scientific results beyond the original articles
...even when explicitly prompted for accuracy!
The #AI was 5x worse than humans, on average!
Newer models were the worst. 🤦‍♂️
🔓 Accepted in #RoyalSociety Open #Science: https://doi.org/10.48550/arXiv.2504.00025
