GNU social JP
GNU social JP is a Japanese GNU social server.

Notices by Nick Byrd (byrdnick@nerdculture.de)

  1. Nick Byrd (byrdnick@nerdculture.de)'s status on Friday, 18-Apr-2025 09:54:52 JST

    Most #LLMs over-generalized scientific results beyond the original articles

    ...even when explicitly prompted for accuracy!

    #AI summaries were nearly 5x more likely than human-written ones to overgeneralize!

    Newer models were the worst.🤦♂️

    🔓 Accepted in #RoyalSociety Open #Science: https://doi.org/10.48550/arXiv.2504.00025

    Attachments


    1. https://nerdculture.de/system/media_attachments/files/114/354/538/763/194/441/original/2761776820dc6283.png

    2. https://nerdculture.de/system/media_attachments/files/114/354/538/766/411/596/original/fe1aa31155bac050.png
    3. Generalization Bias in Large Language Model Summarization of Scientific Research (arxiv.org)
      Artificial intelligence chatbots driven by large language models (LLMs) have the potential to increase public science literacy and support scientific research, as they can quickly summarize complex scientific information in accessible terms. However, when summarizing scientific texts, LLMs may omit details that limit the scope of research conclusions, leading to generalizations of results broader than warranted by the original study. We tested 10 prominent LLMs, including ChatGPT-4o, ChatGPT-4.5, DeepSeek, LLaMA 3.3 70B, and Claude 3.7 Sonnet, comparing 4900 LLM-generated summaries to their original scientific texts. Even when explicitly prompted for accuracy, most LLMs produced broader generalizations of scientific results than those in the original texts, with DeepSeek, ChatGPT-4o, and LLaMA 3.3 70B overgeneralizing in 26 to 73% of cases. In a direct comparison of LLM-generated and human-authored science summaries, LLM summaries were nearly five times more likely to contain broad generalizations (OR = 4.85, 95% CI [3.06, 7.70]). Notably, newer models tended to perform worse in generalization accuracy than earlier ones. Our results indicate a strong bias in many widely used LLMs towards overgeneralizing scientific conclusions, posing a significant risk of large-scale misinterpretations of research findings. We highlight potential mitigation strategies, including lowering LLM temperature settings and benchmarking LLMs for generalization accuracy.
  2. Nick Byrd (byrdnick@nerdculture.de)'s status on Sunday, 30-Mar-2025 19:17:33 JST

    Overheard at a conference about #AI in #Medicine:

    Speaker: "I hear neurologists prefer we say that generative AI systems 'confabulate' and not that they 'hallucinate'."

    Neurologist [shouting from the back of the room]: "CORRECT!"

    #psychiatry #neuroscience #sciComm #edu

    Attachments


    1. https://nerdculture.de/system/media_attachments/files/114/235/056/717/858/390/original/854e0b7962e7abf9.jpeg
  3. Nick Byrd (byrdnick@nerdculture.de)'s status on Wednesday, 05-Feb-2025 04:48:31 JST

    Alright nerds,

    What are the *easiest* methods to #repost or #crosspost my #Mastodon posts to #BlueSky (or vice versa)?

    In other words, how can I make my BlueSky account (@byrdnick.com) post whatever I post to this Mastodon account (@ByrdNick)? (Or vice versa?)

    #socialMedia #webhosting #API

  4. Nick Byrd (byrdnick@nerdculture.de)'s status on Wednesday, 05-Feb-2025 04:48:27 JST
    in reply to

    BlueSky Crossposter™©® worked (after plenty of troubleshooting and some recoding): https://nerdculture.de/@ByrdNick/113454337286905203

    Attachments

    1. Nick Byrd, Ph.D. (@ByrdNick@nerdculture.de)
      I'm trying out "Bluesky Crossposter™©® developed by denvitadrogen": https://github.com/Linus2punkt0/bluesky-crossposter/tree/main If you're seeing this post show up somewhere other than @bsky.app, then I got it working. Otherwise 😒
  5. Nick Byrd (byrdnick@nerdculture.de)'s status on Wednesday, 05-Feb-2025 04:48:23 JST
    in reply to

    After #BlueSky Crossposter 👆 stopped working for me, I found #Fedica, which has been crossposting to nearly 10 platforms (for free!):

    https://nerdculture.de/@ByrdNick/113483550862956198

  6. Nick Byrd (byrdnick@nerdculture.de)'s status on Thursday, 14-Sep-2023 23:27:03 JST

    Remember that "...WEIRDest people in the world" paper?

    Now #xPhi has one: Of "171 experimental philosophy studies [from] 2017 [to] 2023 [including one of mine] most ...tested only Western populations but generalized beyond them without justification."

    Incentives may be part of the issue: "studies with broader conclusions ...had higher citation impact."

    https://doi.org/10.1017/psa.2023.109

    #xPhi #PsychMethods #Culture #Demography #PhilSci



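For readers unfamiliar with the statistic quoted in the abstract attached to the first notice (OR = 4.85, 95% CI [3.06, 7.70]), the sketch below shows one common way an odds ratio and a Wald-style 95% confidence interval can be computed from a 2x2 table. The counts are hypothetical and chosen only for illustration; they are not the study's data, and the paper may well have used a different method (for example, a regression model). The function name `odds_ratio_with_ci` is likewise made up for this example.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Return (odds ratio, CI lower bound, CI upper bound) for a 2x2 table.

    a: group 1 summaries WITH a broad generalization
    b: group 1 summaries WITHOUT one
    c: group 2 summaries WITH a broad generalization
    d: group 2 summaries WITHOUT one
    z: normal quantile for the interval (1.96 gives roughly 95%)
    All four counts must be positive for this simple formula to apply.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual Wald approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (NOT from the paper): 60 of 100 LLM summaries and
# 20 of 100 human summaries contain a broad generalization.
odds_ratio, lower, upper = odds_ratio_with_ci(60, 40, 20, 80)
print(f"OR = {odds_ratio:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

An interval that lies entirely above 1, as the one reported in the abstract does, indicates that the first group's odds of overgeneralizing are higher than the second group's at the chosen confidence level.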
    Nick Byrd

    Ph.D. | Using #decisionScience to understand how (to improve the ways) we think (as individuals and groups). At Stevens.edu in #NYC metro area. I use this more for work (#CogSci, #Psychology, #Philosophy, #Rationality, #Debiasing, #Depolarization, #Teaching, #SciComm, and #rStats) than the more personal stuff I post to Instagram (@byrd.nick). Profile picture: not Neil Patrick Harris, but someone who is told almost daily that they bear an uncanny resemblance to him.

    Statistics

    • User ID: 172495
    • Member since: 14 Sep 2023
    • Notices: 6
    • Daily average: 0

GNU social JP is a social network, courtesy of the GNU social JP administrator (GNU social JP管理人). It runs on GNU social, version 2.0.2-dev, available under the GNU Affero General Public License.

All GNU social JP content and data are available under the Creative Commons Attribution 3.0 license.