GNU social JP
Conversation

Notices

  1. Paul Cantrell (inthehands@hachyderm.io)'s status on Tuesday, 10-Feb-2026 01:30:34 JST

    LLMs have no model of correctness, only typicality. So:

    “How much does it matter if it’s wrong?”

    It’s astonishing how frequently both providers and users of LLM-based services fail to ask this basic question — which I think has a fairly obvious answer in this case, one that the research bears out.

    (Repliers, NB: Research that confirms the seemingly obvious is useful and important, and “I already knew that” is not information that anyone is interested in except you.)

    1/ https://www.404media.co/chatbots-health-medical-advice-study/

    In conversation about a month ago from hachyderm.io

    Attachments

    1. Chatbots Make Terrible Doctors, New Study Finds
      from @samleecole
      Chatbots provided incorrect, conflicting medical advice, researchers found: “Despite all the hype, AI just isn't ready to take on the role of the physician.”
    • Paul Cantrell (inthehands@hachyderm.io)'s status on Tuesday, 10-Feb-2026 01:32:29 JST

      Despite the obviousness of the larger conclusion (“LLMs don’t give accurate medical advice”), this passage is…if not surprising, exactly, at least really really interesting.

      2/

      In conversation about a month ago

      Attachments


      1. https://media.hachyderm.io/media_attachments/files/116/041/626/055/665/046/original/fd66607f26441bce.png
    • Paul Cantrell (inthehands@hachyderm.io)'s status on Tuesday, 10-Feb-2026 01:35:36 JST

      There’s a lesson here, perhaps, about the tangled relationship between what is •typical• and what is •correct•, and what it is that LLMs actually do:

      When medical professionals ask medical questions in technical medical language, the answers they get are typically correct.

      When non-professionals ask medical questions in a perhaps medically ill-formed vernacular mode, the answers they get are typically wrong.

      The LLM readily models both of these things. It has no notion of correctness in either case; correctness is simply more statistically typical in one register than in the other. (A toy sketch after the thread makes this concrete.)

      3/

      In conversation about a month ago
    • Brian Marick (marick@mstdn.social)'s status on Tuesday, 10-Feb-2026 06:39:23 JST

      @inthehands An aside. When people used to ask Dawn whether it wasn’t hard to treat animals because “they can’t tell you what’s wrong,” she’d answer that they also can’t lie about it. She thought the latter probably outweighed the former.

      In conversation about a month ago
    • Paul Cantrell (inthehands@hachyderm.io)'s status on Tuesday, 10-Feb-2026 06:39:39 JST

      RE: https://girlcock.club/@miss_rodent/116041738842160668

      This is another, crisper way of saying what I meant by the previous post: if it sounds like a medical textbook, you’re more likely to get a diagnosis; if it sounds like a tweet, you’re more likely to get a shitpost.

      The tone, vocabulary, and style of the question change the likelihood that the answer is correct.

      4/

      In conversation about a month ago

      Attachments

      1. V (@miss_rodent@girlcock.club)
        from V
        @inthehands@hachyderm.io This result makes sense - they generate *statistically likely* text based on a prompt, and the stolen words of basically the entire internet and several libraries worth of books. If the prompt is such that the text it generates is statistically-likely to be correct - the language used closely aligns with a medical textbook, diagnostic manual, etc. - it's more likely to generate text based on sources like that. If it sounds like a tweet, you're more likely to get a shitpost.
    • Paul Cantrell (inthehands@hachyderm.io)'s status on Tuesday, 10-Feb-2026 06:39:53 JST
      in reply to Brian Marick

      @marick
      That’s profound.

      (Though also: I know that guinea pigs can be notoriously difficult to diagnose because, as prey animals, they’re very good at hiding that they have a problem!)

      In conversation about a month ago
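
A toy sketch of the point in 3/ and 4/: a sampler optimizes typicality, and correctness rides along only when the prompt's register makes correct text the typical continuation. The registers, answers, and probabilities below are all invented for illustration; none of them come from the study or the thread.

import random

# Invented conditional distributions standing in for what a model absorbs
# from its training data: P(answer | register of the prompt). Note that
# "correct" appears nowhere in the sampling rule itself.
CONTINUATIONS = {
    "clinical jargon":  [("textbook answer", 0.8), ("plausible-sounding error", 0.2)],
    "vernacular tweet": [("plausible-sounding error", 0.7), ("textbook answer", 0.3)],
}

def sample_answer(register: str) -> str:
    """Pick an answer weighted purely by how typical it is for this register."""
    answers, weights = zip(*CONTINUATIONS[register])
    return random.choices(answers, weights=weights, k=1)[0]

if __name__ == "__main__":
    for register in CONTINUATIONS:
        tally = {"textbook answer": 0, "plausible-sounding error": 0}
        for _ in range(10_000):
            tally[sample_answer(register)] += 1
        print(f"{register}: {tally}")

The same sampling rule yields mostly correct answers in one register and mostly wrong ones in the other; nothing in the code ever checks correctness, which is the thread's point.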

