GNU social JP
GNU social JP is a Japanese GNU social server.
Conversation

Notices

  1. lucie lukas "minute" hartmann (mntmn@mastodon.social), Tuesday, 02-Dec-2025 23:09:55 JST

    in terms of "finding things in large texts", for example "find a page in this pdf that mentions both shutdown mode and reg18", are there interesting alternatives to all that llm stuff beyond regex search? are there natural language processing systems that are precise/reliable and understandable? i imagine something like a fuzzy parser with stemming and some sort of ontologies, synonyms and logical inference

    • lucie lukas "minute" hartmann (mntmn@mastodon.social), Tuesday, 02-Dec-2025 23:09:55 JST

      i don't like llms because they consume a lot of power and are connected to all the ai greed hype, they have to be strangely trained, their representations are not introspectable, they make tons of errors/are not reliable at all etc. i'd rather like a sharp, more mechanistic tool that just clearly says "error" when it can't do the job. grep is such a tool--would be nice to have a grep that can clean up and normalize messy human language a bit

      Haelwenn /элвэн/ :triskell: likes this.
    • Filip M. Nowak (fmn@mastodon.social), Tuesday, 02-Dec-2025 23:11:57 JST

      @mntmn "are there natural language processing systems that are precise/reliable and understandable?" - let me wear my noam chomsky hat for a second: there are no such systems and never will be. natural language is ever changing and ambiguous, and parties involved often don't have - sufficiently - common context required for precise communication. this is why people talking or even reading have so many back-and-forths.

      Haelwenn /элвэн/ :triskell: likes this.
    • Wolf480pl (wolf480pl@mstdn.io), Tuesday, 02-Dec-2025 23:52:10 JST, in reply to Janne Moren

      @jannem @mntmn
      AFAIK the way these LLM tools work is they have an embedding of words into a vector space, they index text by converting every word in every document to a vector, and storing it in a database together with the ID of the document it came from, and then when you search, they turn each of the query words into vectors, and search for K nearest neighbors in the vector space for each of them.

      Then they feed the documents they found to an LLM.

      What if you skipped the last step?

      Doughnut Lollipop 【記録係】:blobfoxgooglymlem: likes this.
    • lucie lukas "minute" hartmann (mntmn@mastodon.social), Tuesday, 02-Dec-2025 23:52:11 JST, in reply to Janne Moren

      @jannem i somehow find that hard to believe

    • Janne Moren (jannem@fosstodon.org), Tuesday, 02-Dec-2025 23:52:11 JST

      @mntmn
      I mean, there's been many attempts. Especially for constrained applications such as a corporate document store and things like that. As far as I know, none of those systems were ever a success.

    • Janne Moren (jannem@fosstodon.org), Tuesday, 02-Dec-2025 23:52:12 JST

      @mntmn
      No, not really. And that's a reason why small LLMs as language processors (not chatbots) are exciting.
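The "grep that normalizes messy language" idea raised earlier in the thread can be sketched without any model. The following is only an illustration under stated assumptions: `SYNONYMS` is a hypothetical hand-written table, `stem` is a crude suffix stripper standing in for a real stemmer, and `find_pages` is an invented helper name. The point is the behaviour asked for in the thread: every step is inspectable, and the tool raises an error instead of guessing when nothing matches.

```python
import re

# Hypothetical synonym table (an assumption for this sketch); a real one
# would come from a domain glossary. Keys are already-stemmed tokens.
SYNONYMS = {"reg": "register"}

SUFFIXES = ("ing", "ed", "s")


def stem(word):
    """Crude suffix stripping, a stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(text):
    """Lowercase, split letters from digits, stem, then map synonyms."""
    tokens = re.findall(r"[a-z]+|\d+", text.lower())
    return [SYNONYMS.get(stem(t), stem(t)) for t in tokens]


def contains_phrase(tokens, phrase):
    """True if the token list `phrase` occurs as a consecutive run in `tokens`."""
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


def find_pages(pages, terms):
    """Return 1-based numbers of the pages that contain every term.

    Raises LookupError when no page matches, rather than returning a guess.
    """
    queries = [normalize(term) for term in terms]
    if any(not query for query in queries):
        raise ValueError("a search term is empty after normalization")
    hits = []
    for number, text in enumerate(pages, start=1):
        tokens = normalize(text)
        if all(contains_phrase(tokens, query) for query in queries):
            hits.append(number)
    if not hits:
        raise LookupError(f"no page mentions all of: {terms}")
    return hits


pages = [
    "REG18 controls the charger current.",
    "In shutdown mode, register 18 is cleared.",
    "Shutdown modes are entered via REG 17.",
]
print(find_pages(pages, ["shutdown mode", "reg18"]))  # prints [2]
```

Because "reg18", "REG 18" and "register 18" all normalize to the same two tokens, the query matches page 2 even though the literal string "reg18" never appears there. The trade-off is that the synonym table and stemmer have to be maintained by hand.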

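Wolf480pl's suggestion of keeping the nearest-neighbour lookup and skipping the LLM step can be sketched roughly as follows. This is a toy under loud assumptions: `embed` uses sparse character-trigram counts as a stand-in for a learned embedding, the index is a plain list scanned linearly instead of a vector database, and the names `build_index` and `search` are invented for illustration. Production systems often embed whole passages rather than single words; the per-word indexing here simply follows the description in the thread.

```python
import math
import re
from collections import Counter, defaultdict


def embed(word):
    """Toy stand-in for a learned embedding: sparse character-trigram counts."""
    padded = f"#{word.lower()}#"
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents):
    """Store one (vector, doc_id) entry per word occurrence."""
    return [
        (embed(word), doc_id)
        for doc_id, text in documents.items()
        for word in re.findall(r"\w+", text)
    ]


def search(index, query, k=3):
    """Rank doc IDs by summed similarity over each query word's k nearest entries."""
    scores = defaultdict(float)
    for word in re.findall(r"\w+", query):
        qvec = embed(word)
        scored = [(cosine(qvec, vec), doc_id) for vec, doc_id in index]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        for similarity, doc_id in scored[:k]:
            if similarity > 0:
                scores[doc_id] += similarity
    # The ranked document IDs are returned directly; nothing is fed to an LLM.
    return sorted(scores, key=scores.get, reverse=True)


documents = {
    "p1": "REG18 controls the charger current",
    "p2": "shutdown mode clears register 18",
    "p3": "the display driver",
}
index = build_index(documents)
print(search(index, "registers shutdown", k=1))  # prints ['p2']
```

The result is a ranked list of document IDs with a similarity score behind each one, so a reader can see why a document was returned. How well such a search copes with real synonyms depends entirely on the embedding, which this toy version does not attempt to model.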

GNU social JP is a social network, courtesy of the GNU social JP administrator. It runs on GNU social, version 2.0.2-dev, available under the GNU Affero General Public License.

All GNU social JP content and data are available under the Creative Commons Attribution 3.0 license.