GNU social JP
GNU social JP is a Japanese GNU social server.

Conversation

  1. Matt Wilcox (mattwilcox@mstdn.social)'s status on Sunday, 25-Feb-2024 22:26:56 JST

    Back when I was a kid at school we had IT lessons. One of the first things we were taught was “GIGO: garbage in, garbage out.”

    A computer supplied with rubbish can only output rubbish.

    Anyway, if you train LLMs on “everything out there”, you’re going to get “general shit” out. A model can’t know what’s good or bad or right, just what’s most likely. Think of the most average person imaginable with the most average hot take.

    Let me know when you can train LLMs locally on curated content.

    • Matt Wilcox (mattwilcox@mstdn.social)'s status on Sunday, 25-Feb-2024 22:26:54 JST
      in reply to

      Let me train my models locally on archived Wikipedia. Let me toss in my own trusted sources like MDN.

      I think the real source of value in all of this is _well curated datasets_. All the hype around the models themselves is largely going to die out. The initial strategy of feeding in enormous amounts of data of any quality has such a narrow viable use case, IMO: “understanding what is being asked”, but not “answering the question being asked”.

    • Matt Wilcox (mattwilcox@mstdn.social)'s status on Sunday, 25-Feb-2024 22:26:55 JST
      in reply to

      I’m hoping that some Open Source project will start collecting and distributing _trusted data collections_ as raw material to train LLMs on. That’s where the value is, not in the models themselves, which are mostly trying to un-shittify the shit they got fed.

      To me, the biggest issue with the whole thing is that _I do not want something trained on the entirety of the crap out there_. We _all_ know that most “content” available is biased, incorrect, racist, ignorant, hot garbage.



GNU social JP is a social network, courtesy of GNU social JP管理人. It runs on GNU social, version 2.0.2-dev, available under the GNU Affero General Public License.

All GNU social JP content and data are available under the Creative Commons Attribution 3.0 license.