Conversation

Notices

  1. ⍨ (chaz@burn.capital)'s status on Saturday, 26-Oct-2024 23:38:09 JST

    As the OSI prepares to make official its "open source AI" definition with a glaring lack of any requirement that the actual source (training data) be made available, it's worth noting that their work is funded by google, meta, microsoft, salesforce, etc. What does open source even mean here if the literal source of the model isn't open? These companies are invested in making you think they're on your side while they boil the oceans to avoid paying human beings for labor.

    The idea behind open source, as it grew out of the free software movement, has always been to water down software freedoms, to create something more palatable to corporate interests that *sounds* good but means very little. This continues that work for the current "gen AI" bubble. It's time to ditch open source as an ideal, and the OSI especially.

    https://opensource.org/ai/drafts/the-open-source-ai-definition-1-0-rc2

    #OpenSource #OpenSourceAI #OSI #OpenSourceInitiative #FreeSoftware #AI #GenAI #GenerativeAI

    In conversation about 7 months ago from burn.capital
    • ⍨ (chaz@burn.capital)'s status on Saturday, 02-Nov-2024 04:12:04 JST

      They posit you can still modify (tune) the distributed models without the training source. By that logic, you can also modify a binary executable without its source code. Frankly, that's unacceptable if we actually care about the human beings using the software.

      A key pillar of freedom as it relates to software is reproducibility. The ability to build a tool from scratch, in your own environment, with your own parameters, is absolutely indispensable to both learning how the tool works and changing the tool to better serve your needs, especially if your needs fall on the outskirts of the bell curve.

      There's also the issue of auditability. If you can't run the full build process yourself, producing your own results from scratch in a trusted environment to compare with what's distributed, it becomes exponentially harder to verify any claims about how a tool supposedly works.

      Without the training data, this all becomes impossible for AI models. The OSI knows this. They're choosing to ignore it for the sake of expediency for the companies paying their bills, who want to claim "open" because it sounds good while actually hiding the (largely stolen and fraudulently or non-consensually acquired) source material of their current models.

      Do we want a new definition of "open source" that actively thwarts analysis and tinkering, two fundamental requirements of software that respects human beings today? Reject this nonsense.

      #OpenSource #OpenSourceAI #OSI #OpenSourceInitiative #FreeSoftware #AI #GenAI #GenerativeAI

      In conversation about 7 months ago
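
      One way to make chaz's reproducibility-and-auditability point concrete: a minimal sketch, assuming PyTorch, with a toy linear model and synthetic data standing in as illustrative placeholders for a real training set. With the data in hand, anyone can rebuild the weights deterministically and compare digests against a distributed checkpoint; without it, the audit cannot even begin.

        # A minimal sketch, assuming PyTorch; the model, data, and seed are
        # illustrative stand-ins, not anything from the thread.
        import hashlib
        import torch

        def train_from_scratch(seed: int) -> torch.nn.Module:
            # Deterministic toy training run: same seed and same data in,
            # byte-identical weights out.
            torch.manual_seed(seed)
            model = torch.nn.Linear(4, 1)
            opt = torch.optim.SGD(model.parameters(), lr=0.1)
            x = torch.randn(64, 4)            # stands in for the training data
            y = x.sum(dim=1, keepdim=True)
            for _ in range(100):
                opt.zero_grad()
                loss = torch.nn.functional.mse_loss(model(x), y)
                loss.backward()
                opt.step()
            return model

        def weights_digest(model: torch.nn.Module) -> str:
            # Hash the trained weights so independent rebuilds can be
            # compared against a distributed checkpoint.
            h = hashlib.sha256()
            for tensor in model.state_dict().values():
                h.update(tensor.detach().cpu().numpy().tobytes())
            return h.hexdigest()

        # Two independent "builds" from the same inputs match exactly; without
        # the training data (x and y here), nobody else can run this audit.
        assert weights_digest(train_from_scratch(42)) == weights_digest(train_from_scratch(42))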

    • Alexandre Oliva (moving to @lxo@snac.lx.oliva.nom.br) (lxo@gnusocial.jp)'s status on Tuesday, 05-Nov-2024 06:08:18 JST
      in reply to
      • 🎓 Doc Freemo :jpf: 🇳🇱
      • Josh Gay
      • Stefano Zacchiroli
      I'm in no way affiliated with the OSI, and I know very little of current LLM tech, but I've been thinking a lot about this issue from a software freedom philosophical perspective, trying to figure out how essential training data is for users to have the four essential freedoms.

      It's not obvious to me whether having access to the training data places users and developers at an advantage or at a disadvantage compared with those who don't have access to it. Training data is so massive, and the link from any of it to the system's behavior is so subtle, that it seems conceivable to me that probing the system's behavior and relying on incremental training might be more efficient and more reliable than analysis of the training set, for at least some past, current, and future technology.

      Since I don't know enough about current systems to tell, I set out to devise a method to find the answer to that question. I'm thinking of an adversarial setting in which users/developers who have access to the training data compete with users/developers who don't, both trying to find answers to questions about how the system works and to modify the system so that it does what is requested (these are analogous to freedom #1), with questions and change requests coming from adversarial proponents. This would be a kind of Turing test for whether any given system respects freedom #1 (the other freedoms are much easier to assess), and it could be applied to any future such systems as well. Has any such thing been considered? Does it seem worth doing, or even thinking more about? cc: @joshuagay @zacchiro @freemo
      In conversation about 7 months ago
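
      One way to make lxo's proposed trial concrete: a hypothetical harness, sketched below, in which every name and interface is invented for illustration. A team with the training data and a team without it face the same adjudicated challenges, and the scores indicate whether data access actually helped exercise freedom #1.

        # Hypothetical harness for the adversarial test described above.
        # All names and interfaces are invented for illustration.
        from dataclasses import dataclass
        from typing import Callable

        @dataclass
        class Challenge:
            question: str                       # e.g. "why does the model refuse input X?"
            check: Callable[[str], bool]        # adjudicates a proposed answer

        @dataclass
        class Team:
            name: str
            has_training_data: bool
            answer: Callable[[Challenge], str]  # probing, incremental training, or data analysis
            score: int = 0

        def run_trial(teams: list[Team], challenges: list[Challenge]) -> None:
            # Score each team on the same challenges. If the with-data team
            # does no better, access to the training data did not help
            # exercise freedom #1 in practice.
            for challenge in challenges:
                for team in teams:
                    if challenge.check(team.answer(challenge)):
                        team.score += 1
            for team in teams:
                print(f"{team.name} (data={team.has_training_data}): {team.score}")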
    • Stefano Zacchiroli (zacchiro@mastodon.xyz)'s status on Tuesday, 05-Nov-2024 19:10:20 JST
      in reply to
      • Alexandre Oliva (moving to @lxo@snac.lx.oliva.nom.br)

      @lxo @chaz It's a good approach, but I don't think it's needed.

      If we start from first principles, there's no doubt that to fully exercise the freedoms to study and modify, you need the training data. (You can exercise *some* of those freedoms even without training data, but in a suboptimal way. I can give precise examples if you're curious.)

      The "data is too big" problem is IMO a distraction. There are relevant ML systems that are small enough to make retraining them from scratch viable.

      In conversation about 7 months ago
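
      Zacchiroli's "small enough to retrain" point is easy to demonstrate at the low end. A minimal sketch, assuming scikit-learn and using its bundled digits dataset as a stand-in for a small ML system whose training data ships with it:

        # A minimal sketch, assuming scikit-learn; the digits dataset stands
        # in for a small ML system whose training data ships with it.
        from sklearn.datasets import load_digits
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import train_test_split

        X, y = load_digits(return_X_y=True)      # the full "training data"
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

        # Retraining from scratch takes seconds at this scale, so the
        # freedom to study and modify via retraining is entirely practical.
        clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
        print(f"held-out accuracy: {clf.score(X_te, y_te):.3f}")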
    • Stefano Zacchiroli (zacchiro@mastodon.xyz)'s status on Tuesday, 05-Nov-2024 19:17:00 JST
      in reply to
      • Alexandre Oliva (moving to @lxo@snac.lx.oliva.nom.br)

      @lxo @chaz AFAICT even the OSI recognizes this. Their main arguments against mandating training data in the OSAID are of a pragmatic nature, related to (1) the current state of the industry, and (2) the legal regimes that apply to data, which differ from those that apply to (free) software.

      I disagree with the decision taken based on those arguments, but I understand them. Either way, they don't call into question the fact that having the training data is *better* than not having it for exercising user freedoms.

      In conversation about 7 months ago
    • Alexandre Oliva (moving to @lxo@snac.lx.oliva.nom.br) (lxo@gnusocial.jp)'s status on Thursday, 07-Nov-2024 12:33:42 JST
      in reply to
      • Stefano Zacchiroli
      I am curious, and I'd welcome both precise and imprecise examples ;-) Thanks in advance.
      In conversation about 6 months ago
    • mangeurdenuage :gnu: :trisquel: :gondola_head: 🌿 :abeshinzo: :ignucius: (mangeurdenuage@shitposter.world)'s status on Thursday, 07-Nov-2024 23:28:52 JST
      @chaz
      AI isn't an issue imo; proprietary AI is, though.
      https://www.fsf.org/news/fsf-is-working-on-freedom-in-machine-learning-applications
      In conversation about 6 months ago

      Attachments

      1. FSF is working on freedom in machine learning applications — Free Software Foundation
         The FSF is a charity with a worldwide mission to advance software freedom.
    • ⍨ (chaz@burn.capital)'s status on Saturday, 09-Nov-2024 00:54:04 JST

      You don't have to take my word for it; here's Schneier himself saying this open source AI definition is "terrible":

      https://www.schneier.com/blog/archives/2024/11/ai-industry-is-trying-to-subvert-the-definition-of-open-source-ai.html

      In conversation about 6 months ago
    • fu (fu@libranet.de)'s status on Saturday, 09-Nov-2024 01:10:19 JST
      @chaz Yet another example of why the open source development model isn't what is important. Respecting freedom is what matters. Open source is antithetical to Free Software.
      In conversation about 6 months ago


GNU social JP is a social network, courtesy of GNU social JP管理人 (the site administrator). It runs on GNU social, version 2.0.2-dev, available under the GNU Affero General Public License.

All GNU social JP content and data are available under the Creative Commons Attribution 3.0 license.