Conversation

Ernie Smith (ernie@writing.exchange)'s status on Saturday, 29-Jun-2024 05:50:39 JST Ernie Smith

The irony is not lost on me that the Internet Archive went out of its way to acquire the physical versions of millions of books and loan them out carefully and in a limited way, and is facing a near-extinction-level event over it, while for-profit and VC-backed companies are just stealing people’s content and making up excuses to validate the bad behavior.

In conversation Saturday, 29-Jun-2024 05:50:39 JST from writing.exchange permalink
- Haelwenn /элвэн/ :triskell:, Alexandre Oliva and DCent like this.
- lainy (lain@lain.com)'s status on Saturday, 29-Jun-2024 22:43:30 JST lainy
  in reply to
  - feld
  - Mitch Effendi (ميتش أفندي)
  @feld @mitch @ernie there's no way to square this circle anyway, copyright is a nonsensical 'right' and i'm happy that llms are exposing it in the way that they are
  
  In conversation Saturday, 29-Jun-2024 22:43:30 JST permalink
- feld (feld@bikeshed.party)'s status on Saturday, 29-Jun-2024 22:43:31 JST feld
  in reply to
  - Mitch Effendi (ميتش أفندي)
  @mitch @ernie
  
  > coaxing an LLM to reproduce a reference work basically in full is pretty established research at this point.
  
  If that were the case, Sarah Silverman's lawsuit wouldn't be going so poorly. They even avoid claiming it can produce the reference works.
  
  In conversation Saturday, 29-Jun-2024 22:43:31 JST permalink
- Mitch Effendi (ميتش أفندي) (mitch@posts.dumb.stuff.donaberger.xyz)'s status on Saturday, 29-Jun-2024 22:43:32 JST Mitch Effendi (ميتش أفندي)
  in reply to
  
  @ernie after all, coaxing an LLM to reproduce a reference work basically in full is pretty established research at this point. We know it's possible — it's how the tech started, by being able to reproduce a ground truth image despite never having actually been exposed to the original file.
  I dunno, I'm just some idiot online.
  
  In conversation Saturday, 29-Jun-2024 22:43:32 JST permalink
- Mitch Effendi (ميتش أفندي) (mitch@posts.dumb.stuff.donaberger.xyz)'s status on Saturday, 29-Jun-2024 22:43:33 JST Mitch Effendi (ميتش أفندي)
  in reply to
  
  @ernie you know, I have to wonder if the inaction on prosecuting LLM training companies actually introduced a legal loophole for libraries.
  Consider that right now, the American legal standard is that GenAI output is considered a derivative work, even if it was derived from 30 billion works. I wonder: if the Internet Archive "chunked" editions of books together into a specialized model, could they then "loan" a book out by inferencing a near-exact but legally 'distinct' copy of that work?
  
  In conversation Saturday, 29-Jun-2024 22:43:33 JST permalink
- tom jennings (tomjennings@tldr.nettime.org)'s status on Sunday, 30-Jun-2024 00:58:50 JST tom jennings
  in reply to
  - jonny (nonvenomous)
  @ernie @jonny
  I honestly think that if IA's intent were profit they'd have an easier time arguing.
  But we'll see what the judges do this fall...
  
  In conversation Sunday, 30-Jun-2024 00:58:50 JST permalink
  
  Haelwenn /элвэн/ :triskell: likes this.
- Alexandre Oliva (lxo@gnusocial.net)'s status on Sunday, 30-Jun-2024 17:20:11 JST Alexandre Oliva
  in reply to
  - Brewster Kahle
  your post gave me the following idea:
  the archive should train some LLM on all of those books, and then publish the trained model.
  who'd want to borrow the books under DRM if they can have a locally-running LLM that can search, summarize or even "write" them on demand?
  crossing these rays would pit the LLM giants against the book MAFIAA. in such a fight, we should all be rooting for the fight, but if it brings LLM giants to defend the Internet Archive, that could be good?
  cc: @brewsterkahle
  
  In conversation Sunday, 30-Jun-2024 17:20:11 JST permalink
  
  DCent likes this.
- mybarkingdogs (mybarkingdogs@freeradical.zone)'s status on Saturday, 06-Jul-2024 00:36:28 JST mybarkingdogs
  in reply to
  
  @ernie Copyright is a tool of those rich enough to enforce it #AbolishCopyright
  
  In conversation Saturday, 06-Jul-2024 00:36:28 JST permalink
  
  DCent likes this.
- DCent (dcent@gnusocial.net)'s status on Saturday, 06-Jul-2024 01:51:53 JST DCent
  in reply to
  @ernie @mybarkingdogs @pbaesse @brewsterkahle
  Yes Alexandre, I basically had the same thoughts when reading what Ernie wrote. Thanks Ernie. And the picture is even worse for the Internet Archive, and other archives, given that these #LLM (both of those 'L' stand for 'leeching', don't they?) are at the same time acting as denial of service attackers.
  
  There is another aspect: how can it be that our media is basically owned by a handful of corporations that are not only able to profit from the sale of the content, but are now also able to watch the people watching the content? It's abuse that a well-trained LLM could help mitigate.
  
  I'm an #I2P maximalist and I've seen these LLMs supposedly available over I2P torrent; they are in the tens to hundreds of gigabytes. I don't know if it's worth the FSF operating one that can be trained on different software so as to help people set up, customise and enjoy freedom software. The subdomain might be a derogatory take on the #AI acronym, like aidiot.fsf.org, and maybe it can be trained to tell people that it may talk complete garbage, so check sources and/or man(ual) pages if in doubt. Anyway, I'm no expert in this area. I just don't like the (a)idea of it being used only by the large players to further entrench their power.
  In conversation Saturday, 06-Jul-2024 01:51:53 JST permalink
- DCent (dcent@gnusocial.net)'s status on Saturday, 06-Jul-2024 02:15:34 JST DCent
  in reply to
  @lxo @ernie @mybarkingdogs @pbaesse @brewsterkahle
  This is not to disparage but it must be pointed out that the Internet Archive is huge and maybe it is wrong that we burden the IA with the task of sole #archivist across the web. I know that not everyone can operate an archive but we ought to have at least a couple archives on each continent that can at a minimum archive the content that originates on their continent, or content in their language. I'm very interested in finding a way to store and deliver content in a decentralised manner, where people might even be rewarded in some small way for hosting it.
  
  Maybe I need to ask #aidiot 🤖 how to set this up. 🤣🤣🤣
  
  (As an aside, if I may, I'd like to say a word about the health of the fediverse that relates to the above point: please be careful of large content delivery systems for fedi. This might include masto.host, which appears to host writing.exchange. It also includes Cloudflare, which delivers images for freeradical.zone or, in the case of ursal.zone, afaict, delivers their entire service.)
  In conversation Saturday, 06-Jul-2024 02:15:34 JST permalink
  Attachments
  1. Ursalzona on Mastodon
    
    URSAL.zone is a moderated instance focused on progressive, feminist and antifascist activists from Latin America who oppose all forms of oppression, exploitation and humiliation between human beings. Hate speech, the spreading of lies and commercial pornography are prohibited. Read our Code of Conduct for more information.
  2. Free Radical
    
    Infosec and privacy and technology and leftward politics and cats and dogs and...
- DCent (dcent@gnusocial.net)'s status on Sunday, 07-Jul-2024 03:32:58 JST DCent
  in reply to
  - Alexandre Oliva
  - Brewster Kahle
  @lxo @brewsterkahle
  
  > let's focus on the point at hand (...) too many good discussions get derailed
  
  Yes, I put that last note in parentheses and started with "as an aside" so as to just provide a cautionary, semi-related note. A lot of fedizens don't know this side of things, and I find many appreciate learning about it. I can't say I've ever seen a derailment caused by talking about this.
  
  In conversation Sunday, 07-Jul-2024 03:32:58 JST permalink
- Ernie Smith (ernie@writing.exchange)'s status on Sunday, 07-Jul-2024 03:33:02 JST Ernie Smith
  in reply to
  - Alexandre Oliva
  - DCent
  @dcent @lxo to nail into your broader point, having a single archive work as the internet's custodian doesn't make sense long term, as it puts us at risk of broader legal challenges like the one we're facing.
  The LLM idea is very clever but I will point out that it seems like the courts don't have a great appetite for novel legal theories based on the fact that this whole debate hinges on controlled digital lending.
  We need better strategies for protecting archives.
  
  In conversation Sunday, 07-Jul-2024 03:33:02 JST permalink
- Ernie Smith (ernie@writing.exchange)'s status on Sunday, 07-Jul-2024 03:33:03 JST Ernie Smith
  in reply to
  - Alexandre Oliva
  - DCent
  @dcent @lxo let's focus on the point at hand, not the network where I post my content. Too many good discussions get derailed by debates like this.
  
  In conversation Sunday, 07-Jul-2024 03:33:03 JST permalink
- DCent (dcent@gnusocial.net)'s status on Sunday, 07-Jul-2024 03:44:47 JST DCent
  in reply to
  @ernie @lxo @brewsterkahle I don't have experience with courts in this area, but you're probably right, and thus an LLM might be an area to explore, not as an argument in court but as a practical method (of burning more energy than is needed) to help people find and understand works.
  
  I get the sense that the corporate actors have minimal or no interest in #archives until they need to cite something; then they appreciate them. Many websites are under #copyright. How, then, can a distinction be made between someone requesting a copyrighted webpage and a copyrighted book? Is it a matter of payment? Webpages are delivered freely, unlike #books, but if one were to bring together enough #webpages with enough quotes from a book or film, one might be able to collect something that encompasses the entire work. Should works be immune from being archived because someone demands payment for them?
  
  Maybe #lawsuits of this nature are a sign that in an epoch where everything seems to be on a cost curve approaching zero, people are desperate to claw back as much income as possible and are now starting to attack basic #institutions. Maybe a universal basic income (#UBI) is needed to allow artists and writers of all ilks to survive, including #journalists, whose plight seems to be a recurring topic here in #Australia.
  
  In conversation Sunday, 07-Jul-2024 03:44:47 JST permalink
