GNU social JP
GNU social JP is a GNU social server in Japan.

Conversation

Notices

  1. Tim Chambers (tchambers@indieweb.social)'s status on Saturday, 06-Apr-2024 01:11:16 JST

    Those worried Threads might scrape the Fedi for AI training data should rest assured that EVERY bit of the public Fedi is ALREADY being scraped and used by all the AI players. Because it is free and public, compared to this:

    "Rates vary by buyer and content type, but Braga said companies are generally willing to pay $1 to $2 per image, $2 to $4 per short-form video and $100 to $300 per hour of longer films. The market rate for text is $0.001 per word."

    https://www.reuters.com/technology/inside-big-techs-underground-race-buy-ai-training-data-2024-04-05/



    • Tim Chambers (tchambers@indieweb.social)'s status on Saturday, 06-Apr-2024 02:01:59 JST, in reply to Ulrike Hahn

      @UlrikeHahn True. But content that federates out to other servers that do not block it still leaves public Fediverse content up for grabs.

    • Ulrike Hahn (ulrikehahn@fediscience.org)'s status on Saturday, 06-Apr-2024 02:02:00 JST

      @tchambers you can block GPTBot in robots.txt

      https://www.theverge.com/2023/8/7/23823046/openai-data-scrape-block-ai

    • Tim Chambers (tchambers@indieweb.social)'s status on Saturday, 06-Apr-2024 02:12:00 JST, in reply to Ulrike Hahn

      @UlrikeHahn Well, I'd think if anyone follows, or boosts, or likes your content, then it is out there... plus there are relays that boost content from many Fediverse server public feeds, etc.

      This doesn't count intentional "scraping" of all public fediverse posts that clearly is also happening. For example even with search indexing marked as "off" on our server, see this:

      https://www.google.com/search?q=%40tchambers%40indieweb.social&sca_esv=e1291adcbee06b02&sca_upv=1&source=hp&ei=ljAQZrn2FJjl5NoPw4cm&iflsig=ANes7DEAAAAAZhA-poo7ekCzmuxLDAqodmpRmX5PMTMs&ved=0ahUKEwi5mYjLyKuFAxWYMlkFHcODCQAQ4dUDCBc&uact=5&oq=%40tchambers%40indieweb.social&gs_lp=Egdnd3Mtd2l6IhpAdGNoYW1iZXJzQGluZGlld2ViLnNvY2lhbEiZY1CRAlj4YHAXeACQAQKYAegCoAHTH6oBCDQ2LjMuMC4xuAEDyAEA-AEBmAI2oAK_F6gCCsICEBAAGAMYjwEY5QIY6gIYjAPCAgsQABiABBixAxiDAcICDhAuGIAEGLEDGMcBGNEDwgIREC4YgAQYsQMYgwEYxwEY0QPCAgsQLhiABBjHARjRA8ICCBAAGIAEGLEDwgIOEAAYgAQYigUYsQMYgwHCAggQLhiABBixA8ICExAAGIAEGIoFGLEDGIMBGEYY-QHCAhEQLhiABBiKBRixAxiDARjUAsICBRAAGIAEwgILEC4YgAQYsQMYgwHCAgQQABgDwgIOEC4YgAQYigUYsQMYgwHCAgUQLhiABMICBBAAGB7CAgYQABgeGArCAgYQLhgeGArCAgYQABgIGB7CAgkQABgeGMkDGArCAgsQABiABBiKBRiSA8ICCBAAGAUYHhgKwgIIEC4YBRgeGArCAgoQABgFGB4YDxgKwgIGEAAYBRgewgIIEAAYgAQYogSYAwSSBwQ1MS4zoAfbkgI&sclient=gws-wiz#ip=1

    • Ulrike Hahn (ulrikehahn@fediscience.org)'s status on Saturday, 06-Apr-2024 02:12:01 JST

      @tchambers but it only goes out to servers inasmuch as there is someone there who follows me. If I have a small account, it's not unlikely that I would escape being scraped, no? Certainly if awareness of this is raised…

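
For readers who want to try the robots.txt suggestion made in the thread, here is a minimal sketch in Python. It assumes a server operator who wants to turn away the `GPTBot` user agent while leaving other crawlers alone. The rule set, the `is_allowed` helper, and the `example.social` URL are illustrative assumptions, not anything posted in the conversation. The standard-library `urllib.robotparser` module is used only to check how a well-behaved crawler would read the rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a Fediverse server (an assumption for this sketch):
# refuse GPTBot everywhere, allow every other crawler.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler honouring robots_txt may fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    post_url = "https://example.social/notice/1"  # placeholder URL
    print(is_allowed(ROBOTS_TXT, "GPTBot", post_url))        # False
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", post_url))  # True
```

Note that robots.txt is advisory: it only affects crawlers that choose to honour it, and only on the server that publishes it. As the thread points out, copies of a post that have federated to other servers fall under those servers' own rules.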


GNU social JP is a social network, courtesy of the GNU social JP administrator. It runs on GNU social, version 2.0.2-dev, available under the GNU Affero General Public License.

All GNU social JP content and data are available under the Creative Commons Attribution 3.0 license.