Conversation

Notices

  1. tante (tante@tldr.nettime.org)'s status on Thursday, 16-Jan-2025 03:34:40 JST

    Cool project: "Nepenthes" is a tarpit to catch (AI) web crawlers.

    "It works by generating an endless sequences of pages, each of which with dozens of links, that simply go back into a the tarpit. Pages are randomly generated, but in a deterministic way, causing them to appear to be flat files that never change. Intentional delay is added to prevent crawlers from bogging down your server, in addition to wasting their time. Lastly, optional Markov-babble can be added to the pages, to give the crawlers something to scrape up and train their LLMs on, hopefully accelerating model collapse."

    https://zadzmo.org/code/nepenthes/
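
    (A minimal sketch of this page-generation approach appears after the thread.)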

    • Rich Felker (dalias@hachyderm.io)'s status on Thursday, 16-Jan-2025 03:37:20 JST
      in reply to altruios phasma

      @altruios @tante Because what they're doing is without consent, in violation of law in ways that normal ppl have had their lives ruined over, but they're backed by asshole billionaires so it's fine when they do it. We all benefit from sabotaging their scam products.

    • altruios phasma (altruios@mastodon.social)'s status on Thursday, 16-Jan-2025 03:37:21 JST
      in reply to tante

      @tante I have mixed feelings.

      Crawlers should respect robots.txt….

      At the same time: there is clearly an emotionally based bias happening with LLMs.

      I feel weird about the idea of actively sabotaging. Considering it is only aimed at bad actors… and considering that robots.txt is often too restrictive, in my opinion… the gray areas overlap a bit.

      Why should we want to actively sabotage AI dev? Wouldn’t that lead to possibly catastrophic results? Who benefits from dumber AI?

    • Paul Cantrell (inthehands@hachyderm.io)'s status on Thursday, 16-Jan-2025 03:38:42 JST
      in reply to Rich Felker

      @tante @dalias
      Heh, https://wookieepedia.org has been functioning this way for me since I opened up its robots.txt: it's dynamically generated on demand, ALL links work, and ALL pages exist. It's generated so much sustained load that I may need to throttle it too!


      Attachments

      1. Main Page - Wookieepedia, the hirsute encyclopedia
    • Marsh Gardiner 💡🐝🔧 (earth2marsh@hachyderm.io)'s status on Thursday, 16-Jan-2025 04:44:13 JST
      in reply to Rich Felker and Paul Cantrell

      @inthehands @tante @dalias OMG, that is a spectacular use of "hirsute." Bravo.

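Below is a rough, hypothetical sketch of the tarpit technique described in the first notice, written in Python purely for illustration. It is not the Nepenthes code (that is a separate project published at zadzmo.org); the word list, port, and page_for helper are invented for this example, and plain random word filler stands in for the optional Markov-babble.

```python
# Hypothetical sketch of a crawler tarpit, not the actual Nepenthes implementation.
# Ideas illustrated, following the quoted description:
#   * every URL resolves to a page whose links lead deeper into the maze,
#   * pages are pseudo-random but seeded from the request path, so repeated
#     fetches return identical content and the maze looks like flat files,
#   * each response is deliberately delayed to waste crawler time without
#     loading the host.
import hashlib
import random
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

WORDS = ["nebula", "kettle", "arbor", "quartz", "lantern", "mosaic",
         "ember", "saffron", "glacier", "tessera"]  # filler vocabulary

def page_for(path: str, links: int = 12, sentences: int = 20) -> str:
    """Return the same pseudo-random HTML page every time for a given path."""
    seed = int.from_bytes(hashlib.sha256(path.encode()).digest()[:8], "big")
    rng = random.Random(seed)  # deterministic: same path -> same page
    babble = " ".join(
        (" ".join(rng.choices(WORDS, k=rng.randint(5, 12))) + ".").capitalize()
        for _ in range(sentences)
    )
    anchors = "\n".join(
        f'<a href="/{rng.getrandbits(64):016x}/">{rng.choice(WORDS)}</a>'
        for _ in range(links)  # every link points back into the tarpit
    )
    return f"<html><body><p>{babble}</p>{anchors}</body></html>"

class TarpitHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        time.sleep(2)  # intentional delay: slow the crawler, spare the server
        body = page_for(self.path).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the console quiet
        pass

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), TarpitHandler).serve_forever()
```

Because the content is derived from a hash of the request path, every URL is valid and identical on each visit, which is what makes the maze look like a set of flat files that never change; the Markov-babble mentioned in the description would take the place of the simple word filler used here.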
