GNU social JP
GNU social JP is a Japanese GNU social server.

Conversation

Notices

  1. Codeberg.org (codeberg@social.anoxinon.de)'s status on Saturday, 16-Aug-2025 01:59:06 JST

    We apologize for a period of extreme slowness today. The army of AI crawlers just leveled up and hit us very badly.

    The good news: We're keeping up with the additional load of new users moving to Codeberg. Welcome aboard, we're happy to have you here. After adjusting the AI crawler protections, performance significantly improved again.

    In conversation about 2 months ago from social.anoxinon.de permalink
    • Rich Felker, Mauricio Teixeira 🇧🇷🇺🇲, Peter Krefting and Gianmarco Gargiulo repeated this.
    • 翠星石 (suiseiseki@freesoftwareextremist.com)'s status on Saturday, 16-Aug-2025 01:59:01 JST
      in reply to
      @Codeberg >now that they managed to break through Anubis
      There was no break - it's a simple matter of changing the useragent, or if for some reason there's still a challenge, simply utilizing the plentiful computing power that is available on their servers (which far outstrips the processing power mobile devices have).

      Anubis is evil and is proprietary malware - please do not attack your users with proprietary malware.


      If you want to stop scraper bots, start serving GNUzip bombs - you can't scrape when your server RAM is full.

      dd if=/dev/zero bs=1G count=10 | gzip > /tmp/10GiB.gz
      dd if=/dev/zero bs=1G count=100 | gzip > /tmp/100GiB.gz
      dd if=/dev/zero bs=1G count=1025 | gzip > /tmp/1TiB.gz

      nginx:
      # serve gzip bombs
      location ~* /bombs-path/.*\.gz {
          add_header Content-Encoding "gzip";
          default_type "text/html";
      }

      # serve zstd bombs
      location ~* /bombs-path/.*\.zst {
          add_header Content-Encoding "zstd";
          default_type "text/html";
      }

      Then it's a matter of bait links that the user won't see, but bots will.
      In conversation about 2 months ago permalink
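The `dd | gzip` commands above rely on the fact that a long run of zero bytes compresses extremely well. A minimal sketch of that effect at a harmless scale, assuming Python's standard `gzip` module; the function name `zero_payload_ratio` and the 10 MiB size are illustrative choices, not part of the original post:

```python
import gzip


def zero_payload_ratio(size_bytes: int) -> float:
    """Return the uncompressed/compressed size ratio for a run of zero bytes."""
    raw = bytes(size_bytes)  # size_bytes zero bytes, comparable to reading /dev/zero
    packed = gzip.compress(raw, compresslevel=9)
    return len(raw) / len(packed)


# 10 MiB stands in for the 10 GiB file from the post (assumption: the
# ratio stays in the same range as the input grows).
size = 10 * 1024 * 1024
print(f"{size} zero bytes compress at roughly {zero_payload_ratio(size):.0f}:1")
```

The ratio is bounded by the DEFLATE format itself, so the file on disk stays small while a client that decompresses it without a size limit has to hold the expanded data.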
    • Codeberg.org (codeberg@social.anoxinon.de)'s status on Saturday, 16-Aug-2025 01:59:03 JST
      in reply to

      We have a list of explicitly blocked IP ranges. However, a configuration oversight on our part only blocked these ranges on the "normal" routes. The "anubis-protected" routes didn't consider the challenge. It was not a problem while Anubis also protected from the crawlers on the other routes.

      However, now that they managed to break through Anubis, there was nothing stopping these armies.

      It took us a while to identify and fix the config issue, but we're safe again (for now).

      In conversation about 2 months ago permalink

      SuperDicq repeated this.
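The configuration oversight described above (blocked ranges enforced on only some routes) can be sketched as a single decision function that checks the blocklist before anything route-specific. This is an illustration only, not Codeberg's configuration: the networks are reserved documentation ranges, and `CHALLENGE_ROUTES` and `decide` are hypothetical names.

```python
import ipaddress

# Hypothetical blocked ranges (reserved documentation prefixes).
BLOCKED_NETWORKS = [
    ipaddress.ip_network(net)
    for net in ("192.0.2.0/24", "198.51.100.0/24", "2001:db8::/32")
]

# Hypothetical route prefixes that sit behind a proof-of-work challenge.
CHALLENGE_ROUTES = ("/explore",)


def is_blocked(client_ip: str) -> bool:
    """True if the address falls inside any explicitly blocked range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in BLOCKED_NETWORKS)


def decide(route: str, client_ip: str) -> str:
    """Check the blocklist first, for every route, then apply the challenge."""
    if is_blocked(client_ip):
        return "deny"
    if route.startswith(CHALLENGE_ROUTES):
        return "challenge"
    return "allow"
```

Checking the blocklist before the route type means a client that solves the challenge still cannot reach the protected routes from a blocked range.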
    • Codeberg.org (codeberg@social.anoxinon.de)'s status on Saturday, 16-Aug-2025 01:59:04 JST
      in reply to

      However, we can confirm that at least Huawei networks now send the challenge responses, and they do seem to take a few seconds to compute the answers. It looks plausible, so we assume that AI crawlers leveled up their computing power to emulate more of real browser behaviour and bypass the variety of challenges that platforms have enabled to keep out the bot army.

      In conversation about 2 months ago permalink
    • Codeberg.org (codeberg@social.anoxinon.de)'s status on Saturday, 16-Aug-2025 01:59:05 JST
      in reply to

      It seems like the AI crawlers learned how to solve the Anubis challenges. Anubis is a tool hosted on our infrastructure that requires browsers to do some heavy computation before accessing Codeberg again. It really saved our nerves over the past months, because instead of manually maintaining blocklists we had a working way to tell "real browsers" from "AI crawlers".

      In conversation about 2 months ago permalink
      Haelwenn /элвэн/ :triskell: likes this.
      Rich Felker and Gianmarco Gargiulo repeated this.
    • 翠星石 (suiseiseki@freesoftwareextremist.com)'s status on Saturday, 16-Aug-2025 02:34:24 JST
      in reply to
      @Codeberg If you hang every single scraper that comes along, that not only protects you, but also everyone else that it's scraping.
      In conversation about 2 months ago permalink
    • Codeberg.org (codeberg@social.anoxinon.de)'s status on Saturday, 16-Aug-2025 02:34:26 JST
      in reply to
      • 翠星石

      @Suiseiseki BTW, we're also actively following the work around iocaine, e.g. https://come-from.mad-scientist.club/@algernon/statuses/01K2N54XEVTEYYAASHZ0P48FBT

      However, as far as we can see, it does not sufficiently protect from crawling. As the bot armies successfully spread over many servers and addresses, damaging one of them doesn't prevent the next one from doing harmful requests, unfortunately. ~f

      In conversation about 2 months ago permalink

    • Codeberg.org (codeberg@social.anoxinon.de)'s status on Saturday, 16-Aug-2025 02:34:27 JST
      in reply to
      • 翠星石

      @Suiseiseki Anubis is the option that saved us a lot of work over the past months. We are not happy about it being open core or using GitHub sponsors, but we acknowledge the position from the maintainer: https://codeberg.org/forgejo/discussions/issues/319#issuecomment-6382369

      Calling our usage of Anubis an attack on our users is far-fetched. But feel free to move elsewhere, or host an alternative without resorting to extreme measures. We're happy to see working proof that any other protection can be scaled up to the level of Codeberg. ~f

      In conversation about 2 months ago permalink

      Attachments

      1. Anubis - using proof-of-work to stop AI crawlers
        from forgejo
        - https://xeiaso.net/notes/2025/amazon-crawler/ - https://anubis.techaro.lol/ - https://anubis.techaro.lol/docs/design/how-anubis-works This solution came up more than once in the context of mitigating excessive crawling (either identified as AI or not). What do people think about it?
    • Rich Felker (dalias@hachyderm.io)'s status on Saturday, 16-Aug-2025 02:42:32 JST
      in reply to

      @Codeberg If some of the attack is coming from Huawei's cloud hosting, it might be worth sending a complaint to their abuse department. IME Chinese companies tend to be scared of breaking rules in international dealings like this.

      In conversation about 2 months ago permalink
    • SuperDicq (superdicq@minidisc.tokyo)'s status on Saturday, 16-Aug-2025 02:43:58 JST
      in reply to
      • 翠星石

      @Codeberg@social.anoxinon.de @Suiseiseki@freesoftwareextremist.com A lot of users cannot pass Anubis challenges because Anubis does not support every browser and is also incompatible with popular security-focused browser extensions such as JShelter.

      Asking your users to enable JavaScript and to disable security extensions like JShelter in order to visit your website is very bad, don't you agree?

      I don't think it is far-fetched to call it an attack on your users at all.

      In conversation about 2 months ago permalink
      翠星石 likes this.
    • 翠星石 (suiseiseki@freesoftwareextremist.com)'s status on Saturday, 16-Aug-2025 02:46:16 JST
      in reply to
      • Stefano Zacchiroli
      @zacchiro @Codeberg Yes, the major problem with Anubis is that the only people who get impacted by it are the users - bots either bypass it, or don't run it, or have more computing power and are happy to wait for a long time.
      In conversation about 2 months ago permalink
    • Stefano Zacchiroli (zacchiro@mastodon.xyz)'s status on Saturday, 16-Aug-2025 02:46:18 JST
      in reply to

      @Codeberg so, to clarify, do you have evidence that the bots were solving Anubis challenges or not, i.e., it was due to the configuration issue? (I think it's inevitably going to happen if Anubis gets traction. I'm just curious if we're already there or not.) Thanks for your work and transparency on all this.

      In conversation about 2 months ago permalink
    • Codeberg.org (codeberg@social.anoxinon.de)'s status on Saturday, 16-Aug-2025 02:55:29 JST
      in reply to

      For the load average auction, we offer these numbers from one of our physical servers. Who can offer more?

      (It was not the "wildest" moment, but the only one for which we have a screenshot)

      In conversation about 2 months ago permalink

      Attachments


      1. https://social.anoxinon.de/system/media_attachments/files/115/033/808/949/878/916/original/ba8b9ed8421eaa75.png
      Haelwenn /элвэн/ :triskell: likes this.
    • Phantasm (phnt@fluffytail.org)'s status on Saturday, 16-Aug-2025 03:05:16 JST
      in reply to
      • 翠星石
      @Suiseiseki @Codeberg
      >If you want to stop scraper bots, start serving GNUzip bombs
      Unironically illegal in certain jurisdictions since it is considered as a denial of service against someone.
      In conversation about 2 months ago permalink
    • 翠星石 (suiseiseki@freesoftwareextremist.com)'s status on Saturday, 16-Aug-2025 03:05:16 JST
      in reply to
      • Phantasm
      @phnt @Codeberg Cope and seethe.

      There is no denial of service against any human - any human will notice that the request seems to be timing out and cancel the request.

      Humans that use decent downloading software like GNU wget won't notice any issues either.

      Only curl infidel scrapers that are carrying out a DoS attack will be struck.
      In conversation about 2 months ago permalink
    • 翠星石 (suiseiseki@freesoftwareextremist.com)'s status on Saturday, 16-Aug-2025 03:16:54 JST
      in reply to
      • Noisytoot
      @noisytoot @Codeberg Although the root of the issue is the browser silently executing JavaScript without giving the user freedom, abusing that flaw to waste the user's CPU cycles, or make the site completely inaccessible to the user, to avoid implementing real, actually effective solutions to the problem is not excusable.

      If the user chooses to install and run a program that automatically downloads GIMP and runs it, that would be free software if released under a free license, as the user could:
      - Read the software before running it.
      - Change the software before running it.
      - Choose to run an older version if they prefer.
      - Make a modified version and share that with others.

      If you turned GIMP into SaaSS, that would render GIMP effectively proprietary (which is why GIMP and most GNU packages and most software should be licensed under the AGPLv3-or-later - as then at least the user would have the ability to run their own server and have freedom, or realize that there's a native version and use that instead).
      In conversation about 2 months ago permalink
    • Noisytoot (noisytoot@berkeley.edu.pl)'s status on Saturday, 16-Aug-2025 03:16:55 JST
      in reply to
      • 翠星石

      @Suiseiseki @Codeberg

      >browsers do not offer an interface that allow for that

      That sounds like an issue with those browsers, and does not make Anubis proprietary. You could make a more user-freedom-respecting browser. If there was a program that automatically downloaded, say, GIMP, and ran it on your computer, would the existence of that program make GIMP proprietary?

      In conversation about 2 months ago permalink
    • 翠星石 (suiseiseki@freesoftwareextremist.com)'s status on Saturday, 16-Aug-2025 03:16:56 JST
      in reply to
      • Noisytoot
      @noisytoot @Codeberg The user doesn't have the 4 freedoms (https://www.gnu.org/philosophy/free-sw.en.html#four-freedoms) with arbitrary remote JavaScript execution.

      With the freedom and security disaster of arbitrary remote code execution:
      - The user can't read the software before running it.
      - The user can't change the software before running it.
      - The user cannot choose to run an older version if they prefer.
      - The user cannot make a modified version and share that with others.

      Therefore, such JavaScript is not free software: even if it is under a free license, all the issues described in https://www.gnu.org/philosophy/javascript-trap.html and https://www.gnu.org/philosophy/who-does-that-server-really-serve.html apply.


      The only JavaScript that respects the user is JavaScript under a free license that the user actively chooses to download and execute. Browsers do not offer an interface that allows for that. Extensions like Haketilo and Greasemonkey currently come closest, but browsers severely restrict what user-loaded JavaScript is allowed to do. For example, it seems Firefox honors the "CSP" of a remote site over the user's wishes, which caused problems for uBlock Origin on sites that denied the loading of uBlock's scripts to stop it from working, until uBlock found a bypass.

      Therefore, the only reasonable solution to the JavaScript problem is to disable JavaScript, so that the only JavaScript executed is that of free software extensions.
      In conversation about 2 months ago permalink

      Attachments

      1. Who Does That Server Really Serve? - GNU Project - Free Software Foundation
      2. What is Free Software? - GNU Project - Free Software Foundation
      3. The JavaScript Trap - GNU Project - Free Software Foundation
    • Noisytoot (noisytoot@berkeley.edu.pl)'s status on Saturday, 16-Aug-2025 03:16:57 JST
      in reply to
      • 翠星石
      @Suiseiseki @Codeberg In what way is Anubis proprietary?
      In conversation about 2 months ago permalink
    • 翠星石 (suiseiseki@freesoftwareextremist.com)'s status on Saturday, 16-Aug-2025 03:20:27 JST
      in reply to
      @Codeberg Load average - no evil Anubis.
      In conversation about 2 months ago permalink

      Attachments


      1. https://media.freesoftwareextremist.com/media/19/2e/7c/192e7c6b5f1fbc389bfbdea90d9d124f0c8a239afdc508cb4fdf46c2584854de.webp
    • 翠星石 (suiseiseki@freesoftwareextremist.com)'s status on Saturday, 16-Aug-2025 03:26:52 JST
      in reply to
      • Marcos Dione
      • Phantasm
      @mdione @phnt @Codeberg I'm a free software extremist of the Church of Emacs.

      https://www.gnu.org/fun/jokes/gospel.html
      In conversation about 2 months ago permalink

      Attachments

      1. Gospel - GNU Project - Free Software Foundation
    • Marcos Dione (mdione@en.osm.town)'s status on Saturday, 16-Aug-2025 03:26:53 JST
      in reply to
      • 翠星石
      • Phantasm

      @Suiseiseki @phnt @Codeberg "infidel"? What are you, a pantomime religious extremist? Or an AI? :-P

      In conversation about 2 months ago permalink
    • RedTechEngineer (redtechengineer@fedi.lowpassfilter.link)'s status on Saturday, 16-Aug-2025 03:27:25 JST
      in reply to
      • 翠星石
      • Phantasm
      @phnt @Suiseiseki @Codeberg nonsensical. This is no different than automatically dumping spam callers into some obnoxious sound purgatory on your phone system.

      Or handing a bag of garbage to some guy knocking on your door trying to sell you something.
      In conversation about 2 months ago permalink
      翠星石 likes this.
    • RedTechEngineer (redtechengineer@fedi.lowpassfilter.link)'s status on Saturday, 16-Aug-2025 03:31:14 JST
      in reply to
      • 翠星石
      @Suiseiseki @Codeberg is this one of those dual Opteron systems?
      In conversation about 2 months ago permalink
    • 翠星石 (suiseiseki@freesoftwareextremist.com)'s status on Saturday, 16-Aug-2025 03:31:14 JST
      in reply to
      • RedTechEngineer
      @RedTechEngineer @Codeberg Yes, GNUbooted KGPE-D16.
      In conversation about 2 months ago permalink
    • RedTechEngineer (redtechengineer@fedi.lowpassfilter.link)'s status on Saturday, 16-Aug-2025 03:31:34 JST
      in reply to
      • 翠星石
      • Phantasm
      @phnt @Suiseiseki @Codeberg them choosing to eat garbage is their choice. no one is forcing them to eat garbage. you can put up a big giant neon sign saying this is garbage. no reasonable jury would convict you for providing garbage to garbage eaters.
      In conversation about 2 months ago permalink
      翠星石 likes this.
    • Phantasm (phnt@fluffytail.org)'s status on Saturday, 16-Aug-2025 03:31:35 JST
      in reply to
      • 翠星石
      • RedTechEngineer
      @RedTechEngineer @Suiseiseki @Codeberg You can argue however you want. It is the truth. You are causing a disruption to someone's systems by doing it, and doing it at scale is classified as a denial of service under some laws.

      Back in the dial-up days, you could send garbage packets to someone's modem with your T1 link and that was and is also a denial of service.
      In conversation about 2 months ago permalink
      翠星石 repeated this.
    • Phantasm (phnt@fluffytail.org)'s status on Saturday, 16-Aug-2025 03:32:13 JST
      in reply to
      • 翠星石
      • Marcos Dione
      @mdione @Suiseiseki @Codeberg No, he's just autistic about a group of licenses and an MIT researcher.
      In conversation about 2 months ago permalink
    • 翠星石 (suiseiseki@freesoftwareextremist.com)'s status on Saturday, 16-Aug-2025 03:32:13 JST
      in reply to
      • Marcos Dione
      • Phantasm
      @phnt @mdione @Codeberg (I was professionally diagnosed as not autistic).
      In conversation about 2 months ago permalink
    • 翠星石 (suiseiseki@freesoftwareextremist.com)'s status on Saturday, 16-Aug-2025 03:33:36 JST
      in reply to
      • RedTechEngineer
      • Phantasm
      @phnt @RedTechEngineer @Codeberg There is an attacker that is trying to cause disruptions to your systems at scale and defending yourself against such attack is quite reasonable.

      Scrapers haven't been a problem lately, but if they become one again, I reckon I'll start throttling connections to 5 bits/second (not a denial of service - they get a response eventually).
      In conversation about 2 months ago permalink
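To put the 5 bits/second figure in perspective, here is a small arithmetic sketch. The 10 KiB page size is an assumption chosen for illustration, and `transfer_seconds` is a hypothetical helper name:

```python
def transfer_seconds(size_bytes: int, bits_per_second: float) -> float:
    """Time needed to deliver size_bytes at the given throttled bit rate."""
    return size_bytes * 8 / bits_per_second


page_bytes = 10 * 1024  # assumed 10 KiB response
seconds = transfer_seconds(page_bytes, 5)
print(f"{seconds:.0f} seconds, about {seconds / 3600:.1f} hours")
```

At that rate a single small page ties up the requesting connection for about four and a half hours, so the response does arrive eventually, as the post says.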
    • 翠星石 (suiseiseki@freesoftwareextremist.com)'s status on Saturday, 16-Aug-2025 03:35:08 JST
      in reply to
      • Stefano Zacchiroli
      @Codeberg @zacchiro Clearly the solution now is to conclude that if you run JavaScript, you aren't human.

      The scraper problem is quite easily solved then;
      <script>
      document.body.innerHTML = 'We have detected that you have JavaScript enabled in your browser, please disable it to prove you are a human and continue.'
      </script>
      In conversation about 2 months ago permalink
    • Codeberg.org (codeberg@social.anoxinon.de)'s status on Saturday, 16-Aug-2025 03:35:09 JST
      in reply to
      • Stefano Zacchiroli

      @zacchiro Yes, the crawlers completed the challenges. We tried to verify if they are sharing the same cookie value across machines, but that doesn't seem to be the case.

      In conversation about 2 months ago permalink
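The cookie check mentioned above (whether one solved challenge is reused from many machines) can be sketched as a group-by over request records. This is only an illustration of the idea, not Codeberg's tooling; the `(ip, cookie)` record shape and the function names are assumptions:

```python
from collections import defaultdict


def ips_per_cookie(requests):
    """Map each cookie value to the number of distinct client IPs presenting it."""
    seen = defaultdict(set)
    for ip, cookie in requests:
        seen[cookie].add(ip)
    return {cookie: len(ips) for cookie, ips in seen.items()}


def shared_cookies(requests, min_ips=2):
    """Cookie values that show up from at least min_ips different addresses."""
    counts = ips_per_cookie(requests)
    return sorted(cookie for cookie, n in counts.items() if n >= min_ips)


# Hypothetical sample: cookie "aaa" is presented from two addresses.
sample = [
    ("203.0.113.1", "aaa"),
    ("203.0.113.2", "aaa"),
    ("203.0.113.3", "bbb"),
]
print(shared_cookies(sample))
```

An empty result would match what the post reports: each crawler machine solves its own challenge rather than sharing one token.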
    • 翠星石 (suiseiseki@freesoftwareextremist.com)'s status on Saturday, 16-Aug-2025 03:36:56 JST
      in reply to
      • RedTechEngineer
      • Phantasm
      @phnt @RedTechEngineer @Codeberg Evil Anubis causes disruptions to someone's system by causing massive resource waste via JavaScript.

      Mobile users for example can potentially be unable to use their devices for several minutes while an Anubis challenge is running.

      But that's not a denial of service?
      In conversation about 2 months ago permalink
    • 翠星石 (suiseiseki@freesoftwareextremist.com)'s status on Saturday, 16-Aug-2025 03:54:56 JST
      in reply to
      • Dan Jones
      @danjones000 @Codeberg The only way is to give scrapers some delicious bait that humans won't follow, but the scraper will.

      At the end of the bait, you can put gzip bombs, or, for something more complicated, multiple bait links, where multiple visits cause the IP to be temporarily nullrouted (a human may visit the bait once).


      Trying to identify the scraper via fingerprinting and/or JavaScript is doomed to fail, as scrapers can use the same browsers as users (firefox+xdotool will do, but headless browsers tend to be more reliable and less resource-intensive).
      In conversation about 2 months ago permalink
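The "multiple bait visits lead to a temporary block" idea above can be sketched as a small in-memory tracker. Everything here is an assumption made for illustration: the threshold of two hits (since a human may visit the bait once), the one-hour block, and the `BaitTracker` class itself. The `/bombs-path/` prefix is borrowed from the nginx example earlier in the thread.

```python
import time
from collections import defaultdict

BAIT_PREFIX = "/bombs-path/"  # bait location, as in the nginx example
MAX_BAIT_HITS = 2             # assumed threshold: one visit is tolerated
BLOCK_SECONDS = 3600          # assumed length of the temporary block


class BaitTracker:
    """Counts bait-link visits per IP and blocks repeat visitors for a while."""

    def __init__(self):
        self.hits = defaultdict(int)
        self.blocked_until = {}

    def is_blocked(self, ip, now=None):
        now = time.time() if now is None else now
        return self.blocked_until.get(ip, 0.0) > now

    def record(self, ip, path, now=None):
        """Record one request; return True if the IP is blocked afterwards."""
        now = time.time() if now is None else now
        if path.startswith(BAIT_PREFIX):
            self.hits[ip] += 1
            if self.hits[ip] >= MAX_BAIT_HITS:
                self.blocked_until[ip] = now + BLOCK_SECONDS
                self.hits[ip] = 0  # start counting again after the block
        return self.is_blocked(ip, now)
```

A real deployment would hand the blocked address to the firewall or routing layer; the tracker only shows the counting and expiry logic.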
    • Dan Jones (danjones000@microwords.goodevilgenius.org)'s status on Saturday, 16-Aug-2025 03:55:09 JST
      in reply to

      Pardon my ignorance, but couldn't they just be using a headless browser, which would still do everything a regular browser does? Just recently, ChatGPT beat Cloudflare's CAPTCHA using a similar system. Is there really any way around this at all? @Codeberg@social.anoxinon.de

      In conversation about 2 months ago permalink
    • daltux (daltux@snac.daltux.net)'s status on Saturday, 16-Aug-2025 20:18:02 JST
      in reply to
      • To-BOO-as HELL-gren
      @thanius@mastodon.chuggybumba.com @Codeberg@social.anoxinon.de What if we set up a #fail2ban jail (how would we do this?) to block them using HTTPd logs, rather than consuming more costly CPU and network resources to generate and send the trap file? Would this be a viable solution? 🧠💭

      #brainstorming #idea #robots
      In conversation about 2 months ago permalink
    • To-BOO-as HELL-gren (thanius@mastodon.chuggybumba.com)'s status on Saturday, 16-Aug-2025 20:18:04 JST
      in reply to

      @Codeberg Perhaps it's time to stop letting robots solve puzzles and instead feed them bombs. Do we know how well a ZIP bomb works on these crawlers?

      In conversation about 2 months ago permalink
    • daltux (daltux@snac.daltux.net)'s status on Saturday, 16-Aug-2025 20:40:00 JST
      in reply to
      • To-BOO-as HELL-gren
      Yes, they have plenty of resources. This is why I think feeding them the compressed file is going to end like Anubis. I would progressively block every IP address that tries to read it.

      All of this is like drying ice, of course. 🤷

      CC: @Codeberg@social.anoxinon.de
      In conversation about 2 months ago permalink
    • To-BOO-as HELL-gren (thanius@mastodon.chuggybumba.com)'s status on Saturday, 16-Aug-2025 20:40:02 JST
      in reply to
      • daltux

      @daltux I bet they rotate IPs @Codeberg

      In conversation about 2 months ago permalink
    • 翠星石 (suiseiseki@freesoftwareextremist.com)'s status on Monday, 18-Aug-2025 19:12:40 JST
      in reply to
      • To-BOO-as HELL-gren
      @thanius @Codeberg GNU zip bombs work on scrapers, as DEFLATE is a supported HTTP compression protocol.
      In conversation about 2 months ago permalink

GNU social JP is a social network, courtesy of the GNU social JP administrator. It runs on GNU social, version 2.0.2-dev, available under the GNU Affero General Public License.

Creative Commons Attribution 3.0 All GNU social JP content and data are available under the Creative Commons Attribution 3.0 license.