GNU social JP
GNU social JP is a GNU social server in Japan.

    jonny (nonvenomous) (jonny@neuromatch.social)'s status on Sunday, 31-May-2026 13:45:12 JST, posted as a reply:

    To a person, the whole purpose of the test is for it to fail when it should. That's an elemental part of writing good tests: they must fail before the patch, or else they provide no protection. We want protection from failure, that is good for us. We need tests to protect us because we can't possibly evaluate all the other parts of a complex system when we try to fix one part of it.

    LLM slot machines change what tests mean - of course we still want the code to work good, but if we're not evaluating the code or the tests, then what the slot machine turns them into is just a high score and the jackpot condition. 130 new tests added, that means it's good. They pass, that means I win.

    The bugfix loop with LLMs defeats the purpose of automated tests and renders it no better than manual testing: you notice a bug, you yell at the LLM to fix it, you keep looking at the specific thing that's broken until it's fixed, good robot, ship it. The changes don't have meaningful tests, and nothing else does either, so the slot machine loop repeats, bug->fix->win. Very velocity. Rocket fuel even.
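
The post's point that a test "must fail before the patch" can be illustrated with a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration: `clamp_buggy` stands in for the pre-patch code, `clamp_fixed` for the patched code, and `protects` is an assumed helper that checks whether a test actually distinguishes the two.

```python
from typing import Callable

Clamp = Callable[[int, int, int], int]


def clamp_buggy(x: int, lo: int, hi: int) -> int:
    # Hypothetical pre-patch code: the upper bound is never applied.
    return max(x, lo)


def clamp_fixed(x: int, lo: int, hi: int) -> int:
    # Hypothetical patched code: both bounds are applied.
    return min(max(x, lo), hi)


def regression_test(clamp: Clamp) -> bool:
    # Exercises the broken behavior directly: a value above the upper bound.
    return clamp(15, 0, 10) == 10


def weak_test(clamp: Clamp) -> bool:
    # Only checks a value already in range, so the bug is never triggered.
    return clamp(5, 0, 10) == 5


def protects(test: Callable[[Clamp], bool]) -> bool:
    # A test offers protection only if it fails before the patch
    # and passes after it.
    return (not test(clamp_buggy)) and test(clamp_fixed)


if __name__ == "__main__":
    print("regression_test protects:", protects(regression_test))
    print("weak_test protects:", protects(weak_test))
```

Under these assumptions, `weak_test` passes against both versions, so adding it raises the test count without guarding against the bug coming back, while `regression_test` fails on the pre-patch code and passes on the patched code.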

GNU social JP is a social network, courtesy of the GNU social JP administrator. It runs on GNU social, version 2.0.2-dev, available under the GNU Affero General Public License.

All GNU social JP content and data are available under the Creative Commons Attribution 3.0 license.