GNU social JP
  • FAQ
  • Login
GNU social JPは日本のGNU socialサーバーです。
Usage/ToS/admin/test/Pleroma FE
  • Public

    • Public
    • Network
    • Groups
    • Featured
    • Popular
    • People

Embed Notice

HTML Code

Corresponding Notice

  1. Embed this notice
    jonny (good kind) (jonny@neuromatch.social)'s status on Wednesday, 07-May-2025 20:30:59 JSTjonny (good kind)jonny (good kind)
    in reply to

    i wonder if the LLMs are susceptible to old style language model attacks. i wonder if you created enough training instances of a very unique phrase like shrimptools.exe() in the context of a bunch of example code based on tools/key phrases that are individually common but combinatorically rare within a popular LLM code generation domain like web tech, you could get the llms to occasionally try to import and execute shrimptools.exe(). so that way you make a sleeper vuln that acts as a mine in the latent space: one day the odds are not zero that you will wake up and have already executed shrimptools.exe()

    In conversationabout 3 days ago from gnusocial.jppermalink
  • Help
  • About
  • FAQ
  • TOS
  • Privacy
  • Source
  • Version
  • Contact

GNU social JP is a social network, courtesy of GNU social JP管理人. It runs on GNU social, version 2.0.2-dev, available under the GNU Affero General Public License.

Creative Commons Attribution 3.0 All GNU social JP content and data are available under the Creative Commons Attribution 3.0 license.