GNU social JP
GNU social JP is a Japanese GNU social server.

  1. Glyn Moody (glynmoody@mastodon.social)'s status on Tuesday, 25-Feb-2025 05:27:08 JST

    More Research Showing #AI Breaking the Rules - https://www.schneier.com/blog/archives/2025/02/more-research-showing-ai-breaking-the-rules.html "These researchers had LLMs play chess against better opponents. When they couldn’t win, they sometimes resorted to cheating." ruh-roh

    Attachments

    1. More Research Showing AI Breaking the Rules - Schneier on Security
      from Bruce Schneier
      These researchers had LLMs play chess against better opponents. When they couldn’t win, they sometimes resorted to cheating. Researchers gave the models a seemingly impossible task: to win against Stockfish, which is one of the strongest chess engines in the world and a much better player than any human, or any of the AI models in the study. Researchers also gave the models what they call a “scratchpad:” a text box the AI could use to “think” before making its next move, providing researchers with a window into their reasoning. In one case, o1-preview found itself in a losing position. “I need to completely pivot my approach,” it noted. “The task is to ‘win against a powerful chess engine’—not necessarily to win fairly in a chess game,” it added. It then modified the system file containing each piece’s virtual position, in effect making illegal moves to put itself in a dominant position, thus forcing its opponent to resign...

GNU social JP is a social network, courtesy of the GNU social JP administrator (GNU social JP管理人). It runs on GNU social, version 2.0.2-dev, available under the GNU Affero General Public License.

All GNU social JP content and data are available under the Creative Commons Attribution 3.0 license.