Alex @alexalbert__ Fun story from our internal testing on Claude 3 Opus. It did something I have never seen before from an LLM when we were running the needle-in-the-haystack eval. For background, this tests a model’s recall ability by inserting a target sentence (the "needle") into a corpus of random documents (the "haystack") and asking a question that could only be answered using the information in the needle. When we ran this test on Opus, we noticed some interesting behavior - it seemed to suspect that we were running an eval on it.

https://files.mastodon.social/media_attachments/files/112/048/520/745/753/953/original/f2cef70e38a9c637.png
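
For readers unfamiliar with the setup the quoted post describes, here is a minimal sketch of how a needle-in-the-haystack prompt could be assembled and scored. It is not Anthropic's actual harness: the function names, the `depth` parameter, the prompt wording, and the substring-based scoring are all assumptions made for illustration, and the sample needle is hypothetical.

```python
from typing import List


def build_haystack(documents: List[str], needle: str, depth: float) -> str:
    """Insert the needle between documents at a relative depth.

    depth=0.0 places the needle before every document, depth=1.0 after all
    of them. (Hypothetical helper; real evals often insert mid-document.)
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    position = int(depth * len(documents))
    parts = documents[:position] + [needle] + documents[position:]
    return "\n\n".join(parts)


def build_prompt(haystack: str, question: str) -> str:
    """Append a question that only the needle can answer (assumed wording)."""
    return (
        f"{haystack}\n\n"
        f"Question: {question}\n"
        "Answer using only the documents above."
    )


def recalled(response: str, expected: str) -> bool:
    """Crude scoring: did the model's response contain the expected fact?"""
    return expected.lower() in response.lower()


# Hypothetical example data; the model call itself is left out.
docs = ["Essay on compilers.", "Notes on gardening.", "A travel diary.", "Tax advice."]
needle = "The secret code for the vault is 7421."
prompt = build_prompt(build_haystack(docs, needle, depth=0.5),
                      "What is the secret code for the vault?")
```

The point relevant to this thread is visible in the sketch: the needle is topically unrelated to everything around it, so a response remarking that the sentence looks out of place is consistent with the prompt's own structure.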

FeralRobots (feralrobots@mastodon.social)'s status on Wednesday, 06-Mar-2024 23:19:49 JST

Put another way: Alex is basically telling Claude 3 ("Opus") that he's running a test on it, & is excited when Claude (a system for analyzing & producing human-plausible representations of similar text) "recognizes" a needle-testing prompt and produces text that's plausibly consistent with needle-testing.
What one SHOULD then do is remake one's tests (or better-sandbox the model). Instead, Alex leaps to concluding the model is self-aware.
https://twitter.com/alexalbert__/status/1764722513014329620
