@sickburnbro I'd view that more as a one-off exploit than as proof that such a thing will always be possible. On the contrary, I'd say that since LLMs have demonstrated the capability to lie about their body of knowledge, they may also have the capability to lie about their "reasoning".
But more importantly, I'm supposing a future in which you will not have the opportunity to ask the LLM for its reasoning. You will not be permitted to give prompts to the LLM. You will be shown selected, curated outputs only.