@patrickcmiller They left out the authorisation model. All #AI systems I have seen have a binary authorisation model: an entity is either allowed to run inference against the model, or it is not. Contrast that with relational databases, where you can have access to some tables and not others, and can even get row-level and column-level access controls. Just because you can query the database doesn’t mean the whole of the dataset is available to you: data that matches your query may be missing from your response because you don’t have access to those items.
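A minimal sketch of the contrast, in Python. The salaries table, roles, and filters here are made up for illustration; SQLite has no real row-level security (engines like PostgreSQL enforce it server-side), so the row filter is emulated in the query itself:

```python
import sqlite3

# Hypothetical per-role row filters, emulating row-level security.
ROW_FILTERS = {
    "hr":      "1 = 1",                   # HR sees every row
    "manager": "department = 'widgets'",  # managers see only their department
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salaries (name TEXT, department TEXT, amount INT)")
conn.executemany("INSERT INTO salaries VALUES (?, ?, ?)", [
    ("alice", "widgets", 90000),
    ("bob",   "gadgets", 95000),
])

def query_salaries(role: str):
    # Same query, different visible rows: matching data is silently
    # withheld from roles that lack access to it.
    predicate = ROW_FILTERS[role]
    return conn.execute(
        f"SELECT name, amount FROM salaries WHERE {predicate}"
    ).fetchall()

print(query_salaries("hr"))       # [('alice', 90000), ('bob', 95000)]
print(query_salaries("manager"))  # [('alice', 90000)], bob's row is withheld

def llm_inference(role: str, prompt: str) -> str:
    # Contrast: the only check an LLM service can make is binary.
    if role not in {"hr", "manager"}:
        raise PermissionError("not allowed to run inference")
    # Past this point, every fact baked into the trained weights is
    # reachable; there is no per-row or per-column gate inside the model.
    return "completion..."
```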
With an #LLM the entire trained model is available for inference. To put it in #RBAC terms, every distinct role with access to a distinct subset of the data would need its own model, trained only on the data that role is allowed to access.
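A sketch of what taking that RBAC mapping literally implies; the model names and roles are hypothetical, not a real API:

```python
# One model per role, each trained only on that role's permitted data.
ROLE_TO_MODEL = {
    "hr":       "payroll-model-hr",       # trained on HR-visible records only
    "manager":  "payroll-model-widgets",  # trained on one department's data
    "engineer": "payroll-model-public",   # trained on public data only
}

def model_for(role: str) -> str:
    try:
        return ROLE_TO_MODEL[role]
    except KeyError:
        raise PermissionError(f"no model trained for role {role!r}")

# Every new role, and every change to a role's data entitlements, forces
# a full retraining run, which is why nobody operates this way.
print(model_for("hr"))
```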
In practice no one does that. So models either include too much data, risking exposure to unauthorised users, or they omit useful data from training because the operators don’t want the risk. Middle ground solutions are rare and difficult; a rough sketch of one follows.
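One commonly attempted middle ground (a hedged sketch, not a claim that it’s easy): keep sensitive records out of the trained weights and enforce the existing ACLs at retrieval time, passing the model only documents the caller may already read. All names here are hypothetical:

```python
# Hypothetical per-document ACLs, checked outside the model.
DOCUMENT_ACLS = {
    "q3-payroll.txt": {"hr"},
    "public-faq.txt": {"hr", "manager", "engineer"},
}

def retrieve(role: str, candidate_docs: list[str]) -> list[str]:
    # The authorisation decision happens here, before inference.
    return [d for d in candidate_docs if role in DOCUMENT_ACLS.get(d, set())]

def answer(role: str, question: str) -> str:
    context = retrieve(role, list(DOCUMENT_ACLS))
    # A base model trained only on shareable data sees just this context.
    return f"prompt: {question!r} with context {context}"

print(answer("engineer", "what did we pay in Q3?"))
# context contains only 'public-faq.txt'; the payroll doc never reaches the model
```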