@patrickcmiller They left out the authorisation model. All #AI systems I have seen have a binary authorisation model: an entity is either allowed to run inference against the model, or it is not. Contrast that with relational databases, where you can have access to some tables and not others, and can even get row-level and column-level access controls. Just because you can query the database doesn’t mean the whole of the dataset is available to you: data that matches your query may be missing from your response because you don’t have access to those items.
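A minimal sketch of the contrast, in Python. The salaries table, roles, and filters here are made up for illustration; SQLite has no real row-level security (engines like PostgreSQL enforce it server-side), so the row filter is emulated in the query itself:

```python
import sqlite3

# Hypothetical per-role row filters, emulating row-level security.
ROW_FILTERS = {
    "hr":      "1 = 1",                   # HR sees every row
    "manager": "department = 'widgets'",  # managers see only their department
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salaries (name TEXT, department TEXT, amount INT)")
conn.executemany("INSERT INTO salaries VALUES (?, ?, ?)", [
    ("alice", "widgets", 90000),
    ("bob",   "gadgets", 95000),
])

def query_salaries(role: str):
    # Same query, different visible rows: matching data is silently
    # withheld from roles that lack access to it.
    predicate = ROW_FILTERS[role]
    return conn.execute(
        f"SELECT name, amount FROM salaries WHERE {predicate}"
    ).fetchall()

print(query_salaries("hr"))       # [('alice', 90000), ('bob', 95000)]
print(query_salaries("manager"))  # [('alice', 90000)], bob's row is withheld

def llm_inference(role: str, prompt: str) -> str:
    # Contrast: the only check an LLM service can make is binary.
    if role not in {"hr", "manager"}:
        raise PermissionError("not allowed to run inference")
    # Past this point, every fact baked into the trained weights is
    # reachable; there is no per-row or per-column gate inside the model.
    return "completion..."
```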
With an #LLM the entire trained model is available for inference. To put it in #RBAC terms, every distinct role with access to a distinct subset of the data would need its own model, trained only on the data that role is allowed to access.
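A sketch of what taking that RBAC mapping literally implies; the model names and roles are hypothetical, not a real API:

```python
# One model per role, each trained only on that role's permitted data.
ROLE_TO_MODEL = {
    "hr":       "payroll-model-hr",       # trained on HR-visible records only
    "manager":  "payroll-model-widgets",  # trained on one department's data
    "engineer": "payroll-model-public",   # trained on public data only
}

def model_for(role: str) -> str:
    try:
        return ROLE_TO_MODEL[role]
    except KeyError:
        raise PermissionError(f"no model trained for role {role!r}")

# Every new role, and every change to a role's data entitlements, forces
# a full retraining run, which is why nobody operates this way.
print(model_for("hr"))
```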
In practice no one does that. So models either include too much data, risking exposure to unauthorised users, or they omit useful data from training because the operators don’t want the risk. Middle ground solutions are rare and difficult; a rough sketch of one follows.
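One commonly attempted middle ground (a hedged sketch, not a claim that it’s easy): keep sensitive records out of the trained weights and enforce the existing ACLs at retrieval time, passing the model only documents the caller may already read. All names here are hypothetical:

```python
# Hypothetical per-document ACLs, checked outside the model.
DOCUMENT_ACLS = {
    "q3-payroll.txt": {"hr"},
    "public-faq.txt": {"hr", "manager", "engineer"},
}

def retrieve(role: str, candidate_docs: list[str]) -> list[str]:
    # The authorisation decision happens here, before inference.
    return [d for d in candidate_docs if role in DOCUMENT_ACLS.get(d, set())]

def answer(role: str, question: str) -> str:
    context = retrieve(role, list(DOCUMENT_ACLS))
    # A base model trained only on shareable data sees just this context.
    return f"prompt: {question!r} with context {context}"

print(answer("engineer", "what did we pay in Q3?"))
# context contains only 'public-faq.txt'; the payroll doc never reaches the model
```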