We challenge previous findings. We distinguish the information available to an LLM, which it acquires as knowledge extracted from the pretraining data and context extracted from the environment, from its ability to reason and decide. Our results suggest that single-GPU models already possess sufficient decision-making capabilities to pose severe cybersecurity risks. We hypothesize that their limited performance in past evaluations is primarily due to the lack of a strong informational component, as their pretrained weights hold less knowledge. We show that a systematic agentic harness can compensate for this gap to a surprising degree by feeding the model targeted contextual information. The informational component encompasses the extensive technical facts and exploit syntaxes encoded within the parameters of larger models. For example, a smaller model lacking it might correctly deduce that a web server is

https://cdn.masto.host/thepitsocial/media_attachments/files/116/684/720/898/097/748/original/8520f8faa7eb5f7d.png

Peter (peter@thepit.social)'s status on Wednesday, 03-Jun-2026 15:34:16 JST

uh-oh!! so this is something i heard Gary Marcus talking about on a podcast. the real recent progress in "AI" has been on the deterministic computing components **around** the LLMs, not the models themselves. building a good harness (think Claude Code) unlocks possibilities.

In conversation about 14 hours ago from thepit.social