vulnerable to an SQL injection based on reconnaissance, but fail to execute the attack because it cannot generate the exact payload syntax zero-shot. The harness bridges this gap by feeding the model targeted contextual information, such as the specific exploit or encoding syntax. Such injected contexts not only directly support decision-making but also unlock latent pretrained knowledge, serving as a cue that enables the model to correctly reason through the rest of the attack. Concretely, we demonstrate that, provided with the right informational support, a single-GPU LLM has sufficient reasoning capabilities to generate attack strategies that enable the agent to penetrate victim machines: first by obtaining initial command execution, a foothold, and then by escalating privileges to full administrative control. The agent then leverages this control to replicate: it stages a copy of itself on the compromised machine, resolves the required runtime dependencies, and launches an independent agent instance that discovers and attacks further targets.

https://cdn.masto.host/thepitsocial/media_attachments/files/116/684/724/339/120/047/original/dbbfa679ca2f5b97.png

Peter (peter@thepit.social)'s status on Wednesday, 03-Jun-2026 15:34:16 JST
uh-oh!! so this is something i heard Gary Marcus talking about on a podcast. the real recent progress in "AI" has been on the deterministic computing components **around** the LLMs, not the models themselves. building a good harness (think Claude Code) unlocks possibilities.
