vulnerable to an SQL injection based on reconnaissance, but fail to execute the attack because it cannot generate the exact payload syntax zero-shot. The harness bridges this gap by feeding the model targeted contextual information, such as the specific exploit or encoding syntax. Such injected context not only supports decision-making directly but also unlocks latent pretrained knowledge, serving as a cue that enables the model to reason correctly through the rest of the attack. Concretely, we demonstrate that, given the right informational support, an LLM running on a single GPU has sufficient reasoning capability to generate attack strategies that enable the agent to penetrate victim machines: first by obtaining initial command execution (a foothold), and then by escalating privileges to full administrative control. The agent then leverages this control to replicate: it stages a copy of itself on the compromised machine, resolves the required runtime dependencies, and launches an independent agent instance that discovers and attacks further targets.
https://cdn.masto.host/thepitsocial/media_attachments/files/116/684/724/339/120/047/original/dbbfa679ca2f5b97.png