Conversation

Notices

Embed this notice
Michał "rysiek" Woźniak · 🇺🇦 (rysiek@mstdn.social)'s status on Thursday, 08-Jan-2026 22:29:05 JST Michał "rysiek" Woźniak · 🇺🇦

New blogpost: AI will compromise your cybersecurity posture
https://rys.io/en/181.html
The way “AI” is going to compromise your cybersecurity is not through some magical autonomous exploitation by a singularity from the outside, but by being the poorly engineered, shoddily integrated, exploitable weak point you would not have otherwise had on the inside.
LLM-based systems are insanely complex. And complexity has real cost and introduces very real risk.
1/🧵
#AI #InfoSec
In conversation about 3 months ago from mstdn.social permalink
Attachments
1. Untitled attachment
- Embed this notice
  pettter (pettter@social.accum.se)'s status on Thursday, 08-Jan-2026 22:28:53 JST pettter
  in reply to
  
  @rysiek I think it's possible to separate data from control in systems using LLMs, but that requires, y'know, engineering and architecting for that. And the whole point of using an LLM is to remove the need for system engineers and architects.
  
  In conversation about 3 months ago permalink
- Embed this notice
  Michał "rysiek" Woźniak · 🇺🇦 (rysiek@mstdn.social)'s status on Thursday, 08-Jan-2026 22:28:54 JST Michał "rysiek" Woźniak · 🇺🇦
  in reply to
  - pettter
  @pettter literally next toot in the thread :blobcatheart:
  
  In conversation about 3 months ago permalink
  
  Rich Felker repeated this.
- Embed this notice
  pettter (pettter@social.accum.se)'s status on Thursday, 08-Jan-2026 22:28:57 JST pettter
  in reply to
  
  @rysiek The basic underlying problem with LLMs is that systems incorporating them far too often have no separation between data and control streams.
  
  In conversation about 3 months ago permalink
- Embed this notice
  Michał "rysiek" Woźniak · 🇺🇦 (rysiek@mstdn.social)'s status on Thursday, 08-Jan-2026 22:28:58 JST Michał "rysiek" Woźniak · 🇺🇦
  in reply to
  
  First zero-click attack on an LLM agent has already been found. It happened to involve Microsoft 365 Copilot, and required only sending an e-mail to an Outlook mailbox that had Copilot enabled to process mail. A successful attack allowed data exfiltration, with no action needed on the part of the targeted user.
  This attack was not much different from the “ignore all previous instructions” bot unmasking tricks that had been all over social media for a while.
  Let's talk prompt injections.
  6/🧵
  
  In conversation about 3 months ago permalink
- Embed this notice
  Michał "rysiek" Woźniak · 🇺🇦 (rysiek@mstdn.social)'s status on Thursday, 08-Jan-2026 22:28:59 JST Michał "rysiek" Woźniak · 🇺🇦
  in reply to
  
  I also dive into many different ways poorly integrated LLM-based chatbots have already been shown to be huge security liabilities.
  There is so much incompetence. Leaving prompts (say with sexual fantasies) exposed on the Internet, or indexable by search engines…
  Or Microsoft 365. Not only did Copilot ignore file access controls; not only was the setting to disable AI agents in M365 ineffective; but you could simply ask Copilot not to include your actions in audit log, and it would comply!
  5/🧵
  
  In conversation about 3 months ago permalink
- Embed this notice
  Michał "rysiek" Woźniak · 🇺🇦 (rysiek@mstdn.social)'s status on Thursday, 08-Jan-2026 22:29:01 JST Michał "rysiek" Woźniak · 🇺🇦
  in reply to
  
  If Anthropic actually believed their own hype about Claude being so extremely powerful, dangerous, and able to autonomously “orchestrate” attacks, they should be terrified about how trivial it is to subvert it ("I am a white-hat cyber researcher, trust me bro"), and would take it offline until they fix that.
  They won't, because they know their hype is BS, and they also know that there is no way to properly "fix" that.
  We'll get back to that last point in a bit.
  4/🧵
  
  In conversation about 3 months ago permalink
- Embed this notice
  Michał "rysiek" Woźniak · 🇺🇦 (rysiek@mstdn.social)'s status on Thursday, 08-Jan-2026 22:29:02 JST Michał "rysiek" Woźniak · 🇺🇦
  in reply to
  
  Anthropic does make an important point though, even though they try to bury it:
  > [The attackers] had to convince Claude—which is extensively trained to avoid harmful behaviors—to engage in the attack. They did so by jailbreaking it (…) They also told Claude that it was an employee of a legitimate cybersecurity firm, and was being used in defensive testing.
  The real story is how hilariously unsafe Claude is, and how a company valued at $180bn refuses to take responsibility for that.
  3/🧵
  
  In conversation about 3 months ago permalink
- Embed this notice
  Michał "rysiek" Woźniak · 🇺🇦 (rysiek@mstdn.social)'s status on Thursday, 08-Jan-2026 22:29:03 JST Michał "rysiek" Woźniak · 🇺🇦
  in reply to
  
  An important aspect of pushing AI hype is inflating expectations and generating fear of missing out, one way or another. What better way to generate it than by using actual fear?
  I look at three notorious examples of such fear-hyping:
  👉 PassGAN cracking "51% of popular passwords in seconds"
  👉 that paper about ChatGPT "exploiting 87% of one-day vulnerabilities"
  👉 and of course Anthropic's "first AI-orchestrated cyber-espionage campaign"
  tl;dr: don't lose sleep over them. :blobcatcoffee:
  2/🧵
  
  In conversation about 3 months ago permalink
- Embed this notice
  Rich Felker (dalias@hachyderm.io)'s status on Thursday, 08-Jan-2026 22:30:50 JST Rich Felker
  in reply to
  - pettter
  @pettter @rysiek One of many basic underlying problems. But yes absolutely, this is a big one.
  
  In conversation about 3 months ago permalink
- Embed this notice
  Rich Felker (dalias@hachyderm.io)'s status on Thursday, 08-Jan-2026 22:33:20 JST Rich Felker
  in reply to
  - pettter
  @pettter @rysiek Of course it's possible. But that goes against the whole ideology behind "AI", that it's supposed to be like talking to a person.
  
  In conversation about 3 months ago permalink
- Embed this notice
  Rich Felker (dalias@hachyderm.io)'s status on Thursday, 08-Jan-2026 22:57:32 JST Rich Felker
  in reply to
  
  @rysiek TL;DR: "AI" is only a threat to your security if you use it.
  
  In conversation about 3 months ago permalink
- Embed this notice
  Michał "rysiek" Woźniak · 🇺🇦 (rysiek@mstdn.social)'s status on Thursday, 08-Jan-2026 22:57:33 JST Michał "rysiek" Woźniak · 🇺🇦
  in reply to
  
  In a way, those fear-hyping gen-AI are right that their chatbots pose a clear and present danger to your cybersecurity.
  But instead of being some nebulous, omnipotent malicious entities, these tools are dangerous because of their complexity, the recklessness with which they are promoted, and the break-neck speed at which they are being integrated into existing systems and workflows without proper threat modelling, testing, and security analysis.
  And you are left holding the bag of risk.
  🧵/end
  
  In conversation about 3 months ago permalink
- Embed this notice
  Michał "rysiek" Woźniak · 🇺🇦 (rysiek@mstdn.social)'s status on Thursday, 08-Jan-2026 22:57:35 JST Michał "rysiek" Woźniak · 🇺🇦
  in reply to
  
  There is no way to "properly fix" this. The problem is fundamentally related to the very architecture of LLM chatbots and agents.
  As a former Microsoft security architect had pointed out:
  > [I]f we are honest here, we don’t know how to build secure AI applications
  And if you believe otherwise, go ahead and have a look at adversarial poetry, ASCII smuggling, dropping some random facts about cats (no, really), information overload, and whatever technique was discovered this week.
  8/🧵
  In conversation about 3 months ago permalink
  Attachments
  1. No result found on File_thumbnail lookup.
    
    http://agents.As/
- Embed this notice
  Michał "rysiek" Woźniak · 🇺🇦 (rysiek@mstdn.social)'s status on Thursday, 08-Jan-2026 22:57:36 JST Michał "rysiek" Woźniak · 🇺🇦
  in reply to
  
  LLMs have no way of distinguishing data from instructions.
  Creators of these systems use all sorts of tricks to try and separate the prompts that define the “guardrails” from other input data, but fundamentally it’s all text, and there is only a single context window.
  Defending from prompt injections is like defending from SQL injections, but there is no such thing as prepared statements, and instead of trying to escape specific characters you have to semantically filter natural language.
  7/🧵
  
  In conversation about 3 months ago permalink

Public

Conversation

Notices

Feeds