I can’t remember if I posted this here before, but in case I didn’t:
LLMs are the new memory-safety bugs.
The reason that memory-safety bugs are so bad is not that they’re common (they are, but they’d still be bad if we fixed 90% of them), it’s that they step outside of the language’s abstract machine. When a memory-safety bug occurs, the program will do something completely unpredictable. You can’t reason at the source level about what will happen: some piece of unrelated state will be modified, or used as input to some calculation.
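A minimal sketch of what I mean, with a made-up struct (nothing here is from a real codebase): an out-of-bounds write into one field silently corrupts an unrelated field that happens to sit next to it in memory. Nothing in the source says these two variables interact, so no amount of source-level reasoning predicts the result.

    #include <stdio.h>
    #include <string.h>

    struct session {
        char name[8];
        int  is_admin;   /* unrelated state, adjacent in memory */
    };

    int main(void) {
        struct session s = { .name = "", .is_admin = 0 };
        /* 11 characters plus the NUL overflow the 8-byte buffer; the
         * extra bytes land in is_admin. The abstract machine calls this
         * undefined behaviour, so *any* outcome is permitted. */
        strcpy(s.name, "AAAAAAAAAAA");
        printf("is_admin = %d\n", s.is_admin);  /* typically nonzero now */
        return 0;
    }

On a typical implementation the flag flips even though no line of code ever assigns to it, which is exactly the “unrelated state gets modified” failure mode.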
This is how LLMs work by design. This is not a bug. They arrange data in an n-dimensional latent space and will give outputs that are nearby in that space, but you have no way of articulating the shape of that space in anything that resembles source code. If you ask a question about a topic, the latent space may contain nearby replies that include knowledge of that topic, or the gaps may have been painted in with something totally unrelated.
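To make the analogy concrete, here is a toy sketch (hypothetical 3-dimensional “embeddings” with hand-picked values, not how any real model is built): the lookup always returns whichever stored reply is nearest to the query, whether or not the query’s topic is actually covered, and nothing in the source tells you where the gaps are.

    #include <math.h>
    #include <stdio.h>

    #define DIM 3
    #define N_ENTRIES 3

    /* Stand-ins for points in the latent space. */
    static const double embeddings[N_ENTRIES][DIM] = {
        {0.9, 0.1, 0.0},   /* reply about C compilers    */
        {0.1, 0.9, 0.0},   /* reply about French cooking */
        {0.0, 0.1, 0.9},   /* reply about tide tables    */
    };
    static const char *replies[N_ENTRIES] = {
        "Use -fsanitize=address to catch it.",
        "Whisk the butter in slowly.",
        "High tide is around 14:30.",
    };

    static double cosine(const double *a, const double *b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < DIM; i++) {
            dot += a[i] * b[i];
            na  += a[i] * a[i];
            nb  += b[i] * b[i];
        }
        return dot / (sqrt(na) * sqrt(nb));
    }

    int main(void) {
        /* A query about a topic none of the entries cover: the lookup
         * still picks the nearest neighbour and answers confidently. */
        double query[DIM] = {0.5, 0.4, 0.3};
        int best = 0;
        double best_sim = -1.0;
        for (int i = 0; i < N_ENTRIES; i++) {
            double s = cosine(query, embeddings[i]);
            if (s > best_sim) { best_sim = s; best = i; }
        }
        printf("nearest reply: %s (similarity %.2f)\n",
               replies[best], best_sim);
        return 0;
    }

The point isn’t the arithmetic, it’s that “nearest in the space” is the only guarantee you get, and you can’t inspect the space to know whether nearest means relevant.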