GNU social JP
GNU social JP is a GNU social server in Japan.
Notices by David Chisnall (*Now with 50% more sarcasm!*) (david_chisnall@infosec.exchange)

  1. Status on Friday, 01-Aug-2025 16:11:17 JST

    Who could possibly have predicted that a tool designed to create output that looks right, but which has no way of understanding if it actually is right, would lead to lots of difficult-to-spot errors?

    In conversation about 2 days ago from infosec.exchange permalink
  2. Status on Friday, 01-Aug-2025 02:38:26 JST

    Wow, the petition to repeal the Online Safety Act is now at over 450K signatures (100K are required for a debate in Parliament). It went quite slowly to 100K, but then jumped rapidly once the act started being enforced.

    In conversation about 2 days ago from infosec.exchange permalink
  3. Status on Wednesday, 30-Jul-2025 16:31:09 JST
    • CHERI Alliance

    We’re starting to upstream CHERI support to LLVM!

    #CHERI #CHERIoT @cheri_alliance

    In conversation about 4 days ago from infosec.exchange permalink
  4. Status on Monday, 28-Jul-2025 12:15:59 JST
    in reply to
    • feliks

    @feliks

    As I’ve said before, the difference between an LLM and a rubber duck is that the duck is smart enough to shut up when it has nothing useful to say.

    In conversation about 6 days ago from infosec.exchange permalink
  5. Status on Sunday, 27-Jul-2025 04:17:10 JST
    in reply to
    • 🅱🅻🆄🅴🅱️
    • ✧✦Catherine✦✧
    • dizzy

    @whitequark @dizzy @BlueBee

    Even without, a lot of things do type-oblivious copies (especially after a modest amount of compiler optimisation). The best thing about LLVM moving to untyped pointers was that it reduced the diff for CHERI. The second best thing is that I never again have to review a paper that claims to have done something involving type safety that trusts the (mostly nonsense) pointee type information in LLVM IR.

    Making type-oblivious copies work was the very first change I made to the CHERI ISA way back in 2012 (the second was making stack spills not require stack spills).
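The kind of type-oblivious copy in question can be illustrated with Python's ctypes (a hypothetical sketch, not CHERI or LLVM code): `memmove` copies raw bytes and has no idea which of those bytes are a pointer.

```python
import ctypes

class Pair(ctypes.Structure):
    """A struct mixing an integer field with a pointer-typed field."""
    _fields_ = [("value", ctypes.c_int), ("next", ctypes.c_void_p)]

src = Pair(42, 0x1000)
dst = Pair()

# memmove is type-oblivious: it copies sizeof(Pair) raw bytes without
# knowing that the 'next' field holds a pointer. On CHERI, a copy like
# this must also preserve capability validity, which is why such copies
# needed explicit support in the ISA.
ctypes.memmove(ctypes.byref(dst), ctypes.byref(src), ctypes.sizeof(Pair))

assert dst.value == 42
assert dst.next == 0x1000
```
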

    In conversation about 7 days ago from gnusocial.jp permalink
  6. Status on Sunday, 27-Jul-2025 04:14:01 JST
    in reply to
    • 🅱🅻🆄🅴🅱️
    • ✧✦Catherine✦✧
    • dizzy

    @whitequark @dizzy @BlueBee

    I haven’t tracked the GC extensions. The big issue with C and GC is that C and C++ encourage data structures where pointers cast to integers can be used as identities. Trees using pointer comparison to identify keys, hash tables using hashes of pointer values as keys, and so on. This is incompatible with any kind of copying GC, because copying means that keys are no longer in the right places in the data structures.
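The pointer-as-identity pattern and why it breaks under a copying collector can be sketched in Python (a hypothetical model, using `id()` as a stand-in for a pointer's address):

```python
import copy

class Node:
    """A heap object whose 'address' (modelled here by id()) is used as
    its identity, as C and C++ code often does."""
    def __init__(self, payload):
        self.payload = payload

# A table keyed by pointer value.
table = {}
key = Node("config")
table[id(key)] = "some cached value"

# While the object stays put, address-keyed lookups work fine.
assert table[id(key)] == "some cached value"

# Model a copying collector moving the object: the contents are
# identical, but the address changed, so the lookup silently fails.
moved = copy.copy(key)
assert id(moved) not in table
```
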

    I had a student implement a copying GC with CHERI and it worked really nicely for simple examples and then failed impressively for more complex ones. My first approach at merging the capability and fat pointer ideas in CHERI exposed only an offset rather than an address, and it turns out that de facto C really hates that. We now expose the address and give up on the ability to do copying GC (my original hope was that the compiler could flag the few places where addresses escaped and needed to be stable, but this turned out to be far too many places).

    If the GC model supports some stable and unique notion of object identity, then it might work. I wrote an Objective-C to Dart compiler many years ago that used a monotonic counter for any object whose address was converted to an integer (including pointer comparison between distinct objects). This added some overhead, but gave a GC’d C and Objective-C environment (Objective-C objects were objects with ivars as Dart objects, C structures were just byte arrays). I never bothered measuring performance because I knew it would be slow, I just wanted to see if it could be done at all.
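The monotonic-counter scheme described above can be sketched as follows (a hypothetical model, not the actual compiler's code; `stable_id` stands in for the pointer-to-integer conversion):

```python
import itertools

_counter = itertools.count(1)

class ManagedObject:
    """A GC'd object. Its address may change when the collector copies
    it, so the address is never exposed as its identity."""
    def __init__(self):
        self._stable_id = None  # assigned lazily, only if identity escapes

def stable_id(obj):
    """Model of casting a pointer to an integer: the first conversion
    assigns the next counter value, and every later conversion returns
    the same value, even if the object has moved in the meantime."""
    if obj._stable_id is None:
        obj._stable_id = next(_counter)
    return obj._stable_id

a, b = ManagedObject(), ManagedObject()
assert stable_id(a) != stable_id(b)   # distinct objects compare unequal
assert stable_id(a) == stable_id(a)   # stable across calls (and moves)
```

Objects whose identity never escapes pay nothing; only pointers that are actually compared or converted get a counter slot.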

    In conversation about 7 days ago from infosec.exchange permalink
  7. Status on Sunday, 27-Jul-2025 04:02:02 JST
    in reply to
    • 🅱🅻🆄🅴🅱️
    • ✧✦Catherine✦✧
    • dizzy

    @whitequark @dizzy @BlueBee

    We tried that in early CHERI systems. Having two kinds of pointer is awful for C. Porting code to work like that was a huge pain, and maintaining it was even harder. Eventually we threw it away and told everyone not to use it because the developer experience was universally terrible.

    For comparison, porting tcpdump to use capabilities for packet buffers (which is step one in being able to separate out the parsers into compartments, but not including that work) was a 1,700 line diff. Making the whole thing memory safe with every pointer a capability was a 3-line diff. And tcpdump is not a large program.

    In conversation about 7 days ago from infosec.exchange permalink
  8. Status on Sunday, 27-Jul-2025 03:52:34 JST
    in reply to
    • 🅱🅻🆄🅴🅱️
    • ✧✦Catherine✦✧
    • dizzy

    @whitequark @dizzy @BlueBee

    The other problem with Java security was that it was a case study in defence in breadth. It depended on 100% correct implementation of everything in Java. The SecurityManager controlled privilege and required some stack traversal to even identify the correct security context. Privileges were identified by strings, so immutability of strings was part of the TCB for sandboxing and if you could create a bit flip in a string you could escape.

    The problem with WebAssembly is that they completely failed to learn the lesson from the last 20 or so years of software compartmentalisation: isolation is easy, (safe) sharing is hard. And so they built a system entirely for isolation. Safe sharing requires memory safety, but they decided to represent pointers as integers. And they did it two years after we’d demonstrated that large C and C++ codebases do not rely on that conflation and that it’s easy to port most things to a memory-safe implementation. And, because it’s in the browser, it’s sucking the air out of any attempt to solve the problem properly.

    In conversation about 7 days ago from infosec.exchange permalink
  9. Status on Saturday, 26-Jul-2025 20:41:57 JST
    in reply to
    • David Gerard
    • Ruth Mottram
    • tante
    • Knud Jahnke

    @knud @tante @Ruth_Mottram

    There are a few causes. Ed and @davidgerard have both written about the need for growth in the tech industry. If your stock price is going up, you can pay 50%+ of employees' salaries by issuing new stock, which means you can pay a lot more than your revenue would suggest.

    The root problem for tech companies is that user requirements grow at human rates. Bill Gates had the ambition to have a computer on every desk. That's a market of a few hundred million people who can afford a PC (possibly more now) and so you can grow from zero to that over a decade or two and have really big growth rates.

    I've written about why this is a problem for cloud companies before, so I'll try to be brief:

    Basically, the requirements for cloud customers grow by maybe 5-20% per year, but the cost of the compute to provide them halves roughly every 12-24 months. This means that you have a rapid growth phase as people move from on-premises to the cloud, but then the amount that people need goes down every year. That's a problem because it means that there's a cliff for cloud growth: as soon as all of the big companies have moved onto the cloud, you start losing revenue. Their cloud requirements will grow more slowly than your costs go down (and you have to reduce your prices in line with those costs or you'll see people move out of the cloud as they realise that they're paying a lot but really only need one rack of machines).
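With illustrative numbers picked from the ranges above (10% annual demand growth, unit cost halving every 18 months, prices forced to track costs — all assumptions for the sketch, not figures from the post), the squeeze looks like this:

```python
# Hypothetical figures: demand grows 10%/year, the cost of a unit of
# compute halves every 18 months, and competition forces the price of a
# unit to track its cost.
demand_growth = 1.10
annual_cost_factor = 0.5 ** (12 / 18)   # ~0.63x per year

spend = 100.0   # a customer's cloud bill in year 0 (arbitrary units)
bills = [spend]
for year in range(5):
    # Bill = units needed * price per unit; both factors move each year.
    spend *= demand_growth * annual_cost_factor
    bills.append(spend)

# Once migration is finished, revenue per customer shrinks roughly 30%
# a year under these assumptions.
```
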

    AI is their solution to this. It is a new use case that requires far more compute than anyone can afford on premises and guarantees growth for the cloud for a while.

    That means that all of the big cloud companies have a huge incentive to tell everyone else that AI is the future. Some of this is things that I'm quite surprised the SEC is okay with, such as investing in startups with no real business model solely so that they will buy your services and let you claim that you have customers.

    At the start of the bubble, over $1T of value in the stock market was predicated on the assumption that cloud sales would increase in a way that they absolutely could not without a Next Big Thing™ to drive compute demand. If they hadn't found something like AI, most of that money would be wiped out. That's a huge incentive for a lot of people to push AI into all of the things.

    In conversation about 8 days ago from infosec.exchange permalink

    Attachments

    1. Domain Details Page
  10. Status on Saturday, 26-Jul-2025 16:41:41 JST
    in reply to
    • jcoglan

    @jcoglan I am too young to remember the phone phreaking attacks that made everyone learn that in-band signalling is a bad idea. I am old enough to remember the ping-of-death attacks that reminded people. LLMs will remind an entirely new generation.

    In conversation about 8 days ago from infosec.exchange permalink
  11. Status on Friday, 25-Jul-2025 23:39:56 JST

    I really hope when Trump arrives in Scotland they ask to see his phone and all of his social media posts and then deny him entry because he's clearly a national security risk based on his random threats to invade other countries.

    In conversation about 9 days ago from infosec.exchange permalink
  12. Status on Friday, 25-Jul-2025 16:38:02 JST
    in reply to
    • ✧✦Catherine✦✧
    • CyberFrog

    @whitequark @froge

    Well, that’s horrifying. There are very few crypto implementations I’d trust, and one advertising itself as ‘pure {high-level language}’ would be immediately discarded unless that was bullet 3 with ‘formally verified’ as bullet 1. It reminds me of late-’90s Java, where ‘Pure Java’ was the reason for a lot of poor-quality reimplementations of heavily tested libraries.

    In conversation about 9 days ago from gnusocial.jp permalink
  13. Status on Friday, 25-Jul-2025 16:06:32 JST
    in reply to
    • CyberFrog

    @froge

    For someone familiar with the Rust ecosystem: is this a thing anyone is likely to actually use? I can’t imagine a stand-alone RSA implementation being a not-a-toy thing in other languages.

    In conversation about 9 days ago from infosec.exchange permalink
  14. Status on Friday, 25-Jul-2025 03:55:57 JST
    • cliffle
    • ✧✦Catherine✦✧

    @cliffle @whitequark

    It’s a problem in part from trying to be a multi-paradigm language, and partly simply from being old. The first ISO C++ standard was in 1998 (the language is over a decade older) and a lot has changed about how we think about large-scale software engineering (and the definition of large scale has grown by a couple of orders of magnitude) since then. We’ve learned a lot about type system design. Modern hardware is very different from the computers in 1985 for which C++ was originally designed, and quite a lot different from the 1998 targets of ISO C++.

    In the same time, Rust has gone from not existing (for most of that time), to being a garbage-collected language, to being a language built around a single-ownership model. When Rust is 40 years old, I expect it will be barely recognisable compared to today, and there will be lots of Rust developers complaining about people not using the exciting Rust 3.0 features and others complaining about people writing code that doesn’t build with Rust 1.9 (which they will refer to as ‘the last good version’).

    Rust 1.0 is now ten years old. C++ reached that age in 1995. And 1995’s C++ was a pretty good language for the computers and problems of 1995.

    In conversation about 9 days ago from infosec.exchange permalink
  15. Status on Wednesday, 23-Jul-2025 04:10:54 JST
    in reply to
    • ✧✦Catherine✦✧
    • Matt Campbell
    • argv minus one

    @argv_minus_one @whitequark @matt

    Aside from primitives, every object is allocated from the GC heap, but the degree to which this implies suffering is often overstated. Most objects that would have been stack allocations never escape the young generation, so GC adds very little overhead for them. Stack and GC allocations cost the same amount; only the deallocation costs more in the GC heap. In some implementations, the compiler can do escape analysis and either move non-escaping allocations to the stack or mark them in GC as trivially dead and not needing scanning.
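A toy model of the cost accounting (hypothetical and heavily simplified; real collectors also scan roots and maintain write barriers): in a bump-pointer young generation, allocation is a pointer increment and a minor collection touches only survivors.

```python
class Nursery:
    """Model of a bump-pointer young generation. Allocation is a pointer
    increment (the same cost as a stack allocation); a minor collection
    copies only the survivors, so objects that die young are never
    looked at again."""
    def __init__(self, size):
        self.size = size
        self.top = 0
        self.objects_copied = 0

    def alloc(self, nbytes):
        if self.top + nbytes > self.size:
            raise MemoryError("nursery full: minor GC needed")
        addr = self.top
        self.top += nbytes          # the entire cost of allocation
        return addr

    def minor_collect(self, live):
        # Copy survivors out; dead objects cost nothing to reclaim.
        self.objects_copied += len(live)
        self.top = 0                # reclaim the whole nursery in one step

nursery = Nursery(size=1 << 20)
addrs = [nursery.alloc(64) for _ in range(1000)]
nursery.minor_collect(live=addrs[:3])   # only 3 of 1000 objects escaped
assert nursery.objects_copied == 3
```
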

    In conversation about 11 days ago from infosec.exchange permalink
  16. Status on Tuesday, 22-Jul-2025 01:15:38 JST
    in reply to
    • ✧✦Catherine✦✧
    • tigerhiddenadam

    @whitequark @tigerhiddenadam

    If we can, the MPact simulator would be the best one to run in WAsm because it's much faster than the Sail one, and it also has an integrated debugger, so is more interesting for a live development environment.

    In conversation about 13 days ago from gnusocial.jp permalink
  17. Status on Tuesday, 22-Jul-2025 00:28:03 JST

    The most annoying thing about our industry is that we go through cycles where we start with a really good idea, but it isn't quite feasible on current computers. Then someone comes along with a less-good version that's actually possible to ship in the mass market. Then an entire generation of programmers grows up with the simplified version, running on computers a thousand times more powerful than the originals, and at least a hundred times more powerful than the richer system required. But they have never seen the more complex version and believe that limitations inherited from the simplified version are intrinsic to the problem being solved. And then they come up with something layered on top that repeats the cycle.

    In conversation about 13 days ago from infosec.exchange permalink
  18. Status on Tuesday, 22-Jul-2025 00:28:01 JST
    in reply to
    • ✧✦Catherine✦✧
    • Matt Campbell

    @whitequark @matt

    In 2005, I saw a talk by Alan Kay. His big reveal in the middle was that the talk was using Squeak (bytecode-interpreted Smalltalk) and not PowerPoint for his slides. He showed this off by dropping into an inspector and doing some live coding in the middle.

    But the slide a couple before that reveal had contained full-motion video (which was still pretty rare in slides back then). The video had been MPEG-1 (so not the latest CODEC: MPEG-2 was feasible to decode on the CPU then, MPEG-4 non-AVC was with an optimised implementation). The CODEC was, itself, written in Smalltalk.

    Computers are ludicrously fast now. Even the 'slow' Java implementations from the late '90s were an order of magnitude faster than CPython and not that slow on modern hardware. A modern optimising JIT gains you another order of magnitude or so.

    CHERI's capability model is not quite the shape of hardware capability systems from the '60s (different things got faster at different rates, now compute is almost free but non-local memory accesses are very expensive, whereas the converse was true back then), but the entire field was discarded for 20-30 years because RISC showed that you could make simpler computers fast and do things in software on a fast-and-simple core that outperformed doing them in a more complex implementation. Right up until you start to build complex out-of-order pipelines, at which point you realise that you have a lot of fixed overhead per instruction and doing more work per instruction is where the big performance wins come from.

    In conversation about 13 days ago from infosec.exchange permalink
  19. Status on Monday, 21-Jul-2025 22:46:50 JST

    Huh, so apparently something I thought was obvious about parsing and lexing was not. Perhaps because most of the languages I've worked on have had at least some context-specific keywords, whereas most toy languages and languages that were designed without later aggregating features do not have this property.

    I always build front ends the opposite way around to how lex / yacc work. In this model (which I think of as 'push'), the lexer drives the parser. It identifies a token, then tells the parser 'I have a token of kind X, please handle it'. This works really badly for languages with context-dependent keywords. For example, in Objective-C, the token atomic may be a keyword if it's in a declared-property declaration or an identifier if it's anywhere else (including in some places in a declared-property declaration). The lexer doesn't know which it is, so you need to either:

    • Have the lexer always treat atomic as an identifier and then do some re-lexing in the parser to say 'ah, you have an identifier, but it's this specific identifier, so it's actually a keyword'.
    • Replace everything else that uses an identifier with 'identifier or one of these things that are keywords elsewhere'.

    The thing you want is to have (at least) two notions of an identifier (any identifier, or identifier-but-not-that-kind-of-identifier) in the lexer, but the lexer can't do this because lexing must be unambiguous in the push model.

    In the pull model, the parser is in charge. It asks the lexer for the next token, and may ask it for a token of a specific kind, or a specific set of kinds. The parser knows the set of things that may happen next. If you're somewhere that has context-specific keywords, ask the lexer for them first, and if it doesn't have one ask it for an identifier. Now you have explicit precedence in the parser that disambiguates things in the lexer and avoids introducing complexity in the token definitions. You may also have simpler regexes in the lexer, because now you can specialise for the set of valid tokens at a specific point. If you know you need a comma or a close-parenthesis after you've parsed a function argument, you can ask for precisely that set of valid tokens, which compiles down to under five instructions on most architectures, rather than the full state machine that can parse any token.
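The pull model can be sketched as follows (hypothetical names; the token set is cut down to the declared-property example, where `atomic` matches both the keyword pattern and the general identifier pattern and the parser's request order disambiguates):

```python
import re

# Token kinds the parser may request. The lexer never has to decide
# whether 'atomic' is a keyword: the parser states its preference.
TOKEN_PATTERNS = {
    "kw_atomic": re.compile(r"atomic\b"),
    "identifier": re.compile(r"[A-Za-z_]\w*"),
    "comma": re.compile(r","),
}

class PullLexer:
    def __init__(self, text):
        self.text, self.pos = text, 0

    def next_token(self, *kinds):
        """Return (kind, text) for the first requested kind that matches
        at the current position, or None. Earlier kinds take precedence,
        so context-dependent keywords are tried before identifiers."""
        while self.pos < len(self.text) and self.text[self.pos].isspace():
            self.pos += 1
        for kind in kinds:
            m = TOKEN_PATTERNS[kind].match(self.text, self.pos)
            if m:
                self.pos = m.end()
                return kind, m.group()
        return None

def parse_property_attributes(lexer):
    """In a declared-property attribute list, 'atomic' is a keyword;
    any other word is an ordinary identifier."""
    kinds = []
    while True:
        tok = lexer.next_token("kw_atomic", "identifier")
        if tok is None:
            return kinds
        kinds.append(tok[0])
        if lexer.next_token("comma") is None:
            return kinds

assert parse_property_attributes(PullLexer("atomic, getter")) == \
    ["kw_atomic", "identifier"]
```

Anywhere outside an attribute list, the parser would simply never request `kw_atomic`, so `atomic` lexes as a plain identifier with no special cases in the token definitions.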

    Even without any performance benefits, it's just a much nicer way of writing a parser. Yet the other way around seems to still be taught and explained as if it's a sensible thing to do.

    In conversation about 13 days ago from infosec.exchange permalink

  20. Status on Monday, 21-Jul-2025 21:21:46 JST
    in reply to
    • ✧✦Catherine✦✧
    • tigerhiddenadam

    @whitequark

    @tigerhiddenadam has been trying to get one of the CHERIoT simulators working in a browser so that we can have a complete demo environment working on the web site…

    In conversation about 13 days ago from infosec.exchange permalink

    David Chisnall (*Now with 50% more sarcasm!*)

    I am Director of System Architecture at SCI Semiconductor and a Visiting Researcher at the University of Cambridge Computer Laboratory. I remain actively involved in the #CHERI project, where I led the early language / compiler strand of the research, and am the maintainer of the #CHERIoT Platform. I was on the FreeBSD Core Team for two terms, have been an LLVM developer since 2008, am the author of the GNUstep Objective-C runtime (libobjc2 and associated clang support), and am responsible for libcxxrt and the BSD-licensed device tree compiler.

    Opinions expressed by me are not necessarily opinions. In all probability they are random ramblings and should be ignored. Failure to ignore may result in severe boredom and / or confusion. Shake well before opening. Keep refrigerated.

    Warning: May contain greater than the recommended daily allowance of sarcasm.

    No license, implied or explicit, is granted to use any of my posts for training AI models.


    Following: 0
    Followers: 0
    Groups: 0

    Statistics

    User ID: 241214
    Member since: 8 Feb 2024
    Notices: 217
    Daily average: 0

GNU social JP is a social network, courtesy of GNU social JP管理人. It runs on GNU social, version 2.0.2-dev, available under the GNU Affero General Public License.

All GNU social JP content and data are available under the Creative Commons Attribution 3.0 license.