if you're looking at the lzma thing and trying to figure out if you should be concerned, and if you can do anything about it:
the answers are definitely yes, and probably not much, respectively
this is one of those 'off the charts' sorts of scenarios, because the impact isn't just the vulnerability itself (a remote ssh backdoor on some systems). it's that it was seemingly inserted intentionally, by one of the maintainers of the library, into a library which exists on every linux distro — in signed commits, with very thorough attempts to obfuscate it, and with what appears to be active efforts to mask side effects when they were noticed.
so even if your system did not fit the criteria that we believe are necessary to trigger that backdoor and/or you have reverted to an older version that didn't have the final piece, you are still running code written by the person who intentionally added that backdoor.
@linear Right, and sadly it's pretty much a core dependency of most distros (like on a BSD it would be in base) so that's going to be a hell of a moment to figure out wtf to do in terms of "now, what?".
@linear it's worse than that. This backdoor showed that it's possible, and it was caught by sheer luck. How many more are there in other projects, added by different people, that we're not aware of?
@wolf480pl @linear Well one of the things could be to push for autotools to be replaced:
- having code indistinguishable from obfuscation means no code review and makes it much harder to patch
- it forces the use of generated tarballs that don't correspond to what's in version control
- autoreconf fails way too frequently at regenerating ./configure files
@wolf480pl @linear Yeah, and having the tools to do so, pkgdiff being one of them. Some packagers always review diffs and are quite expected to do so, but they need to be able to review everything and we're *really bad* at allowing this: a ton of projects just ship binaries in their source tarballs.
@wolf480pl @linear To put comments? Interesting at first, but I'd say it's actually an awful idea, as it could be a strong way of making malware seem legit. Reviewers, especially casual ones like packagers who aren't going to go into careful auditing mode, would end up trusting the comments over the actual code.
I think it would make more sense to verify (say with sandboxing, or a separate repo for the fixtures) that the test fixtures aren't used in production binaries. That could allow having well-known fixtures, much like the PNG ones, that can be carefully reviewed once and not need modification for many years.
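One crude way to sketch that check (this is an illustration, not anything any distro actually ships: the function name and the substring-scan approach are made up here) is to scan the built artifact for verbatim copies of the fixture bytes:

```python
def fixture_leaked_into(binary: bytes, fixtures: dict[str, bytes],
                        min_len: int = 8) -> list[str]:
    """Return names of test fixtures whose bytes appear verbatim in the binary.

    A plain substring scan only catches uncompressed, untransformed copies;
    a real check would also need to look for compressed or chunked embeddings
    (the xz payload was hidden inside a compressed test file, after all).
    min_len skips fixtures too short to be a meaningful signal.
    """
    return [name for name, data in fixtures.items()
            if len(data) >= min_len and data in binary]

# hypothetical usage: compare the production .so against the test fixtures
leaked = fixture_leaked_into(b"\x00" * 64 + b"FIXTUREPAYLOAD",
                             {"bad-3-corrupt.bin": b"FIXTUREPAYLOAD"})
```

This obviously wouldn't have caught the xz backdoor on its own, but it makes the baseline expectation machine-checkable: fixture bytes have no business in a release binary.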
@lanodan @linear also, speaking of not keeping blobs in source code:
If the test files are binary, I think they should be generated at build time, e.g. with nasm or some script, so that it's clear why specific bytes go in specific places.
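A minimal sketch of that idea in Python (the file format here is entirely invented for illustration; the point is only that every byte comes from a named, documented write):

```python
import struct

def write_u8(buf: bytearray, value: int) -> None:
    """Append a single unsigned byte."""
    buf += struct.pack("<B", value)

def write_u32(buf: bytearray, value: int) -> None:
    """Append a little-endian unsigned 32-bit field."""
    buf += struct.pack("<I", value)

def build_fixture() -> bytes:
    """Generate a test fixture at build time instead of committing a blob.

    The 'format' is made up: a magic byte, a version, a declared payload
    length, then a predictable payload. A reviewer can see why each byte
    exists, and the length field is calculated rather than hardcoded.
    """
    buf = bytearray()
    write_u8(buf, 0x7F)           # magic: arbitrary but documented
    write_u8(buf, 1)              # format version
    payload = bytes(range(16))    # nothing-up-my-sleeve: 0x00..0x0F
    write_u32(buf, len(payload))  # declared length, derived from the data
    buf += payload
    return bytes(buf)

fixture = build_fixture()
```

A reviewer diffing the generator script sees intent, not an opaque hexdump — which is exactly what the hidden xz payload exploited.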
@lanodan @linear I was thinking more like explicit field sizes (writeI32, writeU8, etc.), calculated fields, maybe loops, to reduce the overall entropy and hopefully use nothing-up-my-sleeve numbers.
But you're right, there's a high chance magic numbers will appear anyway, and if we accept those, then it's safer to have them as a blob than surrounded with misleading comments / variable names /etc.
Also, I'm not saying the test vectors should change / shouldn't be standardized. You could keep sha256sums of known-good test vectors in the repo, and check the generated ones against the sums.
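That check is a few lines with Python's standard `hashlib` (the function and dict names here are hypothetical, just showing the shape of the idea):

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Hex digest of a generated test vector."""
    return hashlib.sha256(data).hexdigest()

def verify_fixtures(generated: dict[str, bytes],
                    known_good: dict[str, str]) -> list[str]:
    """Return names of generated fixtures whose digest doesn't match
    the known-good sum committed in the repo (missing sums also fail)."""
    return [name for name, data in generated.items()
            if sha256_hex(data) != known_good.get(name)]
```

So the blobs never live in version control, only their sums do, and the generator script plus the sums together pin the standardized vectors (roughly what `sha256sum -c` does for release tarballs).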