If there were another binary backdoor similar to the xz attack that was found today... how would you find it?
(The xz attack was found by chance, through trivial issues that caused performance degradation)
@dfeldman nice tabletop exercise
additionally: going forward, now that everyone knows this one worked (briefly), how common should we expect attacks of this nature to be? what portion of them should we expect to detect, and at what stage in their lifecycles?
@dfeldman One measure we should introduce: distro package maintainers should fetch both git and release tarballs to compare before accepting a new version. Release tarballs not matching actually-reviewed project history are a huge gratuitous threat vector.
@dfeldman All (?) distributions build everything from source. That's what a (legitimate) distribution is.
Moreover, reproducibility has nothing to do with my proposal. You're not testing that your binary matches somebody else's. You're testing that the source release tarball actually came from the vetted project history.
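A minimal sketch of that check, comparing a release tarball against the tagged git history (the project name, tag, and tarball layout are illustrative; here the tarball is generated honestly from the tag so the demo matches):

```shell
# Sketch: verify that a release tarball is exactly the vetted git history.
# "proj" and "v1.0" are stand-in names, not from the thread.
set -eu
tmp=$(mktemp -d); cd "$tmp"

# Stand-in upstream repo with a tagged release.
git init -q proj && cd proj
echo 'int main(void){return 0;}' > main.c
git add main.c
git -c user.email=a@b -c user.name=a commit -q -m 'release 1.0'
git tag v1.0
cd ..

# The "published" tarball -- honest in this demo, generated from the tag.
git -C proj archive --format=tar.gz --prefix=proj-1.0/ v1.0 > proj-1.0.tar.gz

# Maintainer-side check: export the tag, unpack the tarball, and diff.
mkdir from-git from-tarball
git -C proj archive --prefix=proj-1.0/ v1.0 | tar -x -C from-git
tar -xzf proj-1.0.tar.gz -C from-tarball
if diff -r from-git/proj-1.0 from-tarball/proj-1.0; then
    echo 'MATCH: tarball is exactly the reviewed history'
else
    echo 'MISMATCH: undocumented changes in release tarball' >&2
fi
```

Any file the diff flags is "history" that never went through review -- exactly how the xz payload shipped.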
@dalias while that would make sense, most distributions cannot build everything from source
and even if they could, builds are usually not reproducible -- there will be small differences anyway that will cause false positives
@dalias as one of the primary maintainers of the ~260 X.Org packages, I think our existing plan to port them to meson so the tarball contents are the git checkout without generated files is going to be easier than modifying our processes to check in the autoconf output (which we'd probably want to do in the CI pipeline so it doesn't keep changing depending on the distro/version used by each contributor).
@alanc At least if they don't want autoconf output in the main commit timeline, it should be added as a final branched commit before release tag so the release tarball matches the repo rather than having undocumented changes in it.
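One way that could look with plain git (branch and tag names, the stand-in generator, and the quoted tool versions are all illustrative; a real project would run autoreconf here):

```shell
# Sketch: carry generated output in a final, documented commit that only
# the release tag points to, keeping the main history clean.
set -eu
tmp=$(mktemp -d); cd "$tmp"
git init -q proj && cd proj
git -c user.email=a@b -c user.name=a commit -q --allow-empty -m 'vetted history'
git branch -M main

# Branch off the vetted history and commit only the generated files.
git checkout -qb release-1.0 main
printf '#!/bin/sh\n# stand-in for autoconf output\n' > configure  # real life: autoreconf -fi
git add configure
git -c user.email=a@b -c user.name=a commit -q \
    -m 'release 1.0: generated output (autoconf 2.72)'  # document exact tool versions
git tag v1.0   # release tarball = git archive of this tag

# Main history never carries the generated output.
git checkout -q main
```

The release tarball then matches `git archive v1.0` exactly, while the generated files stay out of the main commit timeline.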
@dalias autoconf complicates this by putting generated files in release tarballs that aren’t in git, and which won’t match the versions you generate locally unless you use the exact same versions of autoconf, automake, libtool, and all associated macros, as the xz hack made clear. Other build systems (like meson) that don’t need you to distribute files that aren’t in git will simplify this.
@alanc Lots of projects commit output of autoconf, which has its own issues of course, but I think this is a data point in support of that practice...
@rst @leftpaddotpy @raito @dalias except that won't work as there will be differences in the generated files unless you have the exact same versions of autoconf, automake, libtool, and every package that delivers its own m4 macros into /usr/share/aclocal.
@leftpaddotpy @raito @alanc @dalias Easiest way to do that is to check whether it actually *was* generated by autoconf, by re-running autoconf from the checked-in config files, and seeing if you get something identical to what's in the tarball (modulo embedded timestamps and the like). Which would mean that what you got corresponded to checked-in source... but custom m4 tests that are obscure enough could still provide an alternate route for trying to sneak something through.
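The "modulo embedded timestamps and the like" part can be approximated with GNU diff's `-I`. A self-contained sketch (the two configure files are stand-ins: one as shipped in a tarball, one freshly regenerated; only their version banner differs):

```shell
set -eu
tmp=$(mktemp -d); cd "$tmp"

# Stand-ins: "shipped" came in the tarball, "regen" is a fresh run of
# autoconf against the checked-in configure.ac.
cat > configure.shipped <<'EOF'
#! /bin/sh
# Generated by GNU Autoconf 2.72.
probe_cc() { :; }
EOF
cat > configure.regen <<'EOF'
#! /bin/sh
# Generated by GNU Autoconf 2.71.
probe_cc() { :; }
EOF

# Ignore hunks that are only the generator banner; anything left is suspect.
if diff -I '^# Generated by GNU Autoconf' configure.shipped configure.regen
then echo 'configure matches checked-in source'
else echo 'configure differs: inspect by hand' >&2
fi
```

As the thread notes, this only works if the checker has the exact same autoconf/automake/libtool/m4 versions upstream used; otherwise the diff drowns in false positives.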
@raito @alanc @dalias wonder if the solution here is to construct things to evaluate whether an autoconf script is one that could have been generated by any released version of autoconf and check the maintainers' work, so we could find out if there's malicious stuff going on (even if distros just ignore the release tarball anyway)
@raito @alanc @dalias i would absolutely believe *autoconf* files to be a vector for malicious code, they're incomprehensible macro noise by nature, and this is just speaking as a nixos maintainer for whom these files are simply constantly broken and should not be used regardless of malice
tbh my view is that release tarballs that aren't simply the git state are a practice that should be abolished. or at least we should diff the heck out of them and figure out how to catch malicious autoconf.
@alanc @dalias I'd imagine it'd be reasonable to ignore expected differences in those generated files, like the version or hash rev -- or would you expect more sophisticated executable generated files to be present?
@leftpaddotpy @raito @alanc In my book, changes between git and release are history, and there should not be undocumented history.
@rogersm @josephholsten @alanc Shell is not really that hard to write or to implement. You just go by the specification, not what you imagine the specification says.
@josephholsten @rogersm @alanc @dalias posix shells are terrifying indeed.
For an autoconf replacement, we cannot expect to be 100% compatible, just good enough to migrate from the current mess.
@rogersm @alanc @dalias I’ve taken over maintenance of more than a few projects because I care about keeping existing systems running.
But autotools is one of those things that sounds horrible to rework while backwards compatible.
Another thing that terrifies me is posix shell, and the smoosh research into a correct parser shows how painful back compat can be: http://shell.cs.pomona.edu
@josephholsten @dalias and that check fails in the autoconf world unless the files are generated by the build automation, as they'll differ depending on the build environment used to generate them. (Yes, autoconf is not the system we'd design today, it's one that was designed decades ago for a different world.)
@alanc @josephholsten @dalias there are only two possibilities: either we rewrite autoconf or we remove it.
Both are difficult, but the good news is that they are possible (something that was impossible before the “Linux wave” we’re still riding).
But I don’t know which is easier.
@alanc @dalias Oh, I forgot the most important point: build automation attempts to regenerate the artifacts and fails if they aren’t identical. Because then the generated work acts as a test case for the generator as well.
@alanc @dalias Since doing golang work, I’ve started committing the generated artifacts. Yes, it’s wasteful of storage and dev attention when everything works, but I’ve had too many surprises from diffs due to generator tooling change.
I haven’t gotten so bad as to vendor all dependencies in repo, but some days I think hard about it.
@leftpaddotpy @alanc @rst @raito Or put them in the git repo with the commit documenting exactly what versions of each thing they were built with.
@alanc @rst @raito @dalias yeah. which means really we have a responsibility to either make it possible to get those exact versions via docker or nix or so, or we need to abolish putting autoconf files in tarballs
@rst @dalias @leftpaddotpy @raito if you don’t trust the maintainer, then you simply cannot use the software, whether it uses autotools or not, but generating the files doesn’t stop the maintainer from putting malicious stuff in configure.ac, Makefile.am, or one of the .m4 files that generates the shell scripts and commands run to build.
@dalias @leftpaddotpy @alanc @raito But something would still have to fetch those versions or re-run them, or someone malicious, along the lines of "Hans Jansen"/"Jia Tan", could commit a malicious script decorated with plausible lies about how it was produced. Perhaps easiest to just have the build hosts run autotools themselves, and ignore any purported build artifacts that happen to be present.
@ross @timbray @alanc @josephholsten Sure it is. There are plenty of sh interpreters that run on Windows, and modern versions of Windows include bash etc. as part of WSL. There are a lot more systems that can run a conforming sh than that can run a Python interpreter.
@dalias @timbray @alanc @josephholsten not portable to windows…
@ross @timbray @alanc @josephholsten sh is 100% portable. Whether it's terrible... maybe? But it really depends on what you're doing with it. For saving variables and running some probing commands, it's really not particularly bad.
@dalias @timbray @alanc @josephholsten oh I guess it needs Python, which could be a deal breaker. But, c'mon, we want something not terrible, and sh is both terrible and non-portable
@timbray @alanc @josephholsten I loathe the autoconf implementation but love the interface (for the user who obtains the software, not the developers using autoconf).
The problem with all the replacements is they're awful for the user (broken cross compiling, impossible to inject custom compilers or cflags, broken dependency search with no way to override, require preinstalled tooling in multiple languages, etc.)
Leaving the autoconf world is starting to sound awfully attractive in 2024. Mind you, I’m prejudiced; I’ve loathed GNU AutoHell since 1998 or thereabouts.
@timbray @alanc @josephholsten A good replacement would still be ./configure && make - without the m4 shit, broken probe macros, etc., but with all the standard vars, --enable-*, etc. users need.
@timbray @alanc @josephholsten A good autoconf replacement would have the actual executable "configure" script be immutable and data-driven, so the big complex logic is known to be non-malicious just by matching an upstream hash, and the local behavior lives in human-comprehensible data.
It would support only collection of building-user's preferences, dep search, and compile/link checks using selected tools - not executing arbitrary code at config time.