"Why does ACPI exist?" In the beforetimes, power management on x86 was done by jumping to an opaque BIOS entry point and hoping it would do the right thing. It frequently didn't. Failed to program your graphics card exactly the way the BIOS expected? Hurrah! Data corruption for you. ACPI made the reasonable decision that, well, maybe it should be up to the OS to set state and be able to recover it. But how should the OS deal with state that's fundamentally device-specific?
One way to do that would be to have the OS know about the device specific details. Unfortunately that means you can't ship the computer until you modify every single OS you want to support and get new releases out there. This, uh, was not an option the PC industry seriously considered. The alternative is that you ship something that abstracts the details of the specific hardware. This is what ACPI does, and it's also what things like Device Tree do.
The main distinction between Device Tree and ACPI is that Device Tree is purely a description of the hardware that exists, and so still requires the OS to know what's possible - if you add a new type of power controller, for instance, you need to add a driver for that to the OS before you can express that via Device Tree. ACPI decided to include an interpreted language to allow vendors to expose functionality to the OS without the OS needing to know about the underlying hardware.
So, for instance, ACPI allows you to associate a device with a function to power down that device. That function may, when executed, trigger a bunch of register accesses to a piece of hardware otherwise not exposed to the OS, and that hardware may then cut the power rail to the device to power it down entirely. And that can be done without the OS having to know anything about the control hardware.
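The split can be sketched with a toy example: the OS ships a generic interpreter, and the firmware ships a machine-specific recipe as data. (This is plain Python standing in for ACPI's actual AML bytecode; the register numbers and opcode set here are invented purely for illustration.)

```python
# Firmware-provided "method": a recipe of register writes that powers down
# a hypothetical device. The OS has no idea what the registers mean --
# it just executes the steps the firmware describes.
POWER_OFF_METHOD = [
    ("write", 0x62, 0x84),  # select a power-control register on a hidden controller
    ("write", 0x66, 0x01),  # assert the "cut power rail" bit
]

class FakeHardware:
    """Stands in for memory-mapped registers on some vendor-specific controller."""
    def __init__(self):
        self.regs = {}
    def write(self, addr, value):
        self.regs[addr] = value

def run_method(method, hw):
    """The OS-side interpreter: executes firmware-supplied steps without
    knowing anything about the underlying hardware."""
    for op, addr, value in method:
        if op == "write":
            hw.write(addr, value)
        else:
            raise ValueError(f"unknown op {op!r}")

hw = FakeHardware()
run_method(POWER_OFF_METHOD, hw)
```

A new machine with a completely different power controller just ships a different `POWER_OFF_METHOD`; the OS-side interpreter doesn't change. That's the whole trick.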
How is this better than just calling into the firmware to do it? Because the fact that ACPI declares that it's going to access these registers means the OS can figure out that it shouldn't, because it might otherwise collide with what the firmware is doing. With APM we had no visibility into that - if the OS tried to touch the hardware at the same time APM did, boom, almost-impossible-to-debug failures.
(This is why various hardware monitoring drivers refuse to load by default on Linux - the firmware declares that it's going to touch those registers itself, so Linux decides not to in order to avoid race conditions and potential hardware damage. In many cases the firmware offers a collaborative interface to obtain the same data, and a driver can be written to get that (https://bugzilla.kernel.org/show_bug.cgi?id=204807#c37 discusses this for a specific board))
@mjg59 Not to mention that many ACPI issues were/are caused by the motherboard-side implementation of ACPI being buggy, incomplete or self-contradictory. One thing that ACPI decidedly isn't: simple.
Historically there were a bunch of ACPI-related issues because the spec didn't define every single possible scenario and also there was no conformance suite (eg, should the interpreter be multi-threaded? Not defined by spec, but influences whether a specific implementation will work or not!). These days overall compatibility is pretty solid and the vast majority of systems work just fine, but we do still have some issues that are largely associated with System Management Mode.
Unfortunately ACPI doesn't entirely remove opaque firmware from the equation - ACPI methods can still trigger System Management Mode, which is basically a fancy way to say "Your computer stops running your OS, does something else for a while, and you have no idea what". This has all the same issues that APM did, in that if the hardware isn't in exactly the state the firmware expects, bad things can happen.
@klausman @mjg59 The real test is whether it makes the system simpler over all. And I'd argue the one-kernel-per-device model seen on Android phones is complicated in a different way, even if each individual kernel might be simpler.
@jamesh @mjg59 I agree. But then again, embedded devices, phones, desktop computers and data centre servers all have different parameters and benefit from different approaches.
Just as long as we don't go back to SET BLASTER="A220 I5 D1"
@klausman @mjg59 every smartphone I've owned so far has stopped receiving OS version upgrades before it became unusable.
In contrast, I've got a 10+ year old x86 server in my closet running a recent Linux distro. It just works because no one has to do hardware enablement for that specific system in the new OS release.
@hyc @klausman And the answer is just to claim to be Windows, because Windows has an established contract with the firmware in a way that Linux never has
@klausman @mjg59 and there were still plenty of issues where the ACPI tables expect the OS to be some flavor of Windows, e.g. "if win95 do X elseif win2k do Y", and do nothing on Linux, so some feature just doesn't work. Typically we'd fix these by dumping the DSDT and rewriting it, but nowadays dynamically loadable DSDTs are deprecated even though those types of problems are just as prevalent as ever.
@mjg59 @klausman That's the stock answer but it's inadequate. You have to know to claim to be a specific version of Windows, otherwise you still get breakage.
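The mechanism being described here is ACPI's `_OSI` method: the firmware asks which OS-interface strings the OS supports and branches on the answers, so an OS that answers honestly falls through every branch. A toy sketch (real firmware expresses this in AML; the handler labels are invented, though "Windows 2015" and "Windows 2001" are genuine `_OSI` argument strings):

```python
def firmware_init(osi):
    """Firmware-side logic: pick a code path based on the claimed OS.
    `osi` models the OS's _OSI handler: given a string, return True
    if the OS claims to support that OS interface."""
    if osi("Windows 2015"):
        return "configure-modern-path"
    elif osi("Windows 2001"):
        return "configure-legacy-path"
    # No branch for anything else: the feature silently does nothing.
    return "do-nothing"

# An OS that answers honestly gets the do-nothing path:
honest = firmware_init(lambda s: False)

# Linux's actual strategy: answer "yes" to the Windows _OSI strings,
# so it gets the same firmware path the vendor actually tested.
claiming_windows = firmware_init(lambda s: s.startswith("Windows"))
```

This also shows why "claim to be Windows" isn't quite enough on its own: claim the wrong *version* and you land on a legacy branch (or none at all).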
By and large ACPI has been a net improvement in Linux compatibility on x86 systems. It certainly didn't remove the "Everything is Windows" mentality that many vendors have, but it meant we largely only needed to ensure that Linux behaved the same way as Windows in a finite number of ways rather than in every single hardware driver, and so the chances that a new machine will work out of the box are much greater than they were in the pre-ACPI period.
This isn't something that ACPI enabled - in the absence of ACPI, firmware vendors would just be doing this unilaterally with even less OS involvement, and we'd probably have even more of these issues. Ideally we'd "simply" have hardware that didn't support transitioning back to opaque code, but we don't (ARM has basically the same issue with TrustZone).
One example is a recent Lenovo one, where the firmware appears to try to poke the NVMe drive on resume. There's some indication that this is intended to deal with transparently unlocking self-encrypting drives on resume, but it seems to do so without taking IOMMU configuration into account and so things explode. It's kind of understandable why a vendor would implement something like this, but it's also kind of understandable that doing so without OS cooperation may end badly.
(The alternative of teaching the kernel about every piece of hardware it should run on? We've seen that in the ARM world. Most code simply never reaches mainline, and most users are stuck running ancient kernels as a result. Imagine every x86 device vendor shipping their own kernel optimised for their hardware, and now imagine how well that works out given the quality of their firmware. Does that really seem better to you?)
What frustrates me is the sheer number of vendors that didn't see this as a problem (or saw it as a feature), combined with ARM refusing to even try to address this problem when AArch64 was released.
It's difficult to say that ARM is a viable competitor in non-embedded spaces when you can't say that you'll be able to install an update six months down the line (and this is sadly not even a hypothetical ...)
That seems like something that could have been standardized through data rather than code, though. For instance, a standard interface for power supply hardware, with enumerable power lines, and tables that say "this device is attached to this power line".
Having *tables* in firmware seems like a great thing, for everyone except vendors who think it'll destroy their ability to "differentiate" and "value add". Why give vendors a language to drive arbitrary non-standard functionality?
@stark That kind of thing ends up being *very* platform dependent. Say you have a GPU - the GPU driver has no idea how the power line to the GPU is controlled, because that's up to how it was wired up in the specific machine, and the control mechanism is likely also hardware-specific (is it controlled via the embedded controller? Is there some other power controller that needs to be spoken to? That sort of thing)
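A minimal sketch of why the pure-data approach (roughly the Device Tree model) still pushes the problem back into the OS. The table can say *which* power line a device hangs off, but the OS needs a driver for each *kind* of controller, so a new controller type means a new OS release. Controller and device names here are invented for illustration:

```python
# Firmware-provided data: device -> (controller type, power line).
POWER_TABLE = {
    "gpu0":  {"controller": "ec-v1", "line": 3},
    "nvme0": {"controller": "frobozz-pmic", "line": 0},  # some new vendor part
}

# Controller drivers the OS happens to ship:
DRIVERS = {
    "ec-v1": lambda line: f"EC: cut line {line}",
}

def power_off(device):
    entry = POWER_TABLE[device]
    driver = DRIVERS.get(entry["controller"])
    if driver is None:
        # The table fully describes the hardware, but the OS can't act on it
        # until someone writes and ships a driver for this controller type.
        return f"no driver for {entry['controller']!r}"
    return driver(entry["line"])

print(power_off("gpu0"))   # EC: cut line 3
print(power_off("nvme0"))  # no driver for 'frobozz-pmic'
```

ACPI's answer is to make the table entry itself executable, so the "driver" for the weird vendor controller ships with the machine instead of with the OS.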
But one thing I still don't get: the kernel needs a driver to talk to every device, and that driver needs to know how to do everything else anyway. Why is turning the device on and off so uniquely tricky that it would be a problem to do that in the driver too?
@josh @stark And, well, the fundamental problem is still that you need to identify all possible scenarios people might reasonably want to implement in advance, and it's clear the industry isn't interested in that
@mjg59 I'm thinking more about things like SMM (and ME, and even UEFI).
What does SMM do that the EC couldn't do better?
All of the hardware I design these days has a MCU that does management stuff (sometimes two, one really low level one for power/reset sequencing and one that comes up later to do everything else) and then talks to an FPGA that does most of the work of the board.
The FPGA doesn't care about polling sensors or controlling fan speeds or anything else that's needed to make the system work. It just lives off in its own temperature-controlled world and does its thing, and has a SPI interface to the MCU when it needs something.
@mjg59 What I don't understand is why all of this functionality is implemented on the CPU at all, rather than on the EC/BMC.
(and then have the EC/BMC expose a standardized, abstracted API to the OS for things like voltage/temperature sensors)
Power management also seems like the kind of thing that makes more sense to have been architected as an out of band feature that software was unaware of except for giving high level directives to the EC.
@mjg59 one issue with ACPI - though it's an issue with specific vendors rather than with ACPI in general - is that on Qualcomm ARM laptops, Qualcomm decided to move quite a lot of logic into their Windows drivers instead of into ACPI, because the Windows ASL parser is buggy compared to ACPICA. That makes running alternative operating systems purely via ACPI on these laptops a bit annoying, as one needs to reverse engineer the drivers to discover the interactions between devices 🙄