@mint@ryona.agency @nihil@misskey.gg it could always be something else than a race condition, but it seems like it's that judging off of these lines
Feb 27 22:08:15 nixos systemd[1]: Starting Cleanup of Temporary Directories...
Feb 27 22:08:15 nixos systemd[1]: systemd-tmpfiles-clean.service: Deactivated successfully.
Feb 27 22:08:15 nixos systemd[1]: Finished Cleanup of Temporary Directories.
Feb 27 22:08:24 nixos kernel: amdgpu 0000:09:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000028 SMN_C2PMSG_82:0x00000000
Feb 27 22:08:24 nixos kernel: amdgpu 0000:09:00.0: amdgpu: Failed to enable gfxoff!
Feb 27 22:08:29 nixos kernel: amdgpu 0000:09:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000028 SMN_C2PMSG_82:0x00000000
Feb 27 22:08:29 nixos kernel: amdgpu 0000:09:00.0: amdgpu: Failed to enable gfxoff!
Feb 27 22:08:30 nixos kernel: amdgpu 0000:09:00.0: [drm] *ERROR* [CRTC:95:crtc-1] flip_done timed out
Feb 27 22:08:31 nixos kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=3400, emitted seq=3401
Feb 27 22:08:31 nixos kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
Feb 27 22:08:31 nixos kernel: amdgpu 0000:09:00.0: amdgpu: GPU reset begin!
Feb 27 22:08:31 nixos kernel: amdgpu: Failed to suspend process 0x800cFeb 27 15:09:35 nixos kernel: mce: [Hardware Error]: Machine check events logged
Feb 27 15:09:35 nixos kernel: mce: [Hardware Error]: CPU 9: Machine Check: 0 Bank 5: bea0000000000108
Feb 27 15:09:35 nixos kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffffc0754f4a MISC d012000100000000 SYND 4d000000 IPID 500b000000000
Feb 27 15:09:35 nixos kernel: mce: [Hardware Error]: PROCESSOR 2:870f10 TIME 1709068166 SOCKET 0 APIC 3 microcode 8701030
Feb 27 15:09:36 nixos kernel: MCE: In-kernel MCE decoding enabled.
But the symptoms are
1. The screen turns black
2. Stays black
3. Shuts off suddenly
With ssh I'm able to reveal those logs, and when rebooting I get a MCE error. Luckily the MCE error seems unrelated to my GPU itself, it seems to be that wlwifi kernel driver freaks out at the same time as the GPU resets oddly. I have dmesg logs as well but they are two big to fit in this post
Embed Notice
HTML Code
Corresponding Notice
- Embed this notice
fuggy@skippers-bin.com's status on Sunday, 03-Mar-2024 07:59:12 JSTfuggy