@nova thanks for the writeup!
Armchair speculation here, but I'm wondering whether the root cause was really bad hardware, mostly because I had a loosely related failure on my own home setup.
My guess is there was I/O saturation on the largest SSD (looks like they were SATA) due to capacity imbalance. Once the largest SSD hit its I/O limit, it could have blocked kernel queues across the whole system, since everything goes through the shared ZFS layer.
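To put rough numbers on that guess, here's a back-of-the-envelope sketch. It assumes writes are spread across devices roughly in proportion to free space, which is a simplification of what the ZFS allocator actually does. The device names, sizes, write rate, and the 500 MB/s SATA ceiling are all made up for illustration, not taken from the writeup.

```python
def write_share(free_gb, total_write_mbps):
    """Split a pool-wide write rate across devices in proportion to free space.

    Simplifying assumption: the allocator favours devices with more free
    space linearly. Real ZFS allocation is more nuanced than this.
    """
    total_free = sum(free_gb.values())
    return {dev: total_write_mbps * free / total_free
            for dev, free in free_gb.items()}


# Hypothetical pool: one large SSD next to two much smaller ones.
free_gb = {"ssd_small_a": 200, "ssd_small_b": 200, "ssd_large": 1600}

# Assumed sustained write ceiling for a single SATA SSD, in MB/s.
per_device_limit_mbps = 500

shares = write_share(free_gb, total_write_mbps=800)

# Devices whose share of the write load exceeds the assumed ceiling.
saturated = [dev for dev, mbps in shares.items()
             if mbps > per_device_limit_mbps]

print(shares)
print(saturated)
```

With these made-up numbers the large SSD gets 640 MB/s of an 800 MB/s load while the small ones sit at 80 MB/s each, so the pool as a whole looks far from saturated even though one device is past its ceiling.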
I'm wondering if hard-partitioning postgres from NFS could have protected postgres.