How it started: sequential file reads from the array at ~160MB/sec (ZFS native encryption)
How it's going: sequential file reads from the array at ~900MB/sec (ZFS on LUKS)
It seems that GCM makes the Atom C3958 a sad SoC.
@ryanc That chip apparently has AES-NI, so AES-GCM, if the code is actually using it, should land in the ballpark of ~2 cycles per byte, i.e. ~1GB/sec.
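For anyone following the arithmetic, here is a rough sketch of the cycles-per-byte to throughput conversion behind that estimate. The 2.0 GHz clock is an assumption (the thread does not state it), the function names are made up for illustration, and the model treats a single core as the only bottleneck, which may not hold for reads from a real array.

```python
# Back-of-the-envelope conversion between cycles per byte and throughput.
# ASSUMPTION: 2.0 GHz core clock; the thread does not give the clock speed.
CLOCK_HZ = 2.0e9


def throughput_mb_per_s(cycles_per_byte, clock_hz=CLOCK_HZ):
    """Single-core throughput in MB/s (1 MB = 10**6 bytes)."""
    return clock_hz / cycles_per_byte / 1e6


def implied_cycles_per_byte(mb_per_s, clock_hz=CLOCK_HZ):
    """Cycles spent per byte if one core were the only bottleneck."""
    return clock_hz / (mb_per_s * 1e6)


if __name__ == "__main__":
    # ~2 cycles/byte at the assumed clock is roughly 1 GB/sec.
    print(throughput_mb_per_s(2.0))
    # The figures quoted in the thread, read as single-core numbers.
    for rate in (160, 700, 900):
        print(rate, "MB/sec ->", round(implied_cycles_per_byte(rate), 2), "cycles/byte")
```

Under those assumptions, ~160MB/sec would correspond to roughly 12.5 cycles per byte on one core, several times worse than the ~2 cycles per byte ballpark.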
@jripley I don't know exactly what the problem is, but it seems to be terrible at PCLMULQDQ.
@jripley It also has a QuickAssist accelerator. I can't get ZFS to use it for GCM, but I do have LUKS using it.
@jripley I get 700MB/sec single core doing AES in counter mode.
@ryanc I’m suspicious maybe Intel checked the “AES-NI” box of ISA support in Atoms, while the underlying implementation is whatever is enough to pass validation only.
@jripley I actually managed to get the QAT accelerator working for LUKS, but couldn't get ZFS to use it even though it supports AES-GCM.
@ryanc Spent a few minutes puzzling over why it's so bad, and it looks like that era of Atom was just before Intel greatly improved latency/throughput. Also I think it has only one pipe doing all the above, mostly stalled. I got into this thread because one of the last things I did at my prior employer was hyper-optimize AES-GCM for their cores, and they run bytes-per-cycle, not cycles-per-byte. Literally 10x faster.