@regehr There's an insidious problem on targets where the ISA has a few instructions that require aligned operands (e.g., some SSE/AVX and even recent ARM). Unaligned access works until the optimizer figures out a way to save a cycle by using the faster vector op, and then it breaks. This turns into a heisenbug, because any attempt to track it down perturbs the code enough that the optimization no longer fires.
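
To make that concrete, here is a minimal sketch in C. The function names are hypothetical and only for illustration. The first version dereferences a `uint32_t *` obtained by casting a byte pointer; if the pointer is misaligned that is undefined behavior, so a vectorizing compiler may assume alignment and emit an aligned vector load that faults. The second version goes through `memcpy`, which makes no alignment assumption and which mainstream compilers typically lower to a plain (unaligned-safe) load.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Risky (hypothetical name): assumes p is suitably aligned for uint32_t.
 * With a misaligned p this is undefined behavior. It often "works" at -O0
 * and may fault once the loop is vectorized with aligned loads. */
uint32_t sum_words_cast(const unsigned char *p, size_t n_words)
{
    const uint32_t *w = (const uint32_t *)p;
    uint32_t total = 0;
    for (size_t i = 0; i < n_words; i++)
        total += w[i];
    return total;
}

/* Safer (hypothetical name): copy each word out with memcpy, so the
 * compiler cannot assume anything about the alignment of p. */
uint32_t sum_words_memcpy(const unsigned char *p, size_t n_words)
{
    assert(p != NULL || n_words == 0);
    uint32_t total = 0;
    for (size_t i = 0; i < n_words; i++) {
        uint32_t w;
        memcpy(&w, p + i * sizeof w, sizeof w);
        total += w;
    }
    return total;
}
```

Whether the first version actually faults depends on the compiler, flags, and target, which is exactly why it is so hard to pin down: adding a `printf` or dropping the optimization level can be enough to make the symptom disappear.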