I’m sure I’ve read before that “floating point math is not associative” and my eyes glazed over. And I know that floating point math IS deterministic… i.e. same numbers in = same result every time.
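Here's the non-associativity in two lines of Python. The values 0.1, 0.2, and 0.3 are the classic example: none of them is exactly representable in binary floating point, so the rounding error depends on which pair you add first.

```python
# Same three numbers, two different grouping orders:
print((0.1 + 0.2) + 0.3)  # 0.6000000000000001
print(0.1 + (0.2 + 0.3))  # 0.6
```

Same inputs, different grouping, different answer. Each individual addition is perfectly deterministic; it's the *order* that changes the result.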
I spent a week understanding why a Spark job was not deterministic. The crux: in Spark, values are spread across workers in a non-deterministic way, so the order in which numbers are fed into math operations varies from one run to the next. Combine that with non-associative floating point addition and the output becomes non-deterministic, even though every individual operation is deterministic.