1. He explicitly selected very good AI images and very bland human images.
2. Anyone could take the test, so it wasn't a test of artists or art enjoyers, but of the general tech public of the type that reads Scott Alexander.
So at best, he showed that people with an average knowledge of art (i.e. not much) tend to find it very hard to distinguish between pre-selected AI and human images.
His friend, who is good at telling AI and human art apart, shows that with some knowledge it's not too hard to distinguish many types of images (although I'd have put pic related squarely in the AI camp, but it's human, apparently).
The difference becomes even more obvious with the religious/pastoral images, where the AI images usually have symbolism that doesn't make sense, while the human images are coherent. If you have no idea about the logic of these paintings, they might be hard to tell apart, but if you do, it's easy.
Sometimes the images also mix techniques or styles (usually choosing a different one for the background than for the subject), which might not be noticed by the average viewer but is immediately apparent to someone familiar with them.
I think most of these issues will eventually vanish even with basic prompting, but I'm not sure Scott's experiment demonstrated much more than "AI can make pretty nice pictures", which is somewhat obvious.