I will admit that until I saw the CouchDB post, one scenario had not occurred to me: a program writes data but crashes before it calls fsync(), and the restarted program re-reads the just-written data and assumes it is safely on disk, when it may exist only in the kernel buffer cache. I guess everything that reads a WAL or other recovery file should fsync() it immediately on startup, before trusting anything it reads from it.
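
A minimal sketch of that idea in Python, assuming a single recovery file that fits in memory. The function name `read_wal_after_fsync` is made up for illustration and is not how CouchDB or any other particular database does it. The file is opened read-write because some platforms refuse to fsync() a descriptor that was opened read-only.

```python
import os


def read_wal_after_fsync(path):
    """Hypothetical helper: fsync a recovery file, then return its bytes.

    Anything read from the file before the fsync might exist only in the
    kernel buffer cache, so the fsync comes first and the read second.
    """
    # O_RDWR is an assumption made for portability; Linux also accepts
    # fsync() on a read-only descriptor.
    fd = os.open(path, os.O_RDWR)
    try:
        # Raises OSError if the flush fails. A failure here means the
        # contents cannot be assumed durable, so let it propagate.
        os.fsync(fd)

        chunks = []
        while True:
            chunk = os.read(fd, 64 * 1024)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)
```

Recovery code would call this once at startup, for example `data = read_wal_after_fsync("wal.log")`, and replay records only from `data`. If the file itself was newly created before the crash, the directory containing it may need its own fsync() as well, which this sketch leaves out.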