@whitequark no particular comments on the gc algo, you're using gc in a very different way. we didn't attempt any object dedup for pages, whole site builds were just rsynced from the build server to the file servers. mysql was the source of truth for which file servers held site replicas. new builds were allocated to the N file servers with the most free disk at build time, and each file server ran gc to clean up dangling replicas some time after the replica records were deleted from mysql. the idea was that we'd need some kind of gc anyway, because we can't delete from mysql and the file servers in one transaction, so we skipped synchronous fs deletions entirely and made gc load-bearing, which means it's continuously proving it works
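
rough sketch of the shape of it in python, not the actual code. all the names here (`pick_file_servers`, `gc_pass`, one directory per build under `root`) are made up for illustration, and the grace period is my own assumption to avoid racing a build that's still being rsynced. `live_builds` stands in for whatever the mysql query returns for this file server.

```python
from __future__ import annotations

import heapq
import shutil
import time
from pathlib import Path


def pick_file_servers(free_bytes: dict[str, int], n: int) -> list[str]:
    """Pick the n file servers with the most free disk (hypothetical helper)."""
    return heapq.nlargest(n, free_bytes, key=free_bytes.__getitem__)


def gc_pass(
    root: Path,
    live_builds: set[str],
    grace_seconds: float = 3600.0,
    now: float | None = None,
) -> list[str]:
    """Delete build directories under root that have no replica record.

    live_builds is the set of build ids the source of truth still lists
    for this server. Anything else on disk is a dangling replica.
    """
    now = time.time() if now is None else now
    removed = []
    for entry in sorted(root.iterdir()):
        if not entry.is_dir() or entry.name in live_builds:
            continue
        # Assumption: skip recently modified directories so an in-flight
        # rsync is not collected before its record becomes visible.
        if now - entry.stat().st_mtime < grace_seconds:
            continue
        shutil.rmtree(entry)
        removed.append(entry.name)
    return removed
```

the point is that nothing else ever deletes from disk. dropping a site is just deleting rows, and `gc_pass` is the only path that frees space, so if it breaks you notice quickly.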