We're struggling to figure out a good way to migrate Mastodon media storage from one server to another... Mastodon stores files in millions of directories by hash (kinda like aaa/bbb/ccc/ddd) which means doing something simple like an initial rsync and then taking Mastodon down for a quick resync is impossible. We initially were going to go this route but after the initial file listing took more than 24h we cancelled it and gave up.
So now we're looking at just copying the raw filesystem over, but if we want to do it without taking mastodon down for the entire sync we need to come up with a way of copying it and resyncing the changed blocks afterwards.
One way could be to use overlayfs. Remount the old volume R/O, create a temporary upperdir and create an overlay between them. Then, copy the R/O image to it's new home, expand it or whatever, and apply the upperdir onto it. This way we only need to list the directories that actually had writes. Special care will need to be taken to ensure we delete any files that have overlayfs tombstones. IDK if anyone has ever done this before.
Another way could be to use devmapper snapshots to create a new COW-backed volume, rsync the R/O underlying block device over and then apply the COW to the new volume with snapshot-merge. We tried testing this out and caused devmapper to die horribly and spit out kernel bug log lines, so we had to reboot and e2fsck for 2 hours.
At this point it might be better to just take everything down for as long as it takes. I'm extremely annoyed at Mastodon's file structure making it impossible to move without major downtime. Their solution just seems to be "use S3 lol". It would probably take 24 hours (8TB at 1Gbps is roughly 17 hours). We could shrink it first since we don't use all the space, but resize2fs will take a while as well.
If anyone has any tips or ideas for doing it with minimal downtime I'd like to hear them. Or if you're an uwu.social user and don't care about extended downtime I'd also like to hear your thoughts too.