Decreased it from 200 to 100, and also bumped random_page_cost from 1.1 to 1.25. I'd take queries running a couple extra milliseconds over random 500s.
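For reference, a sketch of what those changes might look like in postgresql.conf — assuming "it" refers to max_connections, which the later work_mem discussion suggests:

```
# postgresql.conf — sketch, not the actual config from this server
max_connections = 100     # was 200; fewer backends, less worst-case memory
random_page_cost = 1.25   # was 1.1; nudges the planner away from random I/O
```

On a slow spinning disk a higher random_page_cost makes the planner favor sequential scans a bit more, which can trade a few milliseconds of query time for more predictable behavior.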
Things seem to be improving, nice work mint. Why would such a problem appear so abruptly, though? Was db query time increasing gradually until it hit a threshold where the server couldn't keep up any more?
@lauralt Don't think so, the disk seems to be in good health even if a bit slow, and there's plenty of space as well now that I've booted salon off to a $20 SSD. I've been experimenting with postgres config for a while, sometimes without adjusting related values accordingly (e.g. it seems like you need to decrease work_mem when increasing max_connections). Now I did it again after yet another pg_repack session, but paying somewhat closer attention this time.
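The work_mem/max_connections coupling comes down to a memory budget: each backend can allocate up to work_mem per sort or hash node, so more connections means you want a smaller per-query allowance. A rough worst-case check, with hypothetical values (the actual settings aren't in this thread):

```python
# Rough worst-case RAM estimate for a postgres box.
# Rule of thumb only: a single query can use work_mem more than
# once (one allocation per sort/hash node), so this underestimates
# the true worst case.
max_connections = 100      # value mentioned in the thread
work_mem_mb = 16           # hypothetical setting
shared_buffers_mb = 2048   # hypothetical setting

worst_case_mb = shared_buffers_mb + max_connections * work_mem_mb
print(worst_case_mb)  # 3648
```

This is why halving max_connections lets you keep (or raise) work_mem without risking the OOM killer taking out a backend, which would show up as exactly the kind of random 500s described above.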
@munir @lauralt Agency, salon and accela run on the same server at home, with the former two proxied through a frantech VPS; camp and the rawr onion run on that frantech VPS directly. plagu.ee is sort of a staging server where I test changes before committing them. Plus I've got root access to nukie's server and a couple more I'm not going to mention.