:bc: Attention Beige Party-goers! :bc:
In the past month there have been two incidents that I'm aware of where the web server was hanging and the site was unresponsive for several minutes. One of those seemed to resolve on its own eventually; the other I happened to notice as it was occurring, and it resolved once I gave the web server a hard reboot. It's possible that this has happened more than twice and I just haven't noticed it.
Looking at the logs, I'm not seeing anything specifically failing, so I think what's happening is that occasional traffic spikes are overwhelming the system resources. If left alone, the server can eventually recover, but it will remain unresponsive until it does.
We have grown a lot in the last two years so this isn't totally surprising. I have scaled up the web server from time to time in order to account for the processing demands of increased traffic, but I'm reaching the limits of what I can do with a single server.
So, what I'm planning to do is spin up an additional web server and put them both behind a load balancer. This has a few advantages. If one server is overtaxed then traffic can be shifted to the other. It also means that for simple server updates (ones that don't involve database schema changes), I can take one down to run the update while the other stays up. In other words, in most cases I won't have to take the entire site down to do server maintenance.
This all sounds great, but as usual I have no idea what I'm doing. So I'm going to move slowly to set things up in the background and make sure I'm 100% certain everything is going to work before I push it live. Until I have this new solution in place it's possible things could be a little bumpy. So far the server hanging has been an intermittent issue, which is why I want to take care of it now, before it becomes a bigger problem.
As always, thank you for bearing with me!
Beige-bless 🇧🇧