@feld yeah, that's certainly one of the main challenges. That, and logging, imho should be taught with more emphasis.
I’ve been happy moving most of my poorly maintained services to daemon(8). But I’ll agree there really isn’t one approach to rule them all.
One thing I really hate about daemontools is the binary logs. I ran a large-scale Erlang cluster managed by daemontools and it was always so awkward to see what was going on, which was usually during an emergency. Interestingly, that's my least favorite feature of systemd too.
Or all the kernel tunables, the occasional importance of CPU affinity to prevent cache thrashing, aligning storage block sizes to the application's data... it goes on and on
You could sit at the bar with another seasoned sysadmin, and every time one of you thinks of a new important OS knob that needs to be considered for application performance at scale, the other has to drink - you'd both end up in the hospital
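On the CPU affinity point, here's a minimal sketch of pinning a process to a single core on Linux with sched_setaffinity(2) so its working set stays in one core's cache - the choice of CPU 2 is arbitrary:

```c
/* Minimal sketch: pin the calling process to CPU 2 on Linux so its
 * working set stays in one core's cache instead of bouncing between
 * cores. The choice of CPU 2 is arbitrary. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(2, &set);               /* allow only CPU 2 */

    /* pid 0 means "the calling process" */
    if (sched_setaffinity(0, sizeof(set), &set) == -1) {
        perror("sched_setaffinity");
        return EXIT_FAILURE;
    }
    printf("pinned to CPU 2; cache-hot from here on\n");
    return EXIT_SUCCESS;
}
```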
@pete_wright @feld I've dealt with this regularly in my career. I've helped build infrastructure that served 150M end users with streaming media. Like, listen to me sometimes, at least. I feel dismayed sometimes that there are developers who refuse to learn the first thing about Linux when it is their target deployment platform. If you don't know what a ulimit is and you're trying to write software that handles massive concurrency, you're gonna have problems
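For the ulimit point, a hedged sketch of what high-concurrency software can at least attempt at startup on a POSIX system: raise its own open-file soft limit to the hard cap with setrlimit(2), since every socket costs a file descriptor and default soft limits are often 1024:

```c
/* Sketch: raise this process's file-descriptor soft limit to the hard
 * cap before accepting lots of connections. Every socket is an fd, so
 * a default soft limit of ~1024 topples "massive concurrency" fast. */
#include <stdio.h>
#include <sys/resource.h>

int main(void) {
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) == -1) {
        perror("getrlimit");
        return 1;
    }
    printf("soft limit: %llu, hard limit: %llu\n",
           (unsigned long long)rl.rlim_cur,
           (unsigned long long)rl.rlim_max);

    rl.rlim_cur = rl.rlim_max;      /* bump soft limit to the hard cap */
    if (setrlimit(RLIMIT_NOFILE, &rl) == -1) {
        perror("setrlimit");        /* raising past the cap needs root */
        return 1;
    }
    return 0;
}
```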
That's awesome to know there are folks out there who've made that transition, and I think it's great when it works out - lots of my mentors were sysadmins who were C programmers, because that was so critical for supporting new hardware.
But to be honest, I get tired of how quickly people discount the experience of folks with decades of systems/network/support work in favor of developers who speak the same language as senior management - who are mostly people with business/software-dev backgrounds.
I see this pretty frequently when working in places trying to implement the DevOps philosophy, but maybe I'm just unlucky.
@feld @pete_wright hi! Former sysadmin here who learned to code a decade ago and did devops/SRE for a long time. Now I operate more like a software/infra architect turning eng manager. I've personally seen software engineers make very good infra/ops people - but they were the ones who actually tried to learn their target system deeply.
Sounds almost exactly like a customer I had at a previous job...
Remember stupid shit like "enabling Apache htaccess files slows down the web cluster, because every request triggers a check for an .htaccess file starting at /.htaccess, and your website is 20 dirs deep, so even your RAID runs out of IOPS once you're serving a couple hundred reqs/s"?
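A toy sketch of that cascade, with a made-up path - with AllowOverride on, the server probes for an override file at every directory level, so deep paths multiply the per-request lookups:

```c
/* Toy illustration of the .htaccess cascade: with AllowOverride on,
 * the server probes for an override file in every directory on the way
 * to the requested file, so a 20-deep path costs ~21 stat() calls per
 * request before any content is served. The path below is made up. */
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>

int main(void) {
    const char *uri = "/var/www/a/b/c/index.html";  /* hypothetical */
    char probe[PATH_MAX];
    struct stat st;
    int probes = 0;

    for (const char *p = uri; *p; p++) {
        if (*p != '/')
            continue;
        /* one stat() attempt per directory level */
        snprintf(probe, sizeof(probe), "%.*s.htaccess",
                 (int)(p - uri + 1), uri);
        (void)stat(probe, &st);
        printf("stat(\"%s\")\n", probe);
        probes++;
    }
    printf("%d filesystem lookups for a single request\n", probes);
    return 0;
}
```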
Or "your app is calling gettimeofday() every 30 nanoseconds"
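The classic fix for the gettimeofday() one: read the clock once per batch of work instead of once per iteration. A sketch assuming Linux's CLOCK_MONOTONIC_COARSE, which trades a few milliseconds of resolution for a much cheaper read:

```c
/* Sketch of the fix for "gettimeofday() in a hot loop": read a coarse
 * clock once per batch instead of per iteration. CLOCK_MONOTONIC_COARSE
 * is Linux-specific, ~1-4 ms resolution, but far cheaper to read. */
#include <stdio.h>
#include <time.h>

#define BATCH 1024

int main(void) {
    struct timespec now;
    long iterations = 0;

    for (int batch = 0; batch < 1000; batch++) {
        /* one clock read covers the whole batch */
        clock_gettime(CLOCK_MONOTONIC_COARSE, &now);

        for (int i = 0; i < BATCH; i++) {
            /* ... do work, timestamp it with `now` ... */
            iterations++;
        }
    }
    printf("%ld iterations, 1000 clock reads instead of %ld\n",
           iterations, iterations);
    return 0;
}
```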
Stop abusing computers please 🙏 they have a family too.
@feld @pete_wright Actually, funny story specific to this exact thing with CPU affinity. I worked at a place that had 20,000 iptables rules and didn't understand why connections were slow. They also didn't realize haproxy could split TLS offloading and traffic routing across CPUs. They had low traffic, but the box behaved like it was under intense load all the time because of this.
There is a flip side to this too. Being on the support side of things gives you a unique perspective on the beauty of the KISS principle. I think people underestimate the power of a few well-chosen Unix components. It may not make you famous on the conference circuit, but it also seems to help me sleep through the night more frequently.