@Herman_Hetherington@p :dracula: They said that they needed a stealth soldier, so I put my hands on the hibachi hot-plate at Benihana and burned my fingerprints off. They will never find me. :dracula2:
@cjd@dcc@p Nothing weird, but that's the way it's been: nothing weird and then suddenly it locks up. FSE's up 1:35, so we'll see. Before this, uptimes were a little over a day, twice, then it was up less than six hours.
I'm just still surprised there were never any email alerts set up for your PSU going down. Did both go down at the same time, or did one fail and the backup eventually die?
Contention for the memory bus is the biggest performance hit on $current_year CPUs and aside from that, the biggest performance bottlenecks for Pleroma are, in order, disk I/O and network I/O. The CPUs on the box mostly sit idle, meaning that more hardware threads wouldn't accomplish anything but heating the machine up.
I suspect that it's faster with less NUMA-wrangling, but have not benchmarked it. I just know that on this workload, I could probably disable half the physical cores and still have no trouble.
> interesting amount of misinfo sticking around from two decades prior
Arthur Whitney's microsecond trading bots still run with only one core enabled, and there was another hyperthreading security problem last year or the year before.
Because the box was flaky, I abandoned plans to sell hosting. This turned out to be a good thing: the benefit of the VMs all being operated by either me or people that I know personally is that I don't have to worry so much about anyone trying to do something shady to compromise the box. So most of those problems don't apply.
But what I do have is measurement, right, like I can see that it rarely maxes out any of the cores. It is at least not slower without hyperthreading, because there are idle cores and most of the cores don't even approach 100%. And there's the statistic, right, 90% of the time is spent on SpecEx while waiting for the memory bus, and even if I only half-believe it, I can measure. (Also I have a lot of thoughts on the size of icache and mostly isolated tests around them.)
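For anyone who wants to do the same kind of measuring: a minimal sketch, assuming a Linux box where `/proc/stat` is readable. The function names (`parse_proc_stat`, `utilization`, `sample`) are made up for illustration and are not from any tool mentioned in this thread. It takes two snapshots and reports what fraction of the interval each core spent doing something other than idling or waiting on I/O.

```python
import time


def parse_proc_stat(text):
    """Return {core_name: (busy_ticks, total_ticks)} for each per-core line."""
    cores = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-core lines look like "cpu0 ..."; skip the aggregate "cpu" line
        # and anything that is not a CPU line at all.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        ticks = [int(x) for x in fields[1:]]
        # Field order: user nice system idle iowait irq softirq steal ...
        # Guest time is already counted in user/nice, so only sum the first 8.
        idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
        total = sum(ticks[:8])
        cores[fields[0]] = (total - idle, total)
    return cores


def utilization(before, after):
    """Busy fraction (0.0 to 1.0) per core between two parsed snapshots."""
    result = {}
    for name, (busy1, total1) in after.items():
        busy0, total0 = before.get(name, (0, 0))
        elapsed = total1 - total0
        result[name] = (busy1 - busy0) / elapsed if elapsed > 0 else 0.0
    return result


def sample(interval=1.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart (Linux only)."""
    with open(path) as f:
        before = parse_proc_stat(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_proc_stat(f.read())
    return utilization(before, after)
```

Calling `sample(5.0)` in a loop and logging the maximum per core over a day is enough to see whether any core gets near 1.0. Note that this counts iowait as idle, which fits a box that mostly sleeps on I/O; it says nothing about stalls on the memory bus, which show up as busy time.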
I am interested in your thoughts on this. Don't take this the wrong way, but "misinfo sticking around" seems like you are curious whether my thoughts are wrong or not, which is an evaluation of a thing: I'm more interested in what your thoughts are, that's the thing and the evaluation of the thing is a degree removed.
And that's about FSE specifically rather than CPUs in general. I don't think anyone can say too much to me about FSE specifically, but CPUs are fabulously complicated nowadays so anyone that knows anything will usually know some things I do not.
@p@i@dcc@p Doesn’t hyperthreading also use features that speed up code execution? I remember hearing something along those lines during the whole SPECTRE debacle.
@anonymous@dcc@i@p Yeah, I don't trust hyperthreading as a consequence of this specifically, but I'm not that worried about someone with local access causing a problem.
@p@dcc@p i was interested in your reasons, and your reasons/thoughts are valid, and not just a leftover habit, like the flamewar i've had to endure at workTM, which sparked the idea to ask since mentioning it seemed out of place, also plan9 just seemed funnier than boomer when choosing an 'ism
> more hardware threads wouldn't accomplish anything but heating the machine up
is probably false since the box is mostly sleeping on IO, but does lower the ceiling of power use possible, mostly as a side effect of limiting the amount of work it could do at 100% load
> and your reasons/thoughts are valid, and not just a leftover habit,
Ah, okay, yeah. I think there's a lot of, like, leftover habit that is not worth considering.
> flamewar i've had to endure at workTM,
This is a fuckin' thing. You don't get fired from fedi for saying dumb shit about computers, but for some reason, people on fedi seem to know what they're doing way more often than people at work.
> plan9 just seemed funnier than boomer when choosing an 'ism
:autismapproved:
> probably false since the box is mostly sleeping on IO,
It's plausible. The machine is waiting on I/O on a typical workload, but on a CPU-heavy workload, it's going to be memory. It is more or less why smart register allocation overtook stacks. The problem is that most programs don't fit in icache. Speaking of Plan 9 autism, there was a research fork for multi-core systems, I forget the name, but they basically relegated the OS to a single CPU and then locked threads to cores and, except for some tasks (recollection hazy), I think they forced cooperative multitasking. Their results were encouraging but, under a regular Unix-style workload where you have a large number of short-lived processes, not earth-shattering. (Maybe the problem was just that they had 8-way machines instead of Threadrippers and it's worth revisiting.)
> but does lower the ceiling of power use possible,
This is old but is very good: http://yosefk.com/blog/the-bright-side-of-dark-silicon.html . I have never regretted time spent reading anything YosefK wrote. I think if you mash that piece together with his piece on FPGAs, and then you toss in Chuck Moore's GreenArrays designs (effectively, the FPGA part of the FPGA loses in terms of power density and performance to arrays of extremely tiny cores: like, a tiny stack and enough memory for 256 instructions, pre-packing), you can kind of see the path from here to compute-dust. (Maybe "dust" is an exaggeration, but "pebbles" look like they can be achieved today.)
I have a lot of things to say on this and the network being the computer and how much of the brain's computation is dendrites and axons and chemical side-channeling rather than neurons doing S-curves like the ANN/RNN people are still fixated on, but we will be here for hours once I get on this topic and I think you can probably see where I'm going with that.
> side effect of limiting the amount of work it could do at 100% load
Yeah, I'd probably do it differently if I wanted to set the record for biggest Mersenne prime, but then there'd be no reason to put it into a datacenter near the backbone. I think I'd shove a GPU in there if I were doing Peertube. (graf has been experimenting with the GPU-accelerated Postgres patches, I should ask him how that went.) For what the box is doing, I basically took my observations from sitting on shit-tier Frantech boxes and tuned appropriately.