pistolero (p@fsebugoutzone.org)'s status on Wednesday, 09-Apr-2025 06:40:58 JST
@lispi314 @Suiseiseki @jeffcliff @scathach @sicp
> Think tools on one's own system instead.
I already was. I mean, this is the kind of shit I do all day every day, for money, for fun, for nothing.
Hamming again: "The purpose of computing is insight, not numbers."
There are one-off scripts you write to answer a question (bashing them into a REPL or actually putting them into a file, whichever), and then there are pipelines.
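A one-off of the first kind might look like this (the log file and its format are made up for illustration): a throwaway pipeline that answers one question and gets deleted.

```shell
# Hypothetical one-off: top client IPs by request count, from a fake
# access log whose first field is the client address.
printf '1.2.3.4 GET /\n5.6.7.8 GET /\n1.2.3.4 GET /x\n' > /tmp/access.log
awk '{print $1}' /tmp/access.log | sort | uniq -c | sort -rn | head -5
```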
> It is a *lot* more common for serialization pipelines to become a bottleneck in local data processing, particularly when the storage backing is sufficiently fast to not be the bottleneck.
I have never seen serialization become the bottleneck in my life, except in the rare case where the only processing something is doing is translating between two different methods of serialization. The processing you're doing has to be trivial and serialization has to be incredibly expensive for serialization to be the bottleneck. I have not ever seen a case where serialization was slower than the disk or the network.
FediList is moving all of this data around (~100k reqs/hour) and a lot of it goes through *Ruby* and serialization/deserialization is still not the bottleneck.
> The point where using a program instead of a pipeline to have something barely complex complete within a reasonable amount of time comes pretty quickly.
I shove a 20GB log file through awk and don't have this problem. Fancy NVMe storage, ARM CPUs (and gcc still sucks at optimizing for ARM), and *still* everything is I/O-bound.
> Communication efficiency matters and serialized text pipes are not particularly efficient
You keep making these statements without qualifying or quantifying them, and they are broad, vague, useless.
> There is a reason many messaging libraries use shared memory as a transport when possible instead of local sockets.
There's a reason I usually dismiss this kind of statement out of hand.
There are a few reasons to use shared memory instead of pipes; the primary reason is superstition. Joe Armstrong noted that copying was faster than locking for most of the things he was doing, but that even when he benchmarked it, people would still tell him he was wrong. Superstition.
cmpxchg is faster than a syscall, sure, if you only look at it in isolation, but it depends on the shape of your workload. Odds are the syscall overhead is buried: either your pipeline is I/O-bound, or one process is more expensive than the others and the cheap ones sit with full buffers, completely idle (not even scheduled, rather than busy-waiting polling some shared memory) while the expensive one grinds. And because the expensive process is just eating the data and spitting it back out, it never has to wait to acquire a lock, and it never gets involuntarily preempted so that the other processes can cmpxchg and nanosleep.
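That blocking behavior is easy to see with a toy pipeline (just an illustration, not anything from the thread): the producer fills the pipe buffer, then blocks asleep in write(2) until the reader makes room.

```shell
# Toy demonstration of pipe backpressure: 'yes' fills the pipe buffer
# (~64KB on Linux) almost instantly, then blocks asleep in write(2),
# unscheduled and not spinning, until the reader wakes and drains it.
yes | { sleep 1; head -n 5; }
# prints "y" five times, roughly one second after starting
```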
But, look, you are certain of it: build it and get back to me.
> Only if you have not designed the compiler and library to provide an interoperable type usable from both Common Lisp and the hosted language natively without needing further transformations.
The perpetual lament: "this machine was built to run C and Unix; Lisp would be faster if the CPU and the OS and the compiler were designed from the ground up for Lisp". It's always asserted and never backed up with a working CPU design. We've got FPGAs all over now, you can do this at home: implement type-tagging in the MMU, go nuts. Build it and get back to me.
Chip designers disagree: http://www.yosefk.com/blog/the-high-level-cpu-challenge.html , https://yosefk.com/blog/high-level-cpu-follow-up.html .
But if we narrow it down to just a binary serialization scheme, then you're moving the overhead to the boundaries where the data enters and exits the system (and that's very generously assuming zero overhead on things like "pointers", so that it's just a memcpy and you can use the data structure as-is; I'll believe in that serialization format when I see it).