@lucy I would like to be clear, because I shat a bunch of words onto the page and it's going to look argumentative: I don't think you're wrong in general; some of the specifics I disagree with. I do think some of what you want is wrong to try to get into C, but mainly I don't know what you want so I'm trying to figure that out.
:dmr: Dennis Ritchie had a proposal for variable-length arrays, but by the time he had the idea, it was already out of his hands and in the hands of the standards committee, and they hated everything he ever suggested.
> slices and bounds checking would be nice.
C doesn't have arrays; it has some syntax around pointers. If you try to give it "real" arrays, then the runtimes for all of these other dingus languages stop working, because sometimes what you want is a contiguous region of memory.
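To make "syntax around pointers" concrete, here is a minimal sketch. The names `sum` and `same_element` are made up for illustration. It shows that the callee only ever gets an address, that the length has to be passed separately, and that `p[i]` is just `*(p + i)`.

```c
#include <assert.h>
#include <stddef.h>

/* Sum n ints starting at p. The callee receives only an address;
   the length travels separately because the pointer carries none. */
int sum(const int *p, size_t n)
{
    assert(p != NULL);
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += p[i];          /* p[i] is defined as *(p + i) */
    return total;
}

/* Returns 1 if indexing and pointer arithmetic name the same object. */
int same_element(const int *p, size_t i)
{
    return &p[i] == p + i;
}
```

Given `int a[4] = {1, 2, 3, 4};`, calling `sum(a + 1, 2)` adds up the middle two elements: a "slice" here is just a different starting address plus a length you promise is right.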
Maybe you have a contiguous region of memory (framebuffer) and you don't know how big it is at compile-time: now there's a runtime library the compiler has to hook into. The way this problem is solved in C is that you get a segfault: the OS has built-in memory safety. Or say you've got to call a special instruction to get the dimensions of the framebuffer stored in a register: if you couldn't fudge types in C, you'd have to implement most of that in assembly instead of just one or two instructions.
C is the machine, though, and the machine doesn't have bounds-checking because it doesn't have bounds. Sometimes it has an MMU, sometimes it has ridiculous segment registers and an A20 line and stuff like that, sometimes it has memory-mapped I/O registers. If what you want is a "real" array, you probably don't want C for this task.
> Or, (de)referencing should be implicit in most cases.
This would be a disaster for cases where you care about that kind of thing; for cases where you don't care and are fine with it being implicit, you should probably not use C.
A nice thing about C is that you can translate what you've written into what it's doing: you see an `->` and you know that there's a read. This matters when it's timing-sensitive (say, an :terry: interrupt handler :terrylol2:) or time-sensitive (factoring large numbers into primes). Importantly, it also matters when you are debugging, because you can see a pointer is being read through. In a memory-safe language, you don't have that concern.
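A small sketch of what "you see an `->` and you know that there's a read" looks like in practice. `struct node` and `third_value` are hypothetical names, not from any real codebase.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* Each -> in the return expression is one read through a pointer,
   and every one of them is visible in the source. */
int third_value(const struct node *head)
{
    assert(head != NULL && head->next != NULL && head->next->next != NULL);
    return head->next->next->value;   /* three reads: next, next, value */
}
```

If that line faults in a debugger, there are exactly three places it could have gone wrong, and you can count them by looking.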
I had a bug a long time back: it was segfaulting...nowhere. Just initializing a variable that wasn't even a pointer. Pretty reliably, too: same function, same variable, every time. I blamed the compiler (gcc treats ARM like a second-class citizen still) and looked at the exact address. No arbitrary pointers being dereferenced, it was just pushing to the stack...just pushing to the stack. I followed the stack pointer around, expecting that it had been corrupted...somehow. Nope. I had linked against libSDL, this build of SDL linked against libpulse, the kernel defaulted to 16kB of stack per thread, libpulse was eating more than 16kB of stack just for having been initialized. (Giving it more stack via sysctl fixed it, telling SDL to use ALSA instead of Pulse fixed it, turning off sound fixed it, and that's what I eventually went with, because I didn't need sound.)
This was a pain in the ass, cost me half a day, and it was one of the very few implicit mechanics. And, you know, you add n implicit mechanics, you get n! complexity. Bounds-checking doesn't save you from this kind of thing, and worse, this is stuff that you've got to worry about *somewhere*: high-level languages are going to have problems like this, and there has to be a region where you can debug it, there has to be something between the bytecode VM's runtime and the machine. If it's not C, then it's just machine code. There has to be a language that is just the machine: no more, no less. If this stuff worms its way into C, then there has to be something under C.
> or when passing something that's supposed to be a pointer then just pass it as a pointer.
C++ has references in declarations and they are a nightmare to debug; you see codebases where the style guide forbids their use. `f(&x)` means you get to see right there that you're passing an address.
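A quick sketch of the call-site difference, with made-up function names `bump` and `bumped`. In C++ a function declared as `void bump(int &r)` gets called as `bump(x)`, which reads exactly like pass-by-value; in C the two cases can't look alike.

```c
#include <assert.h>
#include <stddef.h>

/* Takes an address: the caller must write &x, so the possibility of
   mutation is visible at the call site. */
void bump(int *p)
{
    assert(p != NULL);
    *p += 1;
}

/* Takes a copy: nothing the callee does can reach the caller's variable. */
int bumped(int v)
{
    return v + 1;
}
```

Reading `bump(&x);` you know `x` may change; reading `bumped(x)` you know it can't, and you never have to go look at the declaration to find out.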
But this is a feature list. What is "safety"? Page faults don't happen? That's what I'm trying to get at; I think the machine is safe already as long as you don't try to eat any of its components and it doesn't catch fire. Like, it has to mean something; if it means bounds-checking, that's a feature: you must mean that you are safe from running off the end of the array or safe from page-faults, something.