@b0rk I don't have any useful links other than the one you probably looked at first:
https://en.wikipedia.org/wiki/Byte
I remember (but can't find) a list of alternate historical sizes. My favorite was the 13-bit "baker's byte".
Just-so stories I can't back up with sources:
A power of two is convenient as a memory size. It takes three bits to address the bits in an 8-bit byte, and all eight index values get used. It would take four bits to address the bits in a 10-bit byte, and six of the sixteen possible index values would go to waste.
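A quick sketch of that arithmetic, in case it helps. The function names are mine, made up for illustration, and this only counts index values; it says nothing about how any real machine addressed bits.

```python
def bit_index_width(byte_size: int) -> int:
    """Bits needed to give every bit position in a byte its own index."""
    # The largest index is byte_size - 1; bit_length() is how many bits it needs.
    return (byte_size - 1).bit_length()


def unused_indices(byte_size: int) -> int:
    """Index values that fit in that width but point at no actual bit."""
    return 2 ** bit_index_width(byte_size) - byte_size


for size in (8, 10, 13):
    print(f"{size}-bit byte: {bit_index_width(size)} index bits, "
          f"{unused_indices(size)} unused index values")
```

Only the power-of-two size comes out with nothing unused.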
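If you want a character set that includes upper- and lower-case English letters, digits, and some punctuation marks, you're going to need at least 7 bits: the letters and digits alone are 62 characters, which leaves room for only two more in 6 bits. I believe the 8th bit was originally used for error detection, as a parity bit.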
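Here is the character count worked out, plus one way an 8th bit can do error detection. Assumptions on my part: I'm using Python's `string.punctuation` as a stand-in for "some punctuation marks", and I picked even parity for the example. `with_even_parity` is a made-up helper, not a claim about what any particular system did.

```python
import string

letters_and_digits = len(string.ascii_letters) + len(string.digits)
punctuation = len(string.punctuation)  # stand-in for "some punctuation marks"
needed = letters_and_digits + punctuation

# Smallest number of bits whose code space holds `needed` characters.
code_bits = (needed - 1).bit_length()
print(f"{needed} characters need {code_bits} bits")


def with_even_parity(code7: int) -> int:
    """Put a parity bit in bit 7 so the 8-bit value has an even number of 1s."""
    parity = bin(code7).count("1") % 2
    return (parity << 7) | code7


print(hex(with_even_parity(ord("C"))))
```

If any single bit flips in transit, the count of 1s turns odd and the receiver can tell something went wrong, though not which bit.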