@white_male No, as the low byte of a UTF-16 code unit may be larger than 127, it may even be the NULL char (which truncates C character arrays), and UTF-16 characters may be 4 bytes wide.
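To make those three problems visible, here is a small Python sketch. The helper name `utf16_bytes` is made up for illustration, and little endian byte order is assumed so the low byte comes first.

```python
def utf16_bytes(text):
    """Encode text as UTF-16 little endian, without a byte order mark."""
    return text.encode("utf-16-le")

ascii_a = utf16_bytes("A")           # second byte is 0x00, which ends a C string early
e_acute = utf16_bytes("\u00e9")      # U+00E9: low byte is 0xE9, outside the ASCII range
emoji = utf16_bytes("\U0001F600")    # outside the BMP: a surrogate pair, 4 bytes

print(ascii_a, e_acute, len(emoji))
```

Any code that treats such a buffer as a plain C string would stop at the first zero byte, so even pure ASCII text breaks.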
Aside from a few exceptions like the byte order mark, every valid UTF-16 character sequence maps to a UTF-8 sequence, but you'll need to use something like GNU iconv to convert it.
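With GNU iconv the conversion is typically something like `iconv -f UTF-16 -t UTF-8`. As a rough equivalent, here is a minimal sketch using Python's built-in codecs instead of iconv. The function name `utf16_to_utf8` is hypothetical, and the input is assumed to be well-formed UTF-16 that starts with a byte order mark.

```python
def utf16_to_utf8(data):
    """Re-encode UTF-16 bytes as UTF-8.

    The "utf-16" codec reads a leading byte order mark to pick the byte
    order and drops it; without one it falls back to the platform's
    native byte order. Malformed input raises UnicodeDecodeError.
    """
    return data.decode("utf-16").encode("utf-8")

# Python's "utf-16" encoder writes a byte order mark first.
utf16_input = "h\u00e9llo".encode("utf-16")
print(utf16_to_utf8(utf16_input))
```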
Still, UTF-16 is a useless encoding: it almost always leads to a larger filesize than UTF-8 (even for books in Chinese characters, as book formats typically contain much more ASCII formatting than text, and ASCII characters double in size when encoded as UTF-16), it's still variable-width (2 or 4 bytes wide), it's not self-synchronizing, and it has big endian and little endian variants.
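The size and endianness points can be checked with a quick sketch. The markup sample and the `encoded_sizes` helper are invented for illustration. Real book files will differ, but the pattern of ASCII markup wrapped around CJK text is the case described above.

```python
def encoded_sizes(text):
    """Return the byte length of text in UTF-8 and in UTF-16 (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }

# Hypothetical snippet: ASCII markup around a short Chinese sentence.
sample = '<p class="line">\u4f60\u597d\uff0c\u4e16\u754c</p>'
print(encoded_sizes(sample))

# The same character has two different byte layouts depending on endianness.
big_endian = "A".encode("utf-16-be")
little_endian = "A".encode("utf-16-le")
print(big_endian, little_endian)
```

Each CJK character here costs 3 bytes in UTF-8 against 2 in UTF-16, but every ASCII character costs 1 against 2, so the markup decides the total.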