@icedquinn @Suiseiseki @wolf480pl @a1ba TBH diacritics are less of an issue - most string operations, such as word wrapping, can easily work on a per-codepoint basis - you just need to treat diacritics and other combining codepoints as if they were word characters.
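
Rough sketch of what I mean, in Python. The helper names `is_word_char` and `split_words` are made up for illustration, and "combining codepoint" is approximated here as "general category starts with M", which is an assumption and not a full Unicode word-break implementation:

```python
import unicodedata

def is_word_char(ch: str) -> bool:
    # Letters and digits count as word characters; so do combining marks
    # (general categories Mn, Mc, Me), so they stay glued to their base.
    return ch.isalnum() or unicodedata.category(ch).startswith("M")

def split_words(text: str) -> list[str]:
    # Walk the string one codepoint at a time and cut at non-word characters.
    words, current = [], []
    for ch in text:
        if is_word_char(ch):
            current.append(ch)
        elif current:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words

# "cafe" + U+0301 COMBINING ACUTE ACCENT stays in one piece.
print(split_words("cafe\u0301 au lait"))
```

A wrapper built on top of this only ever breaks between words, so it never separates a base character from its diacritic.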
And for stuff like line length computation, you need to take your font's differing per-character widths into account anyway.
What's really annoying is string comparison, as you now have to normalize both strings first...
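
For the comparison case, a minimal Python sketch using the stdlib `unicodedata.normalize`. The function name `nfc_equal` is just something I made up, and picking NFC is an assumption (NFD works equally well, as long as both sides use the same form):

```python
import unicodedata

def nfc_equal(a: str, b: str) -> bool:
    # Bring both strings into the same normalization form before comparing.
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

composed = "\u00e9"      # single codepoint: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

print(composed == decomposed)           # naive codepoint comparison
print(nfc_equal(composed, decomposed))  # comparison after normalization
```

Both strings render identically, but only the normalized comparison treats them as equal.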