Windows wants us to handle Unicode differently, but unless we want to allow Unicode filenames, we can ignore that.
There are two ways to switch from ISO Latin 1 to Unicode.
The first way is to use wide characters. We'd need to replace std::string with std::wstring or something similar (rumor has it that on some systems std::wstring does not exist, is identical to std::string, or uses 16-bit characters that are not wide enough). We'd have to replace most chars with the appropriate wide character type. We'd still have to convert to and from UTF-8 on input and output, because that's what the outside world expects. Memory usage will increase a lot: with 32-bit characters, text that fits in ISO Latin 1 today takes four times the space.
You'd think we'd get easier string handling algorithms out of it, because every token in the string would be one character, so the string length in tokens would also be the character length; but that is wrong. Accents can be handled by combining characters: ä can be represented as `a` followed by a token that says `apply " to the previous character`, and an arbitrary number of those can follow.
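To make the combining character problem concrete, here is a small sketch. The names `precomposed`, `decomposed` and `visibleLength` are made up for this note, and the check only covers the basic Combining Diacritical Marks block (U+0300 to U+036F), which is a deliberate simplification; Unicode has more combining ranges than that.

```cpp
#include <cstddef>
#include <string>

// Two encodings of the same visible character "ä", stored as 32-bit code points.
// U+00E4 is the precomposed letter; U+0061 U+0308 is 'a' followed by
// COMBINING DIAERESIS.
const std::u32string precomposed = U"\u00E4";
const std::u32string decomposed  = U"a\u0308";

// Hypothetical helper: counts code points that are NOT combining marks from
// the block U+0300..U+036F. Simplified on purpose; real code would need the
// full Unicode character database.
std::size_t visibleLength(const std::u32string& s)
{
    std::size_t n = 0;
    for (char32_t c : s)
        if (c < 0x0300 || c > 0x036F)
            ++n;
    return n;
}
```

Both strings render as one character, but `precomposed.size()` is 1 and `decomposed.size()` is 2, so even with wide characters the container length is not the character count.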
The second way is to use UTF-8 internally. It'll save tons of memory. "We'll have to adapt all string handling functions to cope with that! A nightmare!" I hear you scream. But think a bit: in how many places do you actually need to decode UTF-8? UTF-8 preserves ASCII in both directions: ASCII characters appear unchanged in a UTF-8 stream, and if you see an ASCII byte in a UTF-8 stream, it really is that ASCII character and never part of a longer sequence. So searches for "/", "\n" and ":", which is most of what we do, will continue to work as they should.
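As a sketch of why delimiter searches keep working, the following splits a UTF-8 string at an ASCII delimiter using nothing but byte-wise std::string operations. The function name `splitAscii` is an assumption for illustration, not something from our code base.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits a UTF-8 encoded string at every occurrence of
// an ASCII delimiter. It works on raw bytes. That is safe because every byte
// of a multi-byte UTF-8 sequence has its high bit set, so it can never
// compare equal to an ASCII delimiter.
std::vector<std::string> splitAscii(const std::string& utf8, char delimiter)
{
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    for (;;) {
        std::string::size_type pos = utf8.find(delimiter, start);
        if (pos == std::string::npos) {
            parts.push_back(utf8.substr(start)); // last piece
            return parts;
        }
        parts.push_back(utf8.substr(start, pos - start));
        start = pos + 1; // skip the delimiter itself
    }
}
```

For example, `splitAscii("caf\xC3\xA9/\xC3\xA4.txt", '/')` yields the two pieces `café` and `ä.txt` with their multi-byte sequences intact.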
There are two places where we need to be aware of UTF-8. The obvious one is the font rendering; it'll need to decode the stream. The other one is string formatting; only there do we need to know the exact number of characters, not bytes, that a string has up to a certain point, so we can align the score table and the like. Luckily, we already have embedded color codes that pose exactly the same problem, so formatting already goes through half a handful of functions. We'll only have to adapt those.
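What the formatting functions would need is roughly this: count characters instead of bytes. Below is a minimal sketch assuming valid UTF-8 input. The names `utf8Length` and `padRight` are hypothetical, and the sketch ignores both our embedded color codes and combining characters, which the real functions would have to skip as well.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: number of code points in a UTF-8 string. Every code
// point has exactly one byte that is not a continuation byte, and
// continuation bytes have the bit pattern 10xxxxxx. Assumes valid UTF-8.
std::size_t utf8Length(const std::string& utf8)
{
    std::size_t n = 0;
    for (unsigned char byte : utf8)
        if ((byte & 0xC0) != 0x80)
            ++n;
    return n;
}

// Hypothetical helper: pads with spaces on the right until the string is
// `width` characters (not bytes) long. Longer strings are returned unchanged.
std::string padRight(const std::string& utf8, std::size_t width)
{
    std::size_t len = utf8Length(utf8);
    if (len >= width)
        return utf8;
    return utf8 + std::string(width - len, ' ');
}
```

With this, a player name like `Jürgen` (7 bytes, 6 characters) lines up in a table column the same way a 6-letter ASCII name does.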
I guess the choice is obvious.

This plan also means that casts from xmlChar * to char * to std::string are safe. I came to the conclusions above while searching for a good way to do that conversion safely.
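A sketch of what that conversion could look like: libxml2 defines xmlChar as unsigned char and hands out UTF-8, so with UTF-8 as our internal encoding the bytes can be copied as they are. The helper name `toStdString` is an assumption, and the typedef below is only a stand-in so the snippet compiles on its own; real code would include `<libxml/xmlstring.h>` instead.

```cpp
#include <string>

// Stand-in for libxml2's own definition, so this sketch is self-contained.
// In a real build, include <libxml/xmlstring.h> and drop this line.
typedef unsigned char xmlChar;

// Hypothetical helper: copies a NUL-terminated libxml2 string into a
// std::string. No transcoding happens; the UTF-8 bytes are taken verbatim.
// A null pointer gives an empty string.
std::string toStdString(const xmlChar* text)
{
    if (!text)
        return std::string();
    return std::string(reinterpret_cast<const char*>(text));
}
```

The cast only reinterprets the pointer type; whether the result is meaningful depends entirely on the rest of the code treating std::string contents as UTF-8.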
