Source Code encoding in UNICODE version
mauser:
--- Quote from: anonuser on January 01, 2006, 07:54:05 pm ---UTF-8 is ascii so it shouldn't have any problems with it.
Now when you bump up to UTF-32 or UTF-16 that's when things get interesting.
--- End quote ---
Well, UTF-8 is not ASCII; it is only ASCII-compatible.
And I think I just need to switch to GCC, which assumes UTF-8 input by default.
Thanks
Leviathan:
ASCII is a subset of UTF-8. All symbols in ASCII are present in UTF-8 with the same value.
But UTF-8 is a multi-byte encoding, so a symbol may consist of 1, 2, 3 or even 4 bytes, so obviously ASCII doesn't contain all the symbols UTF-8 does.
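To make the byte counts concrete, here is a small Python sketch (not part of the original discussion; the sample characters are my own choice) that encodes a few code points as UTF-8 and records how many bytes each one takes. The characters are written as escapes so the script itself stays pure ASCII.

```python
# Sample code points, written as escapes so this file is plain ASCII:
# U+0041 'A', U+00E9 (e with acute), U+20AC (euro sign), U+1F600 (emoji).
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# Map each character to the number of bytes in its UTF-8 encoding.
lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, n in lengths.items():
    print(f"U+{ord(ch):04X} takes {n} byte(s) in UTF-8")

# An ASCII character keeps the same single-byte value in UTF-8.
ascii_matches = "A".encode("utf-8") == "A".encode("ascii")
```

The last line checks the subset property for one character: the ASCII and UTF-8 encodings of "A" are the same single byte.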
Now, more on topic: What "Der Meister" said is absolutely correct. Windows uses UTF-16 internally, therefore its support for UTF-8 is limited. Also, its tools typically expect a BOM (byte order mark) at the beginning of a Unicode file. All other text files are assumed to be (extended) ASCII.
Unix, on the other hand, doesn't expect a BOM, so the 3 bytes of a UTF-8 BOM at the start of a file are interpreted as ordinary symbols.
You have 2 choices: Either stick to ASCII like "Der Meister" suggested, or write a (very simple) program to quickly add or remove the UTF-8 signature (the bytes 0xEF 0xBB 0xBF) to/from files.
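A minimal sketch of such a program in Python, assuming the files are UTF-8 and small enough to read into memory. The UTF-8 signature is the three bytes EF BB BF, which Python exposes as `codecs.BOM_UTF8`. The function names `add_bom`, `remove_bom` and `rewrite_file` are made up for this example.

```python
import codecs

# The UTF-8 byte order mark: b"\xef\xbb\xbf".
BOM = codecs.BOM_UTF8


def add_bom(data: bytes) -> bytes:
    """Return data with a UTF-8 BOM in front, unless it already has one."""
    return data if data.startswith(BOM) else BOM + data


def remove_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if one is present."""
    return data[len(BOM):] if data.startswith(BOM) else data


def rewrite_file(path: str, transform) -> None:
    """Apply add_bom or remove_bom to a file in place (hypothetical helper)."""
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(transform(data))
```

For example, `rewrite_file("main.cpp", remove_bom)` would strip the signature before handing the file to a Unix tool, and `rewrite_file("main.cpp", add_bom)` would put it back. Both functions leave a file unchanged if it is already in the requested state.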
mandrav:
You should try with r1648. This should be fixed (for now).
killerbot:
?? why for now ??
otherwise this bug can be closed.
http://sourceforge.net/tracker/index.php?func=detail&aid=1384513&group_id=126998&atid=707416
mandrav:
--- Quote from: killerbot on January 02, 2006, 06:55:56 pm ---?? why for now ??
otherwise this bug can be closed.
http://sourceforge.net/tracker/index.php?func=detail&aid=1384513&group_id=126998&atid=707416
--- End quote ---
Well, because I didn't add any code to handle "strange" encodings, I just asked not to do any conversion on the charset. I believe it is fixed now, but I'll wait a while until more people have tested it.