Source Code encoding in UNICODE version

mauser:

--- Quote from: anonuser on January 01, 2006, 07:54:05 pm ---UTF-8 is ascii so it shouldn't have any problems with it.
Now when you bump up to UTF-32 or UTF-16 that's when things get interesting.



--- End quote ---

Well, UTF-8 is not ASCII; it is only ASCII-compatible in some way.
And I think I just need to switch to GCC, which assumes UTF-8 input by default.
Thanks

Leviathan:
ASCII is a subset of UTF-8: every symbol in ASCII is present in UTF-8 with the same value.
But UTF-8 is a multi-byte encoding, so a symbol may consist of 1, 2, 3 or even 4 bytes; obviously ASCII doesn't contain all the symbols UTF-8 does.
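
To make the point about byte lengths concrete, here is a small Python sketch. The sample characters and the helper names (`utf8_length`, `same_in_ascii_and_utf8`) are only picked for illustration; they are not from anyone's code in this thread.

```python
# Illustrative only: show how many bytes a few characters take in UTF-8.
# The sample characters are arbitrary choices for this sketch.
SAMPLES = ["A", "\u00e9", "\u20ac", "\U0001F600"]  # A, e-acute, euro sign, emoji


def utf8_length(ch: str) -> int:
    """Return the number of bytes used to encode one character in UTF-8."""
    return len(ch.encode("utf-8"))


def same_in_ascii_and_utf8(text: str) -> bool:
    """True if the text is pure ASCII, in which case both encodings give identical bytes."""
    try:
        return text.encode("ascii") == text.encode("utf-8")
    except UnicodeEncodeError:
        # The text contains something outside ASCII, so ASCII cannot represent it.
        return False


for ch in SAMPLES:
    # Print the code point and the raw bytes in hex (kept ASCII-only on purpose).
    print(f"U+{ord(ch):04X}", utf8_length(ch), ch.encode("utf-8").hex())
```

Pure ASCII text such as `int main() {}` comes out byte-for-byte identical in both encodings, which is why plain ASCII sources are unaffected.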

Now, more on topic: What "Der Meister" said is absolutely correct. Windows uses UTF-16 internally, therefore its support for UTF-8 is limited. Also, it expects a BOM (byte order mark) at the beginning of a Unicode file. All other text files are assumed to be (extended) ASCII.
Unix, on the other hand, doesn't expect a BOM, so the first 3 bytes of such a file are interpreted as symbols.

You have 2 choices: Either stick to ASCII like "Der Meister" suggested, or write a (very simple) program to quickly add or remove the signature (0xEFBBBF) to/from files.
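
Such a program could look roughly like the Python sketch below. It assumes the files are UTF-8, whose signature is the byte sequence EF BB BF (available as `codecs.BOM_UTF8`); the function names and the `convert_file` helper are made up for this example.

```python
import codecs

# The UTF-8 signature: the three bytes EF BB BF.
BOM = codecs.BOM_UTF8


def strip_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 signature, if there is one."""
    return data[len(BOM):] if data.startswith(BOM) else data


def add_bom(data: bytes) -> bytes:
    """Return data with a leading UTF-8 signature, adding it only if missing."""
    return data if data.startswith(BOM) else BOM + data


def convert_file(path: str, with_bom: bool) -> bool:
    """Rewrite the file at path with or without the signature.

    Returns True if the file was changed. Hypothetical helper for this sketch;
    it works on raw bytes and never decodes the text.
    """
    with open(path, "rb") as f:
        original = f.read()
    converted = add_bom(original) if with_bom else strip_bom(original)
    if converted == original:
        return False
    with open(path, "wb") as f:
        f.write(converted)
    return True
```

For example, `convert_file("main.cpp", with_bom=False)` would strip the signature before handing the file to a tool that doesn't expect one, and `with_bom=True` would put it back. Both directions are safe to run twice, since nothing is added or removed when the file is already in the requested state.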

mandrav:
You should try with r1648. This should be fixed (for now).

killerbot:
?? why for now ??

otherwise this bug can be closed.

http://sourceforge.net/tracker/index.php?func=detail&aid=1384513&group_id=126998&atid=707416

mandrav:

--- Quote from: killerbot on January 02, 2006, 06:55:56 pm ---?? why for now ??

otherwise this bug can be closed.

http://sourceforge.net/tracker/index.php?func=detail&aid=1384513&group_id=126998&atid=707416

--- End quote ---

Well, because I didn't add any code to handle "strange" encodings; I just asked it not to do any conversion on the charset. I believe it is fixed now, but I'll wait a while until more people have tested it.
