Code::Blocks Forums
User forums => Help => Topic started by: mauser on January 01, 2006, 07:32:29 pm
-
I have the Unicode version of revision 1635. I write Win32 apps, and it seems that the source files are saved UTF-8 encoded, but MSVC2003 does not seem to understand UTF-8 source files. Is this a C::B feature? What can you advise? Thanks in advance.
-
Also, C::B didn't show anything when I tried to open a file with Russian characters in the cp1251 charset that had been edited in another editor. It behaved as if the file were simply empty.
-
MSVC2003 does not seem to understand UTF-8 source files. Is this a C::B feature? What can you advise? Thanks in advance.
What do you mean by "understand"? Is the problem with loading, displaying or compiling the sources?
Michael
-
UTF-8 is ascii so it shouldn't have any problems with it.
Now when you bump up to UTF-32 or UTF-16 that's when things get interesting.
-
MSVC seems to need a signature for Unicode source files. You can set this in the dialog "File->Extra save options" (or something similar to that). The required option is "UTF-8 with signature". If you save your file that way, MSVC recognizes that it is a Unicode source file and works with it without problems.
But: the signature consists of two or three bytes at the beginning of the file. Some editors show them (or at least some cryptic characters for them), some don't, and some won't even open such a file. Code::Blocks opens it (at least it did the last time I tried) and seems to have no problems with the signature (it doesn't even show it). But the compilers I tested (gcc and icc, both on Linux) refused to compile the file. They complain about invalid characters in the file. Unfortunately MSVC (the IDE as well as the compiler) seems to need this signature to handle Unicode source files properly.
The only solution I can give you here: don't use Unicode source files if you want to use them with MSVC and with other editors/compilers. In strings you can still use Unicode characters if you write their code instead of the character itself, i.e. write '\x00E4' instead of 'ä'.
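To see why some editors show "cryptic characters" for the signature while others hide it, it can help to look at the bytes directly. The following is only a small Python sketch, not what Code::Blocks, MSVC or gcc do internally; the sample source line and the choice of cp1252 as the "legacy" code page are assumptions made for illustration.

```python
import codecs

# A made-up one-line source file saved as "UTF-8 with signature".
data = codecs.BOM_UTF8 + "int main() { return 0; }\n".encode("utf-8")

def view(raw: bytes, encoding: str) -> str:
    """Decode raw bytes the way a tool assuming `encoding` would."""
    return raw.decode(encoding)

# A tool that knows about the signature drops it silently.
print(repr(view(data, "utf-8-sig")))
# A plain UTF-8 decoder keeps it as the invisible character U+FEFF.
print(repr(view(data, "utf-8")))
# A tool assuming a Western single-byte code page shows three odd characters.
print(repr(view(data, "cp1252")))
```

The same three bytes are present in every case; only the interpretation differs, which would be consistent with one tool hiding the signature and another complaining about stray characters before the first token.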
-
UTF-8 is ascii so it shouldn't have any problems with it.
Now when you bump up to UTF-32 or UTF-16 that's when things get interesting.
Well, UTF-8 is not ASCII; it is only ASCII-compatible in some ways.
And I think I just need to switch to GCC, which assumes UTF-8 input by default.
Thanks
-
ASCII is a subset of UTF-8. All symbols in ASCII are present in UTF-8 with the same value.
But UTF-8 is a multi-byte encoding, so a symbol may consist of 1, 2, 3 or even 4 bytes, so obviously ASCII doesn't contain all the symbols UTF-8 does.
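The subset relationship and the variable length can be checked in a few lines. This is a minimal Python sketch; the sample characters and the helper name `utf8_len` are arbitrary choices for illustration.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))

# U+0041 'A', U+00E4 (a with diaeresis), U+20AC (euro sign), U+1F600 (an emoji)
samples = ["A", "\u00e4", "\u20ac", "\U0001F600"]
for ch in samples:
    print(f"U+{ord(ch):04X} -> {utf8_len(ch)} byte(s): {ch.encode('utf-8').hex()}")

# Pure ASCII text produces identical bytes in both encodings.
ascii_text = "int main(void)"
print(ascii_text.encode("ascii") == ascii_text.encode("utf-8"))
```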
Now, more on topic: what "Der Meister" said is absolutely correct. Windows uses UTF-16 internally, therefore its support for UTF-8 is limited. Also, it expects a BOM (byte order mark) at the beginning of a Unicode file. All other text files are assumed to be (extended) ASCII.
Unix, on the other hand, doesn't expect a BOM, so the BOM bytes at the start of the file are interpreted as symbols.
You have 2 choices: either stick to ASCII like "Der Meister" suggested, or write a (very simple) program to quickly add or remove the signature (the three bytes 0xEF 0xBB 0xBF) to/from files.
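Such a "very simple" program could look roughly like the Python sketch below. The function names and the in-place rewrite are assumptions of this example, not an existing tool; it only handles the UTF-8 signature and does not touch UTF-16 or UTF-32 files.

```python
import codecs
from pathlib import Path

BOM = codecs.BOM_UTF8  # the three bytes EF BB BF

def add_bom(data: bytes) -> bytes:
    """Prepend the UTF-8 signature unless it is already there."""
    return data if data.startswith(BOM) else BOM + data

def strip_bom(data: bytes) -> bytes:
    """Remove a leading UTF-8 signature if present."""
    return data[len(BOM):] if data.startswith(BOM) else data

def convert_file(path: str, want_bom: bool) -> bool:
    """Rewrite `path` in place; return True if the file was changed."""
    p = Path(path)
    old = p.read_bytes()
    new = add_bom(old) if want_bom else strip_bom(old)
    if new != old:
        p.write_bytes(new)
    return new != old
```

For example, `convert_file("main.cpp", want_bom=True)` before opening a file in MSVC and `convert_file("main.cpp", want_bom=False)` before handing it to gcc (the file name is just a placeholder). Both helpers are idempotent, so running them twice does no harm.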
-
You should try with r1648. This should be fixed (for now).
-
?? why for now ??
otherwise this bug can be closed.
http://sourceforge.net/tracker/index.php?func=detail&aid=1384513&group_id=126998&atid=707416
-
?? why for now ??
otherwise this bug can be closed.
http://sourceforge.net/tracker/index.php?func=detail&aid=1384513&group_id=126998&atid=707416
Well, because I didn't add any code to handle "strange" encodings; I just asked it not to do any conversion on the charset. I believe it is fixed now, but I'll wait a while until more people have tested it.
-
I built it and tested on some as files and Scintilla's editor.cxx, and for those it worked. Let's hope there are no side effects. (WinXP SP2 system)
-
It works for me now with the files that previously didn't open.
We'll see what happens in the future; I don't care too much for now ...
-
BAD NEWS:
I just built on Linux (SUSE 10) and it seems the problem still occurs there. I tried it out on:
editor.cxx
as/* files
:-(
Lieven
-
BAD NEWS:
I just built on Linux (SUSE 10) and it seems the problem still occurs there. I tried it out on:
editor.cxx
as/* files
:-(
Lieven
It will be fixed now that I pinpointed the error :)
Just show some patience ;)
-
Just show some patience ;)
Heh, you should consider to add something like that as your signature now :P
-
Just show some patience ;)
Heh, you should consider to add something like that as your signature now :P
You know what? I just might ;)
-
Just show some patience ;)
Heh, you should consider to add something like that as your signature now :P
I agree !!!
LOL
-
I just did, lol :D
-
Don't forget:
It will be fixed now that I pinpointed the error ;)
-
You know what? I just might ;)
(http://www.smiliemania.de/smilie132/00000215.gif)