Topic: TEXT macro and International text problem

mauser

  • Guest
TEXT macro and International text problem
« on: December 26, 2005, 09:09:11 am »
Hi, I get the following error message:

Compiling: main.c
main.c:67:1: converting to execution character set: Illegal byte sequence
main.c:67:1: converting to execution character set: Illegal byte sequence
Process terminated with status 1 (0 minutes, 0 seconds)

while compiling this portion of code:

    if(!RegisterClassEx(&WndClassEx))
    {
        MessageBox(0, TEXT("Ошибка в регистрации !"), TEXT("Ошибка!"),
                   MB_ICONEXCLAMATION | MB_OK);
        return -1;
    }

The text inside the TEXT macro is in Russian; the same problem may exist with other non-English languages.

I changed the text inside the TEXT macro to English and everything went OK. The _UNICODE and UNICODE preprocessor symbols are defined.
I also tried compiling it with both GCC and MSVC 2003 and got the same error. I think it may be some kind of bug in C::B. Has anybody run into such a problem with the TEXT macro?

I use the full installation of C::B 1rc2 on a Windows XP Pro SP1 box.

Thanks in advance...

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
Re: TEXT macro and International text problem
« Reply #1 on: December 26, 2005, 01:17:17 pm »
The RC2 version of Code::Blocks is ANSI, not Unicode. I am surprised that you are able to edit Russian text with that version at all. Unicode support is still not perfect and is being worked on in the development version.

Regarding the TEXT macro, there is no way a bug in Code::Blocks could interfere with it.
"We should forget about small efficiencies, say about 97% of the time: Premature quotation is the root of public humiliation."

mauser

  • Guest
Re: TEXT macro and International text problem
« Reply #2 on: December 26, 2005, 05:35:28 pm »
Thanks, I see. I don't know how the C::B internals work, so I just thought it might be some kind of byte-order bug in the parser. Sorry for calling it a bug :)

Offline thomas

  • Administrator
Re: TEXT macro and International text problem
« Reply #3 on: December 26, 2005, 06:57:21 pm »
When UNICODE is defined, the TEXT() macro does nothing more than prepend L to its argument (exactly like _T() or __T()).
Thus, the preprocessor replaces TEXT("evil russian characters") with L"not-so-evil russian characters", so the compiler knows that it needs to use a wider character type (usually 16 bits, sometimes 32 bits). The IDE is not involved in this at all (with the exception of the text editor, obviously).
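
To illustrate the expansion described above, here is a minimal sketch in C. It deliberately avoids <windows.h>: DEMO_UNICODE, DEMO_TEXT and demo_tchar are hypothetical stand-ins for UNICODE, TEXT() and TCHAR, not the real definitions from the Windows headers.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for the UNICODE switch; set to 0 to get narrow text. */
#define DEMO_UNICODE 1

#if DEMO_UNICODE
typedef wchar_t demo_tchar;            /* plays the role of TCHAR */
#define DEMO_TEXT_PASTE(s) L##s        /* glue an L prefix onto the literal */
#else
typedef char demo_tchar;
#define DEMO_TEXT_PASTE(s) s           /* leave the literal untouched */
#endif

/* Second macro level, so an argument that is itself a macro is expanded first. */
#define DEMO_TEXT(s) DEMO_TEXT_PASTE(s)

/* Expands to L"Error!" when DEMO_UNICODE is 1, and to "Error!" otherwise. */
static const demo_tchar kTitle[] = DEMO_TEXT("Error!");

/* Size in bytes of one character of a DEMO_TEXT() literal. */
size_t demo_char_size(void)
{
    return sizeof kTitle[0];
}

/* Number of characters in kTitle, not counting the terminator. */
size_t demo_title_length(void)
{
    return sizeof kTitle / sizeof kTitle[0] - 1;
}
```

With DEMO_UNICODE set to 1, demo_char_size() equals sizeof(wchar_t) (2 bytes with Windows compilers, typically 4 on Linux). Set to 0, it drops to 1 because the literal stays a plain char string.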

krysa

  • Guest
Re: TEXT macro and International text problem
« Reply #4 on: December 26, 2005, 11:37:32 pm »
Note that most compilers require sources to be written in ANSI. Some compilers support Unicode sources, but there may be other requirements too. Before doing something like that, you should think about it very carefully: the source will not compile on other platforms/compilers. If you're using the Microsoft compiler (like the free toolkit), then you can find more info about all this here. I don't know about the other compilers.

Anyway, you can still have non-Latin Unicode characters inside string literals: you just have to use hex escape codes. I am from Lithuania, and I ran into this problem with non-Latin characters too. The best source for all the characters is, of course, the unicode.org site, where you can download all the character tables in PDF format. Here's the PDF file for Cyrillic characters. So, this code should work for you:
Code
if(!RegisterClassEx(&WndClassEx))
{
#ifdef UNICODE
    MessageBoxW (0, L"\x041E\x0448\x0438\x0431\x043A\x0430 \x0432 \x0440\x0435\x0433\x0438\x0441\x0442\x0440\x0430\x0446\x0438\x0438 !", L"\x041E\x0448\x0438\x0431\x043A\x0430!", MB_ICONEXCLAMATION | MB_OK); // <= Unicode character codes can be used here
#else
    MessageBoxA (0, "Error in registration !", "Error!", MB_ICONEXCLAMATION | MB_OK); // <= non ANSI characters won't work here
#endif
    return -1;
}
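
As a possible alternative to the hex escapes in the code above, C99 universal character names (\uXXXX) can spell the same characters. The sketch below assumes a C99-capable compiler; the names kErrorHex, kErrorUcn and the helper functions are made up for illustration. It also shows one pitfall: a \x escape absorbs every hex digit that follows it, while \u always takes exactly four.

```c
#include <stddef.h>
#include <wchar.h>

/* The Russian word for "Error" plus '!', spelled with hex escapes as in the post. */
static const wchar_t kErrorHex[] = L"\x041E\x0448\x0438\x0431\x043A\x0430!";

/* The same text spelled with C99 universal character names. */
static const wchar_t kErrorUcn[] = L"\u041E\u0448\u0438\u0431\u043A\u0430!";

/* Pitfall: 'b' is a hex digit, so \x0430b is ONE character (0x430B). */
static const wchar_t kGreedy[] = L"\x0430b";

/* \u takes exactly four digits, so this is TWO characters: U+0430, then 'b'. */
static const wchar_t kFixed[] = L"\u0430b";

/* Returns 1 if both spellings produce identical wide strings, else 0. */
int same_text(void)
{
    return wcscmp(kErrorHex, kErrorUcn) == 0;
}

/* Length in characters of the hex-escaped message. */
size_t error_text_length(void)
{
    return wcslen(kErrorHex);
}

/* Lengths of the two pitfall strings, for comparison. */
size_t greedy_length(void) { return wcslen(kGreedy); }
size_t fixed_length(void)  { return wcslen(kFixed); }
```

If a hex-escaped character really has to be followed by a letter from a to f or a digit, splitting the literal (L"\x0430" L"b") keeps the two apart.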

Offline thomas

  • Administrator
Re: TEXT macro and International text problem
« Reply #5 on: December 27, 2005, 09:31:01 am »
Quote from: krysa
Note that most compilers require sources to be written in ANSI. Some compilers support Unicode sources, but there may be other requirements too.

gcc accepts UTF-8 by default; in addition, you can set an arbitrary charset with -finput-charset (for the source file) and -fexec-charset (for string literals): http://gcc.gnu.org/onlinedocs/gcc-3.4.4/gcc/Preprocessor-Options.html#index-fexec_002dcharset-510
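
A hedged guess at what happened in the first post: the error message is consistent with main.c being saved in a single-byte ANSI code page (for Russian, probably Windows-1251) while GCC tried to decode it as UTF-8. The sketch below is one way to inspect which bytes actually end up in a narrow literal. kMessage, count_high_bytes and message_size are illustrative names, and the CP1251 charset name in the comment is an assumption that depends on the iconv library GCC was built with.

```c
#include <stddef.h>

/*
 * Possible build line for an ANSI-encoded source file (assumption, untested
 * against the original project):
 *     gcc -finput-charset=CP1251 -fexec-charset=CP1251 main.c
 */

/* A narrow literal containing Cyrillic text. The bytes the compiler stores
   for it depend on the source and execution character sets. */
static const char kMessage[] = "Ошибка";

/* Counts the bytes in s that lie outside 7-bit ASCII. */
size_t count_high_bytes(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if ((unsigned char)*s >= 0x80) {
            ++n;
        }
    }
    return n;
}

/* Bytes stored for kMessage, not counting the terminator. */
size_t message_size(void)
{
    return sizeof kMessage - 1;
}
```

If this file is saved as UTF-8 and built with GCC's defaults, message_size() should be 12 (two bytes per Cyrillic letter); with both charsets set to CP1251 it should be 6.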