Author Topic: How to set/get information of encoding of compiled source files in project ?  (Read 7920 times)

Luke Matuszewski

  • Guest
I was wondering how to find out the encoding of the source files in my project. We all know that source files, e.g. for C or C++, should be written in the operating system's default encoding so that the compiler "decodes" the code properly. Thus:
- for Linux in Poland, the default encoding is ISO 8859-2 (or UTF-8 or another one, if so configured);
- for Windows in Poland, the default encoding is Windows-1250 (in the USA it is probably Windows-1252).
But we all know that the first 128 characters (codes 0 to 127) are the same in these encodings (Windows-125x, ISO 8859, and UTF-8), so all keywords of the standard languages are read properly. What I want to ask is:

1. What must I do to write the source files in my project in a Unicode encoding (and which Unicode encoding should I use with wxWidgets: UTF-16?)? I ask because I would like to put some constant strings in my source code that will be encoded in Unicode, in particular with wxWidgets using wxString (the _T() macro, the wxT() macro, or even the _() macro). Here is an example:

if ( string should be translated )
   use _("string")
else if ( string should be in Unicode in Unicode build )
   use wxT("string")
else
   just use "string" normally
// wxT()/_T() only adds the L prefix to the string, so it is treated as a wide-character string.
So in wxWidgets I have these macros that translate my strings (the _() macro) if they should be translated (a string will be translated in a non-Unicode build and NOT translated in a Unicode build, i.e. when I configure wxWidgets for a Unicode build).
My question is which Unicode encoding I should use when writing a wxWidgets project (if I should use one at all) and how it comes into play with Code::Blocks (does Code::Blocks support editing files in Unicode encodings, and which ones?).
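To make the wxT()/_T() behaviour described above concrete, here is a minimal sketch of a macro that adds the L prefix only in a wide-character build. DEMO_T, DEMO_UNICODE_BUILD, and DemoString are hypothetical names made up for this illustration; they are not the real wxWidgets definitions, which may differ in detail.

```cpp
#include <string>

// Hypothetical stand-in for wxT()/_T(); NOT the real wxWidgets macro.
// DEMO_UNICODE_BUILD is an assumed switch, e.g. passed as -DDEMO_UNICODE_BUILD.
#ifdef DEMO_UNICODE_BUILD
    // Paste an L in front of the literal: "hello" becomes L"hello".
    #define DEMO_T(s) L##s
    typedef std::wstring DemoString;
#else
    // Narrow build: the literal is left untouched.
    #define DEMO_T(s) s
    typedef std::string DemoString;
#endif

// The same source line compiles to a narrow or a wide string
// depending on the build switch.
DemoString greeting() {
    return DEMO_T("hello");
}
```

Either way greeting() holds five characters; only the character type (char or wchar_t) changes with the build.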

2. What about C++ compilers and their support for Unicode-encoded source files? I ask because only in Visual Studio can I choose the encoding of a file so that the compiler decodes the source file contents properly...

I assume here that Code::Blocks writes source files in the operating system's default encoding, so writing in that encoding is supported... but what about other encodings, e.g. for a project written for multiple platforms? With Unicode this problem would be handled, since Unicode (UTF-16) supports a wide range of languages (even dead ones).

(Also, UTF-8 is fully backward compatible with ASCII.)

I have also read that the spec says C/C++ code should be written in a basic character set similar to (but more restricted than) ASCII, but what about wide characters and the L placed before string literals in C/C++ code, like this:
... = L"someCode";

One way to use Unicode strings is to put \uXXXX escapes in the string, but that is completely unreadable... So how do I write strings (or, to be more correct, char arrays) that are human readable (e.g. by Japanese people, in Japanese) and are still "compiled" properly?
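For reference, this is roughly what the \uXXXX approach looks like: the escape names a code point, so the result does not depend on the encoding the source file was saved in. The helper names escapedLetter and toUtf8 are made up for this sketch, and toUtf8 assumes a valid code point (it does not reject surrogates or values above U+10FFFF).

```cpp
#include <cstdint>
#include <string>

// U+0105 (the Polish letter a-ogonek) written as an escape in a wide literal.
// The source file itself stays pure ASCII.
std::wstring escapedLetter() {
    return L"\u0105";
}

// Encode one Unicode code point as UTF-8 bytes.
// Assumes cp is a valid code point; no validation is done.
std::string toUtf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {                       // 1 byte, identical to ASCII
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

Here escapedLetter() holds a single wide character with the value 0x0105, and toUtf8(0x0105) gives the two bytes 0xC4 0x85. The price is exactly the readability problem mentioned above: nobody can read the text from the escapes alone.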

Help and answers are appreciated :).
Luke Matuszewski from Poland.

Offline rickg22

  • Lives here!
  • ****
  • Posts: 2283
You can take a look at the developers' forum; there's a full Unicode conversion thread.

wxT() and _() do nothing if compiled in ANSI mode. This ensures data consistency.