Author Topic: No OEM charset support (a proposed solution)  (Read 2931 times)

Offline alkisg

  • Single posting newcomer
  • Posts: 3
No OEM charset support (a proposed solution)
« on: January 06, 2008, 02:54:43 pm »
Hi,

There is a "Default encoding when opening files" combo box in the options, with which a user can select custom file encodings. It's a great feature, especially useful for Windows console programs.

Unfortunately, it lacks many charsets, such as OEM Greek (cp737), OEM Cyrillic (cp866) etc.
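To illustrate why the OEM and ANSI charsets do not mix, here is a minimal sketch that decodes cp866 bytes to UTF-8. It only handles ASCII and the Cyrillic letters; Cp866LettersToUtf8 and AppendUtf8TwoByte are hypothetical names made up for this example, not part of Code::Blocks or wxWidgets. For instance, the letter "А" is byte 0x80 in cp866 but 0xC0 in Windows-1251, so a file saved in one and read as the other shows the wrong characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Append code point cp (U+0080..U+07FF) to out as two UTF-8 bytes.
static void AppendUtf8TwoByte(std::string& out, unsigned cp)
{
    assert(cp >= 0x80 && cp < 0x800);
    out += static_cast<char>(0xC0 | (cp >> 6));
    out += static_cast<char>(0x80 | (cp & 0x3F));
}

// Hypothetical helper: decode cp866 text to UTF-8. Only ASCII and the
// Cyrillic letters are handled; any other byte (box drawing characters,
// symbols) makes the function return false.
bool Cp866LettersToUtf8(const std::string& src, std::string& dest)
{
    dest.clear();
    for (std::size_t i = 0; i < src.size(); ++i)
    {
        const unsigned b = static_cast<unsigned char>(src[i]);
        if (b < 0x80)
            dest += static_cast<char>(b);                  // plain ASCII
        else if (b <= 0xAF)
            AppendUtf8TwoByte(dest, 0x0410 + (b - 0x80));  // U+0410..U+043F
        else if (b >= 0xE0 && b <= 0xEF)
            AppendUtf8TwoByte(dest, 0x0440 + (b - 0xE0));  // U+0440..U+044F
        else if (b == 0xF0)
            AppendUtf8TwoByte(dest, 0x0401);               // capital IO
        else if (b == 0xF1)
            AppendUtf8TwoByte(dest, 0x0451);               // small io
        else
            return false;                                  // not covered by this sketch
    }
    return true;
}
```

Feeding it the six bytes 8F E0 A8 A2 A5 E2 (the word "Привет" in cp866) should give the UTF-8 encoding of the same word. A real implementation would need the full table, or the operating system's conversion functions.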

One way to support all OEM charsets on Windows (which is where OEM charsets are the problem) would be:
  • To include an "OEM" codepage in the aforementioned combo box, which would map to the user's DOS encoding, whatever that is (it depends on the Windows language).
  • To make a function bool OEMToANSI(char *src, char *dest); to be used when loading a file.
    This is easy on Windows: you just call MultiByteToWideChar() with CP_OEMCP as the code page, and then WideCharToMultiByte() with CP_ACP as the code page.
    bool => returns false if not all the characters can be mapped to the new codepage.
  • To make a function bool ANSIToOEM(char *src, char *dest); to be used when saving a file.
    Again, you just call MultiByteToWideChar() with CP_ACP as the code page and then WideCharToMultiByte() with CP_OEMCP as the code page.
  • To integrate these two functions into the existing open/save file logic, maybe with a #ifdef windows... to make it platform specific (since AFAIK Linux doesn't use different console charsets).

I haven't looked at the source code, but I think it will be easy enough: Code::Blocks already supports different charsets, and when a file cannot be saved in the current encoding, it already warns the user and automatically falls back to UTF-8.

I can implement these functions for you if you like and post them here, just tell me if the prototypes are OK.
(btw I think wxScintilla uses UTF-8 internally, so maybe the ANSI code page, CP_ACP, should be replaced by CP_UTF8).

Finally, the OEM charset should be selected automatically for console projects.

Thanks for your great IDE,
Alkis


P.S. existing workarounds for this problem that ... suck:
1) Call SetConsoleOutputCP() at program start:
=> it makes the code not portable
=> it requires users to change the console font to a TrueType one
=> it doesn't work in real DOS or in a full-screen DOS box.

2) Use gcc parameters -finput-charset and -fexec-charset:
=> they don't work with all compilers, not even with all gcc versions.

3) Use setlocale():
Same problems as (1).
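
For reference, workarounds (1) and (3) might look roughly like this inside a console program. This is just a sketch: SwitchConsoleToAnsi and UseLocale are hypothetical helper names, and passing GetACP() to SetConsoleOutputCP() is one possible choice of code page, assumed here because the source files are saved in the ANSI charset.

```cpp
#include <cassert>
#include <clocale>
#ifdef _WIN32
#include <windows.h>
#endif

// Workaround (1): make the console interpret program output using the ANSI
// code page instead of the OEM one. Windows only, hence not portable.
bool SwitchConsoleToAnsi()
{
#ifdef _WIN32
    return SetConsoleOutputCP(GetACP()) != 0;
#else
    return true; // nothing to do on other platforms
#endif
}

// Workaround (3): select a locale by name. Passing "" asks for the user's
// default locale; the call fails if that locale is not available.
bool UseLocale(const char* name)
{
    assert(name != nullptr);
    return std::setlocale(LC_ALL, name) != nullptr;
}
```

A program would call SwitchConsoleToAnsi() or UseLocale("") at the top of main(), which is exactly the kind of extra code the proposal above tries to make unnecessary.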
« Last Edit: January 06, 2008, 03:05:04 pm by alkisg »

Offline JGM

  • Lives here!
  • Posts: 518
  • Got to practice :)
Re: No OEM charset support (a proposed solution)
« Reply #1 on: January 06, 2008, 06:02:27 pm »
If I remember correctly, Visual Studio 6 had the same problem with console programs: it didn't display characters like "áéíóúñ" correctly. I think this would give Code::Blocks an edge over other IDEs. I write Spanish applications and have to mess with the compiler options, so this solution sounds really nice.

SuperSailorMoon

  • Guest
Re: No OEM charset support (a proposed solution)
« Reply #2 on: January 07, 2008, 01:53:44 pm »
I support alkisg's suggestion. Lead me to it!

Offline alkisg

  • Single posting newcomer
  • Posts: 3
Re: No OEM charset support (a proposed solution)
« Reply #3 on: January 10, 2008, 07:21:15 am »
If there is no interest in this from the developers, I guess we can also try to add the encoding we care about directly to wxWidgets:

How to add a new font encoding to wxWidgets:
http://fresh.t-systems-sfr.com/linux/misc/wxGTK-2.8.7.tar.gz:a/wxGTK-2.8.7/docs/tech/tn0018.txt