Linux won't edit Jonsson umlauted name

Pecan:
I cannot apply a patch to a file containing Jonsson's umlauted name, or edit
such a file with gedit or Code::Blocks, on Ubuntu Breezy.

why?

gedit says it cannot determine the encoding of as_config.h et al.
Code::Blocks shows an empty file, but shows a UTF-8 status.
vim doesn't give a damn.

I had to change (using vim) the umlauted 'o' to a plain English 'o' to edit the file.
Then the patches worked and everyone was happy.

what gives here?

thanks
pecan

EDIT: The Mac couldn't apply the patch either, but when I edited
the files by hand, it changed the umlaut to a '^' and showed ISO-8859-1 in the status bar.

Pecan:
What encoding should I set for the Code::Blocks editor on Linux in order to edit the AngelScript files containing an umlauted 'o'?

It defaults to utf-8, but the editor shows all blanks after the umlauted 'o'.

thanks
pecan

thomas:
Unfortunately, we don't have codepage/Unicode detection. It is quite a complicated matter.

The current version of Code::Blocks uses a hack to make things work for most people, and that is simply setting the encoding to "system default". Most of the time, that happens to be the correct encoding, and it works.

If you have an idea how to determine the document encoding efficiently (or maybe even have your own code to do that, or know of a free library), please step forward :)

After 1.0, we plan to implement something similar to how web browsers do automatic document encoding detection, i.e. build a histogram over the input file and do statistical matching.
I don't know whether this is terribly efficient, but it should be quite failsafe.

Any better idea?
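
For illustration only, here is a much simpler heuristic than the histogram matching described above: check for a byte-order mark, then try a strict UTF-8 decode, and otherwise fall back to a configured default. This is a sketch, not what Code::Blocks does and not Mozilla's detector. The function name `guess_encoding` and the ISO-8859-1 fallback are assumptions made for the example.

```python
import codecs


def guess_encoding(data: bytes, fallback: str = "iso8859-1") -> str:
    """Guess a text encoding for `data` (hypothetical helper, not a C::B API).

    Order of checks: UTF-8/UTF-16 byte-order mark, strict UTF-8 validity,
    then the caller-supplied fallback. No statistics are involved.
    """
    # A BOM, if present, is the strongest hint. Only the UTF-8 and UTF-16
    # marks are checked here.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"

    # Text in a legacy 8-bit encoding that contains non-ASCII bytes is
    # rarely valid UTF-8, so a clean strict decode is good evidence.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return fallback

    # Pure ASCII also lands here, which is fine: ASCII is a subset of UTF-8.
    return "utf-8"


print(guess_encoding("J\u00f6nsson".encode("utf-8")))
print(guess_encoding("J\u00f6nsson".encode("iso8859-1")))
```

The fallback cannot tell ISO-8859-1 apart from other single-byte encodings; that is the part where a statistical approach would still be needed.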

mandrav:

--- Quote from: Pecan on May 17, 2006, 02:15:04 pm ---What encoding should I set for the Code::Blocks editor on Linux in order to edit the AngelScript files containing an umlauted 'o'?

It defaults to utf-8, but the editor shows all blanks after the umlauted 'o'.

thanks
pecan

--- End quote ---

I've set it to iso8859-1 and I can edit the AngelScript files just fine...
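
A likely explanation, sketched in Python under the assumption that the AngelScript files are stored as ISO-8859-1: there the umlauted 'o' is the single byte 0xF6, which is not a valid byte in UTF-8, so a strict UTF-8 decode of the file fails while an ISO-8859-1 decode succeeds. The helper `try_decode` is made up for this example.

```python
# The umlauted name as an ISO-8859-1 file would store it: the umlauted 'o'
# is the single byte 0xF6.
raw = "J\u00f6nsson".encode("iso8859-1")


def try_decode(data: bytes, encoding: str):
    """Return the decoded text, or None if the bytes are invalid in `encoding`."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


as_utf8 = try_decode(raw, "utf-8")        # fails: 0xF6 cannot start a UTF-8 sequence
as_latin1 = try_decode(raw, "iso8859-1")  # succeeds: every byte maps to a character

print(raw)
print(as_utf8)
print(as_latin1)
```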

takeshimiya:

--- Quote from: thomas on May 17, 2006, 02:32:54 pm ---If you have an idea how to determine the document encoding efficiently (or maybe even have your own code to do that, or know of a free library), please step forward :)

--- End quote ---

Hope this helps :)
http://www.mozilla.org/projects/intl/
http://www.mozilla.org/projects/intl/UniversalCharsetDetection.html
http://www.mozilla.org/projects/intl/chardet.html
http://www.mozilla.org/projects/intl/ChardetInterface.htm

And finally, "How to build standalone universal charset detector from Mozilla source":
http://www.mozilla.org/projects/intl/detectorsrc.html
