ISO-8859-1 detection problems - HELP!

thomas:
I've been wondering for quite some time now whether we should not just say "everyone uses UTF-8 and fuck the rest". There is so much pain involved in encodings, it never seems to work right for anyone (including me), and seeing how it is truly a harsh task for the machine to guess right in many cases, it probably won't ever work either.
So basically, one could still import whatever encodings there may be, but only ever create/edit/save in UTF-8.
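A minimal sketch of that "import anything, save only UTF-8" policy, in Python. The function name `to_utf8` and the ISO-8859-1 fallback are assumptions chosen for illustration, not anything Code::Blocks actually does:

```python
def to_utf8(raw: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Import bytes in a legacy encoding and return them re-encoded as UTF-8."""
    try:
        # Strict UTF-8 first; "utf-8-sig" also drops a leading BOM if present.
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a character, so this never fails.
        # That makes it a last-resort guess, not a detection.
        text = raw.decode(fallback)
    return text.encode("utf-8")


if __name__ == "__main__":
    print(to_utf8(b"caf\xe9"))      # Latin-1 input
    print(to_utf8(b"caf\xc3\xa9"))  # already UTF-8, passes through unchanged
```

The weak spot is exactly the guessing step: the fallback cannot tell ISO-8859-1 apart from any other single-byte encoding, so a wrong guess is silently baked into the saved file.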

Yes, I'm aware that this will not work for people who are on projects with people who don't get their editors right, in the same way as exclusively using tabs (which I still favour as an idea) will cause trouble for people in projects with others who don't know the difference between the tab key and the space bar or people who use text editors from 1976.
One could at least consider making such a behaviour configurable (and, if it were up to me, enabled by default).

UTF-8 admittedly does not "truly work" for anyone but native English speakers (insofar as everything outside ASCII needs multi-byte sequences), but on the other hand it works surprisingly well with very little overhead for 90% of the world, and it works acceptably for the remaining 10%. It's well supported on the compiler side too[1].

Sure enough, if you write your thesis in Traditional Chinese, then UTF-8 will not be the best possible pick, but if you use Code::Blocks for that, you're using the wrong tool anyway. On the other hand, source code is still "mostly ASCII" even if you're Chinese, so the impact is not that bad.
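To put rough numbers on the overhead argument, here is a small Python sketch that compares encoded sizes. The sample strings and the helper name `encoded_size` are made up for illustration:

```python
def encoded_size(text: str, encoding: str) -> int:
    """Number of bytes `text` occupies in the given encoding."""
    return len(text.encode(encoding))


if __name__ == "__main__":
    samples = [
        ("identifier", "count", "ascii"),
        ("umlaut", "ü", "iso-8859-1"),
        ("Chinese", "中", "big5"),
    ]
    for label, text, legacy in samples:
        # Columns: label, size in UTF-8, size in the legacy encoding
        print(label, encoded_size(text, "utf-8"), encoded_size(text, legacy))
```

Plain identifiers cost the same in every ASCII-compatible encoding, an accented Latin letter goes from one byte to two, and a Chinese character goes from two bytes in Big5 to three in UTF-8.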

[1] In fact, speaking of encodings, has anyone ever wondered whether it's necessary to pass -finput-charset matching each file's encoding, since gcc assumes UTF-8 otherwise? I've never noticed anything, since I rarely have a non-English character in a source file and use UTF-8 anyway, but technically we're compiling all sources wrong by default...
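For what it's worth, passing the flag per file could look roughly like the Python sketch below. `-finput-charset` is a real GCC option; the wrapper `gcc_command` and the idea that the IDE knows each file's encoding are purely hypothetical:

```python
def gcc_command(source: str, encoding: str = "UTF-8") -> list:
    """Build a compile command that tells gcc the source file's encoding."""
    # Hypothetical helper: the encoding would come from the editor's
    # per-file knowledge, which is exactly the part that is hard to get right.
    return ["gcc", f"-finput-charset={encoding}", "-c", source]


if __name__ == "__main__":
    print(" ".join(gcc_command("main.c", "ISO-8859-1")))
```

Files that are already UTF-8 or pure ASCII would not need the flag at all under the assumption stated above.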

Jenna:

--- Quote from: MortenMacFly on August 20, 2011, 10:04:17 am ---
--- Quote from: jens on August 20, 2011, 09:07:52 am ---But before working on it, I would like to see the opinion of other devs and users.

--- End quote ---
I wonder how this is handled in other IDEs (CodeLite/VS for example). Does anyone know?
I don't recall ever seeing that kind of flag in project files, so there might be a "smarter" way.
Rick: How would you do this in VS?

--- End quote ---
CodeLite opens Rick's example correctly, but a Chinese (non-UTF-8) text is not recognized.
Such a flag would be "hidden" in the file's properties, so most users would normally not be bothered by it, but in some special cases (such as Rick's example) the encoding can be forced without breaking everything else.

rickg22:

--- Quote from: thomas on August 20, 2011, 11:11:43 am ---I've been wondering for quite some time now whether we should not just say "everyone uses UTF-8 and fuck the rest". There is so much pain involved in encodings, it never seems to work right for anyone (including me), and seeing how it is truly a harsh task for the machine to guess right in many cases, it probably won't ever work either.
So basically, one could still import whatever encodings there may be, but only ever create/edit/save in UTF-8.
--- End quote ---

The problem is that this is a non-portable solution. Basically, the project I'm working on is an ASP project (IIS) on a Windows machine. As much as I wish every platform supported UTF-8 natively, I think Microsoft IIS will be a viable platform for at least a decade. And, as we know, IIS defecates on UTF-8.

eranif:

--- Quote ---I wonder how this is handled in other IDEs (CodeLite/VS for example). Does anyone know?
--- End quote ---
Well, CodeLite does not do anything special. By default all files are opened as ISO-8859-1. The user can set the encoding at the IDE level (NOT per file).

The reason I chose ISO-8859-1 as the default and NOT UTF-8 is that saving a file with UTF-8 encoding is something like 10x slower under Linux.

The only "smart" thing CodeLite does is handle the BOM correctly.
Eran
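
For readers wondering what "handling the BOM" amounts to, here is a rough Python sketch. It is not CodeLite's code; the function names are invented, and the ISO-8859-1 default only mirrors the default described above:

```python
import codecs

# Longest BOMs first: the UTF-32 LE BOM (FF FE 00 00) starts with the
# UTF-16 LE BOM (FF FE), so checking UTF-16 first would misdetect it.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(raw: bytes):
    """Return (encoding, bom_length), or (None, 0) when there is no BOM."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name, len(bom)
    return None, 0


def decode_with_bom(raw: bytes, default: str = "iso-8859-1") -> str:
    """Decode using the BOM if present, else the configured default."""
    name, skip = sniff_bom(raw)
    return raw[skip:].decode(name or default)
```

A BOM is the one case where no guessing is needed; everything without one falls through to the IDE-wide default.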
