If I understood correctly, the encoding is not saved in the file, so C::B tries to guess the right one when the file is opened. So when you added the latin-2 encoding, some latin-1 files were detected as latin-2, and some characters are displayed wrongly.
So I changed the option Settings >> Editor >> General Settings >> Use This Encoding to "As default encoding", so that C::B's auto-detection is bypassed, and this solved my problem.
If this is the situation, I think that your idea of the configuration-option is the easiest one: I'm afraid that detecting latin-1 from latin-2 is not easy...
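To illustrate why latin-1 and latin-2 are hard to tell apart, here is a small Python sketch (not C::B code; the helper name `decodes_cleanly` is made up for this example). Every byte value is valid in both encodings, so decoding never fails; only the resulting characters differ.

```python
def decodes_cleanly(data: bytes, encoding: str) -> bool:
    """Return True if `data` can be decoded with `encoding` without an error."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# "caf" followed by the single byte 0xE8.
raw = bytes([0x63, 0x61, 0x66, 0xE8])

# Both decodings succeed; they just produce different characters for 0xE8.
as_latin1 = raw.decode("iso-8859-1")  # 0xE8 is U+00E8 (e with grave)
as_latin2 = raw.decode("iso-8859-2")  # 0xE8 is U+010D (c with caron)

# Even a buffer containing all 256 byte values is "valid" in both encodings.
all_bytes = bytes(range(256))
both_valid = (decodes_cleanly(all_bytes, "iso-8859-1")
              and decodes_cleanly(all_bytes, "iso-8859-2"))
```

Since validity gives no signal, a detector can only fall back on character statistics, which is where the wrong guesses come from.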
A file's encoding is never saved in the file itself. At least I'm unaware of any widely popular method of storing encoding information in a file. The reason is pretty simple: one would need to strip that data (the encoding detection data) before feeding the file to another program, or the other program would have to know how to strip it.
Most of the known software performs encoding detection by sampling data from the file. There is the BOM for UTF-encoded files. But for other encodings, sampling and measuring the frequency of encoded characters is the only way to detect the encoding. Mozilla's encoding detection is an example of this.
For a large project whose files use a less popular encoding scheme, this detection will surely degrade Code::Blocks' performance. This was precisely the reason Yiannis objected (long ago) to the inclusion of Mozilla's encoding detection code in trunk. IMHO it should only be turned on as an option.
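As a rough illustration of the BOM part, here is a minimal Python sketch (not the C::B implementation; `detect_bom` is a name invented for this example). It uses the BOM constants from Python's standard `codecs` module.

```python
import codecs

# Longer BOMs are checked first: the UTF-32 LE BOM (FF FE 00 00)
# starts with the same two bytes as the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return (encoding name, BOM length), or (None, 0) if there is no BOM."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0
```

A file without a BOM returns `(None, 0)`, and that is the case where sampling or a fallback encoding has to take over.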
IMHO the solution should be to give the user three options:
1) Use a simple encoding detection scheme (to detect most of the popular encodings).
2) Use Mozilla's code to detect encoding (we should make it very clear that this option may affect performance in some cases).
3) Don't detect encoding.
And for each of the above options, the Fallback Encoding choices shall be:
a) Use System encoding.
b) Use User-provided encoding.
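To make the proposal concrete, the three detection options and two fallback choices could be modelled roughly like this. This is only a Python sketch of the idea; all names (`DetectionMode`, `Fallback`, `EncodingSettings`, `choose_encoding`) are invented here and do not correspond to existing C::B settings.

```python
import locale
from dataclasses import dataclass
from enum import Enum


class DetectionMode(Enum):
    SIMPLE = 1   # option 1: cheap detection of the popular encodings
    MOZILLA = 2  # option 2: Mozilla's detector (may be slower)
    NONE = 3     # option 3: no detection at all


class Fallback(Enum):
    SYSTEM = "system"  # a) use the system encoding
    USER = "user"      # b) use the user-provided encoding


@dataclass
class EncodingSettings:
    mode: DetectionMode = DetectionMode.SIMPLE
    fallback: Fallback = Fallback.SYSTEM
    user_encoding: str = "iso-8859-1"  # only used with Fallback.USER


def fallback_encoding(settings: EncodingSettings) -> str:
    """Resolve the fallback choice to an encoding name."""
    if settings.fallback is Fallback.USER:
        return settings.user_encoding
    return locale.getpreferredencoding(False)


def choose_encoding(settings: EncodingSettings, detected):
    """Use the detector's result unless detection is off or it found nothing."""
    if settings.mode is DetectionMode.NONE or detected is None:
        return fallback_encoding(settings)
    return detected
```

Here `detected` stands for whatever the selected detector returned (`None` when it could not decide), so the fallback applies to all three modes in the same way.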
Code::Blocks is an IDE & IMO we should try not to transform it into a Text Editor or a Browser.
You are right, C::B is not a browser, but C::B is an IDE, so the editor is an important part; otherwise it would not be much more than a build system like makefiles.
I did not answer yesterday, because I wanted to make some speed tests.
I did not find a project where the new encoding detection is noticeably slower than the old one.
This might be different for some less popular (rare?) encoding schemes, as you wrote.
But in my opinion it is better to have a somewhat slower encoding detection that works than a fast detection where the user has to force something to make it work.
A possible slowdown is only a problem if a user opens (very) many files at once.
We had some real bottlenecks, but together with the new detection, more parts of the file-loading code were improved.
Nevertheless, it still takes much more time to load the file into the editor (after its encoding has been detected and it has been loaded into a string variable) than the encoding detection itself takes (at least in the great majority of the cases I have seen).
For my tests I use the same code except for the detection, which means all speed-ups are the same in both test variants of C::B.
The new encoding detection does the following:
- check for a user-forced encoding
- search for a BOM
- try to detect UTF-16
- try to detect UTF-32
- try to detect ASCII or an extended ASCII encoding like HZ-GB-2312, ISO-2022-CN
- or try to detect singlebyte, multibyte and latin1 charsets
The first four steps did not change with the new encoding detection. The ASCII detection is very fast; the last step might take a little longer, but I don't see a difference: even if I use wxStopWatch and log the time needed to open all files of a project, there is no noticeable or measurable difference.
Bypassing C::B's detection should be faster (but we still have to convert the file content, which takes "much" time), but that's something the user can still do.
If it works fast for the majority of cases and is possibly slower for some cases (I have not found one so far), it does not make sense to change anything (except for not probing latin2 encodings by default, because that breaks the detection of some latin1 charsets).
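For readers who want to see the order of the checks in code, here is a very simplified Python sketch. It is not the C::B implementation: the function names are invented, the UTF-16/UTF-32 checks are crude NUL-byte heuristics that only recognise ASCII-range text, and the last step (the statistical singlebyte/multibyte probing) is replaced by a strict UTF-8 attempt plus a fallback.

```python
import codecs

_BOMS = [  # longer BOMs first, so UTF-32 LE is not mistaken for UTF-16 LE
    (codecs.BOM_UTF32_LE, "utf-32-le"), (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"), (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def _all_zero(chunk: bytes) -> bool:
    return len(chunk) > 0 and chunk.count(0) == len(chunk)


def _looks_utf16(data: bytes):
    """Crude heuristic: every other byte is NUL, the remaining bytes are not."""
    if len(data) < 2 or len(data) % 2:
        return None
    even, odd = data[0::2], data[1::2]
    if _all_zero(odd) and even.count(0) == 0:
        return "utf-16-le"
    if _all_zero(even) and odd.count(0) == 0:
        return "utf-16-be"
    return None


def _looks_utf32(data: bytes):
    """Crude heuristic: three NUL bytes in every group of four."""
    if len(data) < 4 or len(data) % 4:
        return None
    b0, b1, b2, b3 = data[0::4], data[1::4], data[2::4], data[3::4]
    if _all_zero(b1) and _all_zero(b2) and _all_zero(b3) and b0.count(0) == 0:
        return "utf-32-le"
    if _all_zero(b0) and _all_zero(b1) and _all_zero(b2) and b3.count(0) == 0:
        return "utf-32-be"
    return None


def detect_encoding(data: bytes, forced=None, fallback="iso-8859-1") -> str:
    if forced:                                   # 1. user-forced encoding
        return forced
    for bom, name in _BOMS:                      # 2. BOM
        if data.startswith(bom):
            return name
    guess = _looks_utf16(data) or _looks_utf32(data)  # 3. and 4.
    if guess:
        return guess
    if all(b < 0x80 for b in data):              # 5. 7-bit data
        if b"\x1b$)A" in data or b"\x1b$)G" in data:
            return "iso-2022-cn"                 # ISO-2022-CN designation escape
        if b"~{" in data:
            return "hz-gb-2312"                  # HZ shift-in sequence
        return "ascii"
    try:                                         # 6. stand-in for the real probing
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return fallback
```

The cheap checks come first, so files with a BOM or plain ASCII content never reach the last, potentially slower step.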