Maybe lexers could be handled the way file associations are. Depending on which kind of file a user opens, C::B loads the relevant lexer automatically [...]
I tried to implement just that last evening, but it is not as easy as you might think. First, you don't know which file types a lexer handles without loading it. Thus, you would have to encode this information somewhere. Keeping a separate map file for this would work best, but then you are building up a dependency, which is not good: when adding a new lexer, you have to update the map, or it won't work.
One could think about putting the handled extensions into the lexer's file name, but most lexers handle several (up to 6) file types, so file names would become quite cluttered (though it would still be possible).
By default, C::B has predefined associations (stored in the C::B config file?).
Hardcoded at the present time. We discussed this in January when restructuring the file association code, but decided to leave it hardcoded for now to not further complicate things.
Each type of file has its own lexer. The user can modify this list, either by adding a new lexer with its file type association or by modifying an existing one. User-specific lexers could be stored separately in the C::B config file (or in an alternative lexer config file).
That's basically how it used to be in the dark ages, when all lexers were copied to the configuration. Currently, only differences are stored in the config.
My current plan is to scan the lexer folder and load all lexers once. That gives us a mapping of extensions to lexers, which can be saved in the config file. On subsequent runs, Code::Blocks will know which lexer to load when opening a specific file type, and that can indeed be done on demand. When installing a new lexer, one would have to hit the "refresh button" to force rebuilding the map. That way, you don't need to configure anything, which is a good thing. I am still looking for a weak spot in this approach, but I guess it might just work fine.
What do you think about this approach?
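To make the plan above concrete, here is a minimal sketch of the scan-once, cache, refresh-on-request idea. It is written in Python for brevity and is not the actual Code::Blocks code. It assumes that each lexer file is an XML document with a `Lexer` element carrying a comma-separated `filemasks` attribute, and it uses a JSON file as a stand-in for the config file. The function names `scan_lexers`, `load_map` and `lexer_for` are made up for this illustration.

```python
import json
import os
import xml.etree.ElementTree as ET


def scan_lexers(lexer_dir):
    """Parse every lexer file once and return {extension: lexer file name}."""
    mapping = {}
    for name in sorted(os.listdir(lexer_dir)):
        if not name.endswith(".xml"):
            continue
        root = ET.parse(os.path.join(lexer_dir, name)).getroot()
        # Assumed layout: a <Lexer filemasks="*.c,*.cpp"> element in the file.
        for lexer in root.iter("Lexer"):
            for mask in lexer.get("filemasks", "").split(","):
                ext = mask.strip().lstrip("*").lower()
                if ext:
                    # If two lexers claim one extension, the first one wins.
                    mapping.setdefault(ext, name)
    return mapping


def load_map(lexer_dir, cache_path, refresh=False):
    """Return the cached map; rescan only on first run or explicit refresh."""
    if not refresh and os.path.exists(cache_path):
        with open(cache_path, encoding="utf-8") as fh:
            return json.load(fh)
    mapping = scan_lexers(lexer_dir)
    with open(cache_path, "w", encoding="utf-8") as fh:
        json.dump(mapping, fh, indent=2)
    return mapping


def lexer_for(filename, mapping):
    """Look up the lexer file for a file name, or None if nothing matches."""
    ext = os.path.splitext(filename)[1].lower()
    return mapping.get(ext)
```

The weak spot shows up directly in the sketch: a lexer dropped into the folder after the first scan stays invisible until `load_map(..., refresh=True)` is called, which is what the "refresh button" would do.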
TinyXML parsing is slow, at roughly 200 ms per lexer by a rough measurement. [...]
SciTE loads way more lexers than C::B. The SciTE lexers also have more features.
You're comparing apples and oranges again. SciTE lexers are a collection of single-line key/value pairs, whereas Code::Blocks lexers are XML documents that are validated for well-formedness. Of course it takes time to validate a document; this is not surprising.
The same goes for your network load story. You're missing the point here, too.
We are making on the order of 13,000 isolated file accesses during a "normal" startup. On a local file system, much of this can be cached, but it is absolutely not surprising that this is a major performance bottleneck over a network.
wxWidgets alone makes on the order of 10,000 distinct file accesses to load the XRC files. You can easily verify this using FileMon if you have any doubts about it.
To get back to TinyXML, which is supposedly so terribly slow: the configuration file loads with about 6-7 file accesses, and all lexers are loaded using about 100 distinct file accesses. The time that TinyXML takes to parse those files is just ridiculously small compared to the network latency of 10k accesses...
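As a back-of-the-envelope check of the numbers above, the following small Python sketch multiplies the access counts from this post by a per-access round-trip time. The access counts are the rough figures quoted above. The round-trip time is purely an assumption for illustration, not a measurement, and the helper names are made up.

```python
# Rough access counts taken from the post above.
TOTAL_ACCESSES = 13_000
ACCESS_COUNTS = {
    "XRC files (wxWidgets)": 10_000,
    "lexers": 100,
    "configuration file": 7,
}


def latency_seconds(accesses, round_trip_ms):
    """Time spent purely waiting, if every access costs one round trip."""
    return accesses * round_trip_ms / 1000.0


def share_of_total(accesses, total=TOTAL_ACCESSES):
    """Fraction of all startup file accesses caused by one component."""
    return accesses / total


if __name__ == "__main__":
    assumed_round_trip_ms = 1.0  # hypothetical network latency per access
    for name, count in ACCESS_COUNTS.items():
        print(
            f"{name}: {latency_seconds(count, assumed_round_trip_ms):.3f} s, "
            f"{share_of_total(count):.1%} of all accesses"
        )
```

With the assumed 1 ms per access, the 13,000 accesses of a normal startup add up to about 13 seconds of pure waiting, and the lexers account for less than 1% of the accesses.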