toupper and tolower are used because they offer an easy way to distinguish between nodes and leaves in the configuration, and because they allow the configuration to be accessed in the fastest possible way.
You can store your configuration for example like this:
<path name="foo">
    <path name="bar">
        <key name="x" value="y" />
    </path>
</path>
In fact, most projects use XML in this "standard" way, too, as the structure is clear, you can design a DTD, almost every web browser will display the file just fine, and it is easy to edit nodes by hand.
However, this is not desirable at all. Many people see the fact that something uses XML as a clear invitation to regularly edit the file in Notepad. We had many false bug reports and troubleshooting issues in the past because someone edited project files by hand and forgot a closing tag. That is unnecessary grief which would not happen if people did not say "why, I can edit it, so it is meant to be edited". If we were using SQLite or Berkeley DB as a storage backend, nobody would ever think about this.
Remember, we are not writing a hypertext document or something else that needs to be displayed in a browser, nor anything a human needs to read or understand at all. We don't care about DTDs or anything of that sort. All we need is a structured, flexible data storage.
Another major issue is speed. The configuration is accessed many thousands of times (sometimes 50-60 times per second), so it can become a major bottleneck if care is not taken.
Following the above "standard" scheme, you have to iterate recursively through the path to find the route to a key, each time asking the XML engine for a node pointer, and compare its name attribute. Also, you have to visit each and every key node in a subpath sequentially and compare its name.
On the other hand, what if a path node had no name attribute, but the node's tag itself were its name? Then you could just ask the XML engine for the first child node of type "name", no need to iterate anything. The same would work for keys, but you need a way to somehow distinguish keys and path nodes. That would turn O(n) into O(1) for accessing a value [1].
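To make the difference concrete, here is a minimal sketch using a toy tree. Node, add, firstChild, findByAttribute and findByTag are made-up names for illustration only and are not tinyXML's API; the sketch just shows who has to do the comparing in each scheme.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an XML element (hypothetical, not tinyXML).
struct Node {
    std::string tag;       // element name, e.g. "path" or "foo"
    std::string nameAttr;  // value of name="...", empty if there is none
    std::vector<std::unique_ptr<Node>> children;

    // Appends a child and returns a non-owning pointer to it.
    Node* add(std::string t, std::string n = "") {
        std::unique_ptr<Node> child(new Node());
        child->tag = std::move(t);
        child->nameAttr = std::move(n);
        children.push_back(std::move(child));
        return children.back().get();
    }

    // The kind of call an XML engine offers: first child with this tag.
    Node* firstChild(const std::string& t) const {
        for (const auto& c : children)
            if (c->tag == t) return c.get();
        return nullptr;
    }
};

// "Standard" scheme: the caller walks the <path name="..."> children
// and compares the name attribute of each one itself.
Node* findByAttribute(const Node& parent, const std::string& name) {
    for (const auto& c : parent.children)
        if (c->tag == "path" && c->nameAttr == name) return c.get();
    return nullptr;
}

// Tag-as-name scheme: one request to the engine per level.
Node* findByTag(const Node& parent, const std::string& name) {
    return parent.firstChild(name);
}
```

In this toy version firstChild still loops over the children internally. What changes is that the caller no longer writes its own loop and attribute comparison on top of whatever the engine does.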
This led to the scheme used in the Code::Blocks configuration. Path nodes are lowercase, keys are uppercase, and the tag is the node's/key's name:
<foo>
    <bar>
        <X value="y" />
    </bar>
</foo>
Although this looks weird and clearly makes things like a DTD impossible, it is nevertheless valid, and an order of magnitude faster than the other solution. Also, it allows for a couple of other good things.
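As a rough illustration of how the case convention lets you tell the two kinds of node apart, here is a small sketch. TagKind and classifyTag are hypothetical names, and the exact rule (all letters lowercase means path, all letters uppercase means key, anything else is rejected) is my assumption for the example, not necessarily what the real code does.

```cpp
#include <string>

// Hypothetical classification of a tag under the lowercase/uppercase scheme.
enum class TagKind { Path, Key, Invalid };

// Assumed rule: only ASCII letters decide the kind; digits, '_' and '-'
// are ignored. Mixed case or a tag without any letters is Invalid.
TagKind classifyTag(const std::string& tag) {
    bool sawLower = false;
    bool sawUpper = false;
    for (char c : tag) {
        if (c >= 'a' && c <= 'z') sawLower = true;
        else if (c >= 'A' && c <= 'Z') sawUpper = true;
    }
    if (sawLower == sawUpper) return TagKind::Invalid;  // mixed case or no letters
    return sawLower ? TagKind::Path : TagKind::Key;
}
```

With the example above, foo and bar would come out as path nodes and X as a key. The comparison uses plain character ranges, so it does not depend on the current locale.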
Of course, for this to work, only pathnames and keys that constitute valid XML tags can be used, but that is normally not a problem. For practical reasons (all devs have to be able to read the names) we only use English names anyway.
The problem with the Turkish locale, in my opinion, is that it does something that is actually not right. It silently transforms ANSI characters to "strange Unicode chars" one way.
Very well, you could still argue "but that is how we use them in Turkish". However, it does not do the backwards conversion "correctly" ("correctly" means "non-Turkish" :lol:), and that is really bad.
toupper(tolower(x)) does not give you the same as toupper(x)! :shock:
I mean, maybe that is really how it should be, but to me it seems quite wrong. It is like (5 + 3) - 3 != 5.
Anyway, let's not get philosophic on how localisation should be :lol:
Both toupper and tolower are now custom locale-unaware functions, so that problem should be solved.
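For the curious, a locale-unaware pair could look roughly like the sketch below. asciiToUpper and asciiToLower are names I made up for this example, and treating only the ASCII letters A-Z and a-z is an assumption; the actual functions in the code base may differ.

```cpp
#include <string>

// Maps a-z to A-Z and leaves every other byte untouched.
// Never consults the C or C++ locale.
constexpr char asciiToUpper(char c) {
    return (c >= 'a' && c <= 'z') ? static_cast<char>(c - 'a' + 'A') : c;
}

// Maps A-Z to a-z and leaves every other byte untouched.
constexpr char asciiToLower(char c) {
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
}

// The round trip that broke under the Turkish locale holds here by construction.
static_assert(asciiToUpper(asciiToLower('I')) == asciiToUpper('I'),
              "upper(lower(x)) must equal upper(x) for ASCII letters");
static_assert(asciiToLower(asciiToUpper('i')) == asciiToLower('i'),
              "lower(upper(x)) must equal lower(x) for ASCII letters");

// Whole-string versions built on the per-character functions.
std::string asciiToUpper(std::string s) {
    for (char& c : s) c = asciiToUpper(c);
    return s;
}

std::string asciiToLower(std::string s) {
    for (char& c : s) c = asciiToLower(c);
    return s;
}
```

Since we only use English names for paths and keys, ignoring everything outside ASCII loses nothing for this purpose.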
[1] In reality, things are a lot more complicated. tinyXML indeed does a linear search internally, so we do actually have a complexity of O(n). However, with the "standard" scheme we would have to add another linear search on top of that, giving us a total O(n²). So the correct figures would be O(n²) versus O(n). The principle is the same; I did not want to make it more complicated than necessary.
However, if tinyXML were optimized one day to do a map lookup instead of a linear search, we would have O(n²) versus O(log(n)) which would really be *a lot* more favourable.