@koso
I'm not familiar with the customized hash function, but I am familiar with the situation where we need a macro check, so please correct me if I'm wrong.
The only time a wxString will be checked in the Tokenizer class is in a case like the one below:
So if I understand it correctly, you are checking every word in the source code (each of which could potentially be a macro) for replacement -> so there will be many checks, but only a small fraction of them will succeed.
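Purely to illustrate that flow (a minimal sketch with invented names, not the actual Tokenizer code, with std::string standing in for wxString):

[code]
// Hypothetical sketch, not the real Tokenizer code; names are invented.
#include <iostream>
#include <map>
#include <string>
#include <vector>

int main()
{
    // Stand-in for the macro replacement rules (keyed by wxString in the real plugin).
    const std::map<std::string, std::string> macroReplacements = {
        {"_T", ""}, {"WXDLLIMPEXP_BASE", ""}
    };

    // Every word coming out of the tokenizer is a candidate...
    const std::vector<std::string> tokens = {
        "int", "main", "_T", "count", "WXDLLIMPEXP_BASE", "x"
    };

    for (const std::string& tok : tokens)
    {
        // ...so there is one map lookup per token, and only a few of them hit.
        std::map<std::string, std::string>::const_iterator it = macroReplacements.find(tok);
        if (it != macroReplacements.end())
            std::cout << tok << " -> replaced with \"" << it->second << "\"\n";
    }
    return 0;
}
[/code]

The point is simply that every identifier pays the lookup cost, even though only a handful of them are really macros.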
A more interesting question is whether the list/set of macros being replaced is constant, or whether it depends on the user configuration. (Is this related to the CC plugin configuration, where the user can add custom replacement rules?) The first case would simplify many things, but the second makes it a little more complicated => you won't be able to add a "super-fast" filter function that eliminates most false checks. (An example of this is a check for "_" at the beginning -> it is very fast, but once the user can add custom macros, it will probably be even slower than having no filter at all.)
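To make the "super-fast filter" idea concrete, here is a minimal sketch (the function and table names are my own invention, and std::string again stands in for wxString), with a comment on why user-defined macros break it:

[code]
// Hypothetical sketch of a cheap pre-filter in front of the map lookup (names invented).
// The one-character test is only safe while ALL macros in the table start with '_';
// once the user can add a rule like "MYMACRO", it stops being a valid shortcut and
// only adds work on every token.
#include <iostream>
#include <map>
#include <string>

static bool MaybeReplaceMacro(const std::string& tok,
                              const std::map<std::string, std::string>& rules,
                              std::string& out)
{
    // Very fast rejection: one character test instead of a whole tree search.
    if (tok.empty() || tok[0] != '_')
        return false;

    std::map<std::string, std::string>::const_iterator it = rules.find(tok);
    if (it == rules.end())
        return false;

    out = it->second;
    return true;
}

int main()
{
    const std::map<std::string, std::string> rules = { {"_T", ""} };
    std::string replacement;
    std::cout << MaybeReplaceMacro("counter", rules, replacement)  // rejected by the cheap test
              << " "
              << MaybeReplaceMacro("_T", rules, replacement)       // real hit
              << "\n";
    return 0;
}
[/code]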
A little theory:
1. Searching in a <map> is based on string comparisons in a red-black tree. The number of operations depends on the height of the tree, which will be something like log2(number of macros) - in your case not more than 5 comparisons of wxString.
2. An <unordered_map> with a theoretically good hash function will run the hash function, and then 0 or 1 string comparisons.
So if the hash function is faster than approximately 3 or 4 wxString comparisons, you should get better results with hashing. That sounds like no problem, but the cost of a string comparison depends on the string length -> if you are checking many short words, it will be really hard to design a hash function that beats it. Also, the functions presented in this topic were even faster than a good hashing function, so the <unordered_map> would have to use more comparison operations...
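To make that arithmetic tangible, here is a toy sketch (not a benchmark of the real code; the table contents and the FNV-1a hash are just placeholders): the tree-based map pays at most about log2(N) string comparisons, most of which can stop at the first differing character, while the hash map has to walk every character of every candidate word before its 0 or 1 comparisons:

[code]
// Toy sketch of the two lookup strategies discussed above (not the real CC code).
#include <cstdint>
#include <iostream>
#include <map>
#include <string>
#include <unordered_map>

// Simple FNV-1a hash: note that it always walks the whole string,
// while a failed string comparison can stop at the first differing character.
static std::uint32_t Fnv1a(const std::string& s)
{
    std::uint32_t h = 2166136261u;
    for (char c : s)
    {
        h ^= static_cast<unsigned char>(c);
        h *= 16777619u;
    }
    return h;
}

int main()
{
    // Placeholder replacement rules; the real plugin keys these by wxString.
    std::map<std::string, std::string> treeMap = { {"_T", ""}, {"wxT", ""} };
    std::unordered_map<std::string, std::string> hashMap(treeMap.begin(), treeMap.end());

    const std::string word = "counter";  // a typical short, non-macro identifier

    // std::map: at most ~log2(N) string comparisons, most failing early.
    const bool inTree = (treeMap.find(word) != treeMap.end());

    // std::unordered_map: hashes the whole word first, then 0 or 1 comparisons.
    const bool inHash = (hashMap.find(word) != hashMap.end());

    std::cout << "FNV-1a(\"" << word << "\") = " << Fnv1a(word)
              << ", found in map/unordered_map: " << inTree << "/" << inHash << "\n";
    return 0;
}
[/code]

For very short identifiers the full hash walk can easily cost as much as those few early-exit comparisons, which is exactly the trade-off described above.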