Hi guys. I was wondering about something... Recently I've been using the search-in-files feature a lot, and I realized that perhaps things could speed up a bit if we maintained a "global dictionary of tokens" and kept a list of tokens per file (this list could possibly be updated on file save). Search in files would tokenize the search string, find which files contain all of the tokens, and then refine the search from there.
So, it looks like you want to implement a plain text search, not a regex search, right?
The dictionary could contain something like:
keyword(string) -> [file index(int), offset in the file(int)]
That's all.
As for the tokenizer, QUEX could be a strong candidate. It is extremely fast, and it natively supports reporting the character offset in the file. We could also record the "line" and "column" information.
The dictionary is much like the tokens tree in the CodeCompletion plugin, which I think you are quite familiar with. It could be a self-made Patricia tree or a database such as SQLite.
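To make the idea a bit more concrete, here is a minimal sketch of such a dictionary. It is only an illustration under some assumptions: it uses a plain std::unordered_map instead of a Patricia tree or SQLite, a hand-written identifier splitter instead of QUEX, and the names Posting, TokenIndex, UpdateFile and CandidateFiles are made up for this example.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <iterator>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// One occurrence of a keyword: which file, and where in that file.
struct Posting {
    int fileIndex;       // index into the project's file list
    std::size_t offset;  // character offset of the keyword in the file
};

// Hypothetical index: keyword(string) -> [file index(int), offset in the file].
class TokenIndex {
public:
    // Split text into identifier-like tokens ([A-Za-z0-9_]+) with their offsets.
    // A real tokenizer (e.g. one generated by QUEX) would replace this.
    static std::vector<std::pair<std::string, std::size_t>> Tokenize(const std::string& text) {
        std::vector<std::pair<std::string, std::size_t>> out;
        std::size_t i = 0;
        while (i < text.size()) {
            if (IsWordChar(text[i])) {
                const std::size_t start = i;
                while (i < text.size() && IsWordChar(text[i])) ++i;
                out.emplace_back(text.substr(start, i - start), start);
            } else {
                ++i;
            }
        }
        return out;
    }

    // (Re)index one file, e.g. on save: drop its old postings, then add new ones.
    void UpdateFile(int fileIndex, const std::string& contents) {
        for (auto& entry : postings_) {
            auto& list = entry.second;
            list.erase(std::remove_if(list.begin(), list.end(),
                                      [fileIndex](const Posting& p) { return p.fileIndex == fileIndex; }),
                       list.end());
        }
        for (const auto& tok : Tokenize(contents))
            postings_[tok.first].push_back({fileIndex, tok.second});
    }

    // Sorted indices of the files that contain every token of the query.
    // Only these files need the real (exact) text search afterwards.
    std::vector<int> CandidateFiles(const std::string& query) const {
        std::vector<int> result;
        bool first = true;
        for (const auto& tok : Tokenize(query)) {
            std::vector<int> files;
            const auto it = postings_.find(tok.first);
            if (it != postings_.end())
                for (const Posting& p : it->second) files.push_back(p.fileIndex);
            std::sort(files.begin(), files.end());
            files.erase(std::unique(files.begin(), files.end()), files.end());
            if (first) {
                result = files;
                first = false;
            } else {
                std::vector<int> merged;
                std::set_intersection(result.begin(), result.end(), files.begin(), files.end(),
                                      std::back_inserter(merged));
                result.swap(merged);
            }
        }
        return result;
    }

private:
    static bool IsWordChar(char c) {
        return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
    }

    std::unordered_map<std::string, std::vector<Posting>> postings_;
};
```

Usage would be something like calling UpdateFile(index, contents) whenever a file is saved, and CandidateFiles("foo bar") before a search to get the short list of files worth scanning. Note that UpdateFile walks the whole map to drop old postings, which is fine for a sketch but would probably need a per-file token list in a real implementation.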
Another metadata idea would be an expansion of the todo concept, and I don't know if it could be implemented. How about adding "notes" per file, so that we could have more thorough comments (maybe even including graphics in later versions)? So, instead of having a comment like // TODO, we could have //EXTNOTE:45, and if we hovered the mouse over that line, a "hint" would pop up displaying the notes file.
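Detecting such a marker on the hovered line could be fairly simple. The following is just a rough sketch: the function name FindExtNoteId is made up, and it assumes the marker is always written exactly as //EXTNOTE: followed by decimal digits. How the id maps to a notes file is left open.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Returns the note id from a "//EXTNOTE:<digits>" marker on the given line,
// or -1 if the line has no such marker (or the id is implausibly long).
int FindExtNoteId(const std::string& line) {
    static const std::string marker = "//EXTNOTE:";
    std::size_t pos = line.find(marker);
    if (pos == std::string::npos) return -1;
    pos += marker.size();

    int id = 0;
    int digits = 0;
    while (pos < line.size() && std::isdigit(static_cast<unsigned char>(line[pos]))) {
        if (++digits > 9) return -1;  // keep the value safely inside int range
        id = id * 10 + (line[pos] - '0');
        ++pos;
    }
    return digits > 0 ? id : -1;
}
```

The editor's hover handler would call this with the text of the line under the mouse and, if the result is not -1, look up and display the matching note.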
Currently, if you use Doxygen-style comments, I think we can add something like
Doxygen already supports putting a link to an image or a LaTeX-style formula in a comment, so we only need to interpret that special command and show the image when the mouse hovers over it.
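For the image case, Doxygen's command has the form \image <format> <file> (also written @image), so picking out the file name from a comment line might look roughly like this. This is only a sketch: FindDoxygenImageFile is a made-up helper, and it ignores the optional {options} block and file names that contain spaces.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Returns the <file> argument of a Doxygen "\image <format> <file>" (or
// "@image ...") command found in a comment line, or "" if there is none.
std::string FindDoxygenImageFile(const std::string& line) {
    std::size_t pos = line.find("\\image ");
    if (pos == std::string::npos) pos = line.find("@image ");
    if (pos == std::string::npos) return "";

    // Both "\image " and "@image " are 7 characters long.
    std::istringstream rest(line.substr(pos + 7));
    std::string format, file;
    if (!(rest >> format >> file)) return "";

    // Strip optional double quotes around the file name.
    if (file.size() >= 2 && file.front() == '"' && file.back() == '"')
        file = file.substr(1, file.size() - 2);
    return file;
}
```

The hover code could then resolve the returned name against the project's image path and show it in the popup.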