Improving "search in files" with a word index? And other ideas with metadata

rickg22:
Hi guys. I was wondering about something... Recently I've been using the search in files feature a lot, and I realized that perhaps things could speed up a bit if we maintained a "global dictionary of tokens" and kept a list of tokens per file (this list could be updated on file save). Search in files would tokenize the search string, find which files contain all of the tokens, and refine the search from there.
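
The idea above could be sketched roughly as follows. This is only a minimal illustration, not Code::Blocks code: the names `Tokenize`, `TokenIndex`, `UpdateFile` and `CandidateFiles` are hypothetical, and a token is assumed to be a run of letters, digits and underscores.

```cpp
#include <cctype>
#include <map>
#include <set>
#include <string>
#include <vector>

// Split text into identifier-like tokens (assumption: letters, digits, '_').
std::vector<std::string> Tokenize(const std::string& text)
{
    std::vector<std::string> tokens;
    std::string current;
    for (unsigned char ch : text)
    {
        if (std::isalnum(ch) || ch == '_')
            current += static_cast<char>(ch);
        else if (!current.empty())
        {
            tokens.push_back(current);
            current.clear();
        }
    }
    if (!current.empty())
        tokens.push_back(current);
    return tokens;
}

// Hypothetical index: a global dictionary of tokens plus a token list per file.
class TokenIndex
{
public:
    // Meant to be called on file save: forget the old tokens, index the new text.
    void UpdateFile(const std::string& path, const std::string& contents)
    {
        std::set<std::string>& tokens = m_fileTokens[path];
        for (const std::string& token : tokens)
        {
            auto it = m_dictionary.find(token);
            if (it != m_dictionary.end())
            {
                it->second.erase(path);
                if (it->second.empty())
                    m_dictionary.erase(it);
            }
        }
        tokens.clear();
        for (const std::string& token : Tokenize(contents))
        {
            tokens.insert(token);
            m_dictionary[token].insert(path);
        }
    }

    // Files that contain every token of the query. These are only candidates:
    // the real search still has to scan them to find the exact matches.
    std::set<std::string> CandidateFiles(const std::string& query) const
    {
        std::set<std::string> result;
        bool first = true;
        for (const std::string& token : Tokenize(query))
        {
            auto it = m_dictionary.find(token);
            if (it == m_dictionary.end())
                return {};
            if (first)
            {
                result = it->second;
                first = false;
                continue;
            }
            std::set<std::string> kept;
            for (const std::string& path : result)
                if (it->second.count(path))
                    kept.insert(path);
            result.swap(kept);
        }
        return result;
    }

private:
    std::map<std::string, std::set<std::string>> m_dictionary; // token -> files
    std::map<std::string, std::set<std::string>> m_fileTokens; // file -> tokens
};
```

Note that a sketch like this only matches whole tokens, so a substring or regex query would still need to fall back to scanning every file.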

Another idea that I had was to revamp the "TODO" plugin to use metadata for TODOs, including their dates, files/lines and priorities. That way, when I open the project, I can see the latest TODOs that I have added without having to search in all the files. Opening the project would also open the file that contains the latest TODO.
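
A metadata record for this might look something like the sketch below. Everything here is an assumption made for illustration: the `TodoItem` fields, the convention that priority 1 is the highest, and the helper names `SortNewestFirst` and `LatestTodo` do not come from the existing TODO plugin.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical metadata stored for one TODO item.
struct TodoItem
{
    std::string text;
    std::string file;
    int line;
    int priority;     // assumption: 1 = highest priority
    long long added;  // assumption: creation time in seconds since the epoch
};

// Order the list newest first; equal dates are ordered by priority.
void SortNewestFirst(std::vector<TodoItem>& todos)
{
    std::stable_sort(todos.begin(), todos.end(),
        [](const TodoItem& a, const TodoItem& b)
        {
            if (a.added != b.added)
                return a.added > b.added;
            return a.priority < b.priority;
        });
}

// The most recently added TODO, or nullptr when there are none.
// Its file/line is what would be opened when the project loads.
const TodoItem* LatestTodo(const std::vector<TodoItem>& todos)
{
    auto it = std::max_element(todos.begin(), todos.end(),
        [](const TodoItem& a, const TodoItem& b) { return a.added < b.added; });
    return it == todos.end() ? nullptr : &*it;
}
```

With the records kept in the project's metadata, showing the latest TODOs is a sort over a small list instead of a scan of every source file.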

Another metadata idea would be an expansion of the TODO concept, though I don't know if it could be implemented. How about adding "notes" per file, so that we could have more thorough comments (maybe even including graphics in later versions)? So, instead of having a comment like // TODO, we could have //EXTNOTE:45, and if we hovered the mouse over that line, a "hint" would pop up displaying the notes file.

What do you think?

oBFusCATed:
Sounds great, but someone has to implement it. Would you?  :lol:

The string tokenization sounds pretty good. For the last couple of days I've been wondering how VStudio does find in files so fast; maybe they do something like this.

TODO notes are a great idea too, and sound doable.

rickg22:
Okay, I'm going on vacation soon. Perhaps I'll implement this.

oBFusCATed:
For the search, it would be best to modify the ThreadSearch plugin, because it is way better than the normal "find in files"...

ollydbg:

--- Quote from: rickg22 on October 12, 2011, 06:06:28 pm ---Hi guys. I was wondering of something... Recently I've been using a lot the search in files feature, and I realized that perhaps things could speed up a bit if we maintained a "global dictionary of tokens" and keep a list of tokens per file (this list could possibly be updated on file save). Search in files would tokenize the search string and find which files had all of the tokens, and to refine the search from there.

--- End quote ---
So, it looks like you want to implement a plain text search, not a regex search, right?
The dictionary could contain something like:

--- Code: ---keyword(string) -> [file index(int), offset in the file(int)]

--- End code ---
That's all.
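
As a rough sketch of that mapping (not a real implementation), the dictionary could be a hash map from keyword to a list of (file index, offset) pairs. The names `Occurrence`, `PositionIndex`, `AddFile` and `Lookup` are made up for this example, offsets are assumed to be byte offsets, and a simple hand-written scanner stands in for a real tokenizer.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// One hit: the file (by index) and the byte offset where the keyword starts.
struct Occurrence
{
    int fileIndex;
    int offset;
};

// Hypothetical dictionary: keyword -> [file index, offset in the file].
class PositionIndex
{
public:
    // Scan the contents once and record where every word starts.
    void AddFile(int fileIndex, const std::string& contents)
    {
        std::size_t start = 0;
        bool inWord = false;
        // Run one step past the end so a trailing word is flushed too.
        for (std::size_t i = 0; i <= contents.size(); ++i)
        {
            const bool wordChar = i < contents.size() &&
                (std::isalnum(static_cast<unsigned char>(contents[i])) ||
                 contents[i] == '_');
            if (wordChar && !inWord)
            {
                start = i;
                inWord = true;
            }
            else if (!wordChar && inWord)
            {
                m_hits[contents.substr(start, i - start)].push_back(
                    Occurrence{fileIndex, static_cast<int>(start)});
                inWord = false;
            }
        }
    }

    // All recorded occurrences of the keyword, in insertion order.
    std::vector<Occurrence> Lookup(const std::string& keyword) const
    {
        auto it = m_hits.find(keyword);
        if (it == m_hits.end())
            return {};
        return it->second;
    }

private:
    std::unordered_map<std::string, std::vector<Occurrence>> m_hits;
};
```

Storing offsets makes the index larger than a plain token-to-file map, but a hit can then be shown without rescanning the file.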
About the tokenizer, QUEX could be a strong candidate. It is extremely fast, and it natively supports outputting the "offset in the file" of each token. We could also record the "line" and "column" information.

The dictionary is much like the tokenstree in the CodeCompletion plugin, which I think you are quite familiar with. It could be a self-made Patricia tree or a database like SQLite.


--- Quote ---Another metadata Idea would be an expansion of the todo concept, and I don't know if it could be implemented. How about adding "notes" per file, so that we could have more thorough comments  (maybe even including graphics in later versions)? So, instead of having a comment like // TODO, we could have //EXTNOTE:45, and if we hovered the mouse over that line, a "hint" would popup displaying the notes file.

--- End quote ---
Currently, if you use doxygen-style comments, I think we can add something like

--- Code: ---@CBNOTE:45
--- End code ---
Doxygen already supports putting links to images and LaTeX-style formulas in comments, so we only need to interpret that special command and show the image when the mouse hovers over it.
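
Recognizing such a command in a comment line could be as simple as the sketch below. The function name `FindNoteId` is hypothetical, `@CBNOTE:` is just the tag proposed above (it is not an existing doxygen command), and -1 is an assumed "no note" value.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Return the note number that follows `tag` on this line, or -1 if the line
// has no such tag or the tag is not followed by a digit.
int FindNoteId(const std::string& line, const std::string& tag = "@CBNOTE:")
{
    const std::size_t pos = line.find(tag);
    if (pos == std::string::npos)
        return -1;

    std::size_t i = pos + tag.size();
    if (i >= line.size() || !std::isdigit(static_cast<unsigned char>(line[i])))
        return -1;

    // Read at most 9 digits so the result always fits in an int.
    int id = 0;
    for (int digits = 0;
         i < line.size() && digits < 9 &&
         std::isdigit(static_cast<unsigned char>(line[i]));
         ++i, ++digits)
    {
        id = id * 10 + (line[i] - '0');
    }
    return id;
}
```

The editor's hover handler would call something like this on the line under the mouse and, if it gets an id back, look up the matching note and show it as a tooltip. Passing `"EXTNOTE:"` as the tag would cover the //EXTNOTE:45 form as well.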