Suggestion: Using ctags & sqlite for code completion


takeshimiya:

--- Quote from: thomas on August 21, 2006, 09:07:48 pm ---Personally, I don't think that using SQLite is a good idea because I believe that parsing the SQL and converting data to and from the database's storage format may cause noticeable overhead, but I may as well be wrong. Prove me wrong! I'll be happy if you do :)

--- End quote ---
You'll be happy: Eran has already shown that SQLite is very fast for this purpose (you can download his IDE).
ddiego's C++ parser also uses SQLite and ctags, and he said it was very fast; the code parsing is the bottleneck.

--- Quote from: thomas on August 21, 2006, 09:07:48 pm ---I think having two approaches at hand is not a bad thing at all. Your approach may be a lot more flexible to support other languages, too.

--- End quote ---
100% true.

takeshimiya:
I'll post the interesting info from http://vcfbuilder.org/?q=node/143 here, since it requires registration:
(The replies are from ddiego)

--- Quote ---Something very similar to your purpose has been done here: http://forums.codeblocks.org/index.php?topic=1889.0
You really want to read that thread.

However, one thing isn't clear to me: will you be discarding ANTLR altogether and using only ctags?
Or will you use ANTLR to generate the database?

--- End quote ---

Interesting thread. I still see the need for only 2 "parsers".

The ctags parser is used to create the persistent DB, which will exist in various places, just like Visual Studio does with its .ncb files. The question becomes how often these db files are updated. For system include directories this would be a one-time cost: the db is made once, and then not touched unless the system include dir or the system include files change.

The db for the project would be more volatile, that would have to be changed more often, but given the speed of ctags and sqlite I don't see this as a problem.
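
As a rough illustration of the project db / system db split described above (a minimal sketch, not code from the quoted post): the three-column `tags` table, the helper names `make_db` and `find_symbol`, the sample rows, and the rule that the project db is searched before the system db are all assumptions made for the example.

```python
import sqlite3

def make_db(rows):
    """Create an in-memory database with a tiny, hypothetical tags table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tags (name TEXT, filename TEXT, line INTEGER)")
    with conn:  # commit all inserts as one transaction
        conn.executemany("INSERT INTO tags VALUES (?, ?, ?)", rows)
    return conn

def find_symbol(name, databases):
    """Search the databases in order and return the hits from the first
    database that knows the symbol, as (label, filename, line) tuples."""
    for label, conn in databases:
        hits = conn.execute(
            "SELECT filename, line FROM tags WHERE name = ? "
            "ORDER BY filename, line",
            (name,),
        ).fetchall()
        if hits:
            return [(label, filename, line) for filename, line in hits]
    return []

# Made-up sample data: the system db is built once, the project db is
# the volatile one that would be regenerated often.
system_db = make_db([("printf", "stdio.h", 120), ("size_t", "stddef.h", 30)])
project_db = make_db([("Widget", "widget.h", 3), ("size_t", "compat.h", 7)])

# Assumed lookup order: project first, then system.
databases = [("project", project_db), ("system", system_db)]
```

With this ordering, `find_symbol("printf", databases)` falls through to the system db, while a name defined in both places resolves to the project entry.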

The difference between the DB data and the parser data, is that the DB data would be more sparse. But the result of using either one would be the creation of an AST that is a graph of CodeNode instances that can be traversed. So if you parse a single file with the ANTLR based C++ parser, or request some data from the ctags based DB, both will return this information as a collection of CodeNode's.

...

I am (I'm already coding this right now) using both. The idea is to use ctags to create a DB that has a broad overview of the various AST elements, but use ANTLR to provide an exact view of a specific file/resource.

...

In addition to my earlier comments here are some thoughts on where I'd like to see this whole thing going:

Currently, we are relying solely on the C++ parser to handle ALL of the parsing chores. For parsing a single file in "real time" (about as fast as you can type), it works OK, but potentially having it parse thousands of files to keep track of all the possible headers in your project, plus system (and other third-party) header files, seems unwieldy: it just won't handle this fast enough.

So this got me thinking about how all of this (the parser and the CodeStore "engine") should work. After taking a glance at how Visual Studio seems to do things I've come to some conclusions:

* First, we can simplify things by creating a database of all the core elements that we need to display in our class AST. This set of elements is a subset of the entire AST for any given file. We care about things like function declarations, function arguments, templates, template arguments, class declarations, and namespace declarations. Putting these into a database makes them easy to search, and provides more potential flexibility for search types.
* If we have a database of this data, then it makes sense to support more than one. There would be one db per project, and then one (or more) "global" db's for system headers (like the C runtime, or the C++ STL). The global db's would only have to be generated once, since these won't change often (if at all).
* We would need a schema for the db, a table that has the following columns:
o id INTEGER PRIMARY KEY
o name TEXT
o filename TEXT
o line INTEGER
o kind INTEGER
o language INTEGER
o access INTEGER
o inheritance TEXT
o parent INTEGER
o signature TEXT

This schema would allow for generating a hierarchical display if necessary.
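
To make the hierarchical use of this schema concrete, here is a minimal sketch using Python's built-in sqlite3 module. The column names and types are the ones listed above; the table name `tags`, the reading of `parent` as the id of the enclosing element (NULL at top level), the integer codes, and the sample rows are assumptions for illustration only.

```python
import sqlite3

# Columns as listed in the post; the table name "tags" is hypothetical.
SCHEMA = """
CREATE TABLE tags (
    id          INTEGER PRIMARY KEY,
    name        TEXT,
    filename    TEXT,
    line        INTEGER,
    kind        INTEGER,
    language    INTEGER,
    access      INTEGER,
    inheritance TEXT,
    parent      INTEGER,  -- assumed: id of the enclosing tag, NULL at top level
    signature   TEXT
)
"""

def build_demo_db():
    """Fill an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    rows = [
        # id, name, filename, line, kind, language, access,
        # inheritance, parent, signature (integer codes are arbitrary)
        (1, "demo", "widget.h", 1, 0, 0, 0, None, None, None),
        (2, "Widget", "widget.h", 10, 1, 0, 0, None, 1, None),
        (3, "draw", "widget.h", 14, 2, 0, 0, None, 2, "(int x)"),
        (4, "hide", "widget.h", 15, 2, 0, 0, None, 2, "()"),
    ]
    with conn:
        conn.executemany("INSERT INTO tags VALUES (?,?,?,?,?,?,?,?,?,?)", rows)
    return conn

def children(conn, parent_id):
    """Return (id, name) of the tags directly below parent_id."""
    if parent_id is None:
        sql = "SELECT id, name FROM tags WHERE parent IS NULL ORDER BY line"
        return conn.execute(sql).fetchall()
    sql = "SELECT id, name FROM tags WHERE parent = ? ORDER BY line"
    return conn.execute(sql, (parent_id,)).fetchall()

def render_tree(conn, parent_id=None, depth=0):
    """Walk the parent links recursively and return indented lines."""
    lines = []
    for tag_id, name in children(conn, parent_id):
        lines.append("  " * depth + name)
        lines.extend(render_tree(conn, tag_id, depth + 1))
    return lines
```

Calling `render_tree(build_demo_db())` yields one line per tag, indented by nesting depth, which is roughly what a class-browser tree would need.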

* To generate these databases, we don't need the full-fledged support of the C++ parser, since we need only a limited number of AST nodes at this level. So what I'm thinking is to use ctags to generate the initial db info, then use SQLite3.2 to store the ctags data in a db. This would accomplish most of what we need; the parser would then be used for those cases where the entire AST is needed. Using ctags and SQLite, I can create a DB representation of the entire VC98/Includes directory from scratch in about 1 minute or less (that's about 726,773 lines of code scanned). And this would only have to be done once.
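
The ctags-to-SQLite step could look roughly like the sketch below. It is not the CodeStore implementation: it assumes the extended tags file format (name, file, and search pattern separated by tabs, then `;"` and tab-separated extension fields such as `line:`, `access:`, and `signature:`), uses a hard-coded sample instead of running ctags, and stores only a subset of the columns in a hypothetical `tags` table, with `kind` and `access` kept as text for brevity.

```python
import sqlite3

# Illustrative sample in the extended tags format; not real ctags output.
SAMPLE_TAGS = (
    '!_TAG_FILE_FORMAT\t2\t/extended format/\n'
    'Widget\twidget.h\t/^class Widget {$/;"\tc\tline:3\n'
    'draw\twidget.h\t/^  void draw(int x);$/;"\tp\tline:5'
    '\tclass:Widget\taccess:public\tsignature:(int x)\n'
)

def parse_tag_line(line):
    """Return a dict for one tag line, or None for pseudo-tags and blanks.
    Limitation: a search pattern that itself contains ';"' plus a tab
    would be split in the wrong place."""
    line = line.rstrip("\n")
    if not line or line.startswith("!_TAG_"):
        return None
    head, sep, tail = line.partition(';"\t')
    name, filename, _pattern = head.split("\t", 2)
    tag = {"name": name, "filename": filename, "line": None,
           "kind": None, "access": None, "signature": None}
    for field in (tail.split("\t") if sep else []):
        key, colon, value = field.partition(":")
        if not colon:
            tag["kind"] = field          # bare field is the kind letter
        elif key == "line":
            tag["line"] = int(value)
        elif key in ("kind", "access", "signature"):
            tag[key] = value
    return tag

def load_tags(conn, text):
    """Parse tags text and insert every tag in a single transaction."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tags (id INTEGER PRIMARY KEY, name TEXT, "
        "filename TEXT, line INTEGER, kind TEXT, access TEXT, signature TEXT)"
    )
    rows = [t for t in map(parse_tag_line, text.splitlines()) if t]
    with conn:  # batching the inserts in one transaction keeps bulk loads fast
        conn.executemany(
            "INSERT INTO tags (name, filename, line, kind, access, signature) "
            "VALUES (:name, :filename, :line, :kind, :access, :signature)",
            rows,
        )
    return len(rows)
```

Wrapping the whole batch in one transaction is the main design choice here: committing per row is usually what makes bulk loading into SQLite slow.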

All of the above would be done transparently by the CodeStore engine. SQLite source would become incorporated, and ctags would be used as an exe (we can't use it directly as a library due to GPL issues).

----

You can check out the project by doing:
svn co https://svn.sourceforge.net/svnroot/classdom classdom

eranif:
Hi Takeshi,

Everything you have posted here from that thread, I have already implemented (the database described by Diego is almost identical to mine...)

I uploaded a zip of my current work, including all sources, to my site. I have reached the auto-completion part: all the infrastructure functions are ready, I just need to put them together.

In the zip you can find:
Visual studio workspace 7.1
Three projects: CodeParser, CodeParserTest & CodeParserGUISample
sqlite3.dll

Compile the workspace and run the GUI sample - it is very easy to use.

When running it for the first time, use the option "Add source to database"
and follow the instructions.

Double-clicking an item in the GUI tree on the left opens it in an editor on the right
and places the cursor on the correct line.

All the logic and flow are located in the frame.cpp file.

To make sure it runs, copy ctags.exe to C:\windows\system32

Link to the source files:
www.eistware.com/wxes/codeparser/codeparser.zip

Link to ctags.exe for windows:
www.eistware.com/wxes/codeparser/ctags.zip

Btw, I too once thought of using a real parser for the IDE, but I abandoned the idea since true parsers will throw exceptions when the syntax is incorrect, so we need more of a guessing system.

for example:

When you write:
CBlock block;

As a coder, you automatically assume that CBlock is a class or something similar, but a real parser will fail if it cannot find the declaration for it.
So you need a more tolerant parser.
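
A toy sketch of such a "guessing" step (not Eran's actual code): the regular expression, the keyword list, and the function name `guess_declaration` are all invented for illustration. It accepts a line like `CBlock block;` as a variable declaration even when the type has never been seen, and merely records whether the type is known.

```python
import re

# Matches "Type name;", "ns::Type name;", "Type* name;" or "Type &name;".
# Requires whitespace or a '*'/'&' between type and name, so "foo;" is rejected.
DECL_RE = re.compile(
    r"^\s*([A-Za-z_]\w*(?:::[A-Za-z_]\w*)*)"   # type, optionally qualified
    r"(?:\s*[*&]\s*|\s+)"                      # separator
    r"([A-Za-z_]\w*)\s*;\s*$"                  # variable name and ';'
)

# Words that can precede an identifier without being a type (partial list).
KEYWORDS = {"return", "delete", "goto", "throw", "new", "using", "typedef", "else"}

def guess_declaration(line, known_types=()):
    """Guess whether a line declares a variable. Unknown type names are
    accepted rather than treated as errors; 'resolved' says whether the
    type was found in known_types."""
    match = DECL_RE.match(line)
    if not match or match.group(1) in KEYWORDS:
        return None
    type_name, var_name = match.groups()
    return {"type": type_name, "variable": var_name,
            "resolved": type_name in known_types}
```

For `CBlock block;` with no known types, the guess is still a declaration of `block` with type `CBlock`, just marked as unresolved, whereas a strict parser would stop at the missing declaration.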

Anyway, I believe I will complete my work during the next week.

Eran

takeshimiya:

--- Quote from: eranif on August 21, 2006, 10:24:02 pm ---Btw, I too once thought of using a real parser for the IDE, but I abandoned the idea since true parsers will throw exceptions when the syntax is incorrect, so we need more of a guessing system.

--- End quote ---
You're talking about compiler parsers, which are designed to be very correct and to fail at the first syntax error.
The ANTLR-generated C++ parser does not, however: it is also designed to be extremely correct, but since it's a generated parser you can control what happens on failure, how the AST is generated, etc.

Quoting ddiego, who has read your previous thread: "I still see the need for only 2 parsers.
The idea is to use ctags to create a DB that has a broad overview of the various AST elements, but use ANTLR to provide an exact view of a specific file/resource."

The need for an exact view of a specific file also becomes evident once we want to use the parser for refactoring. At that point, we'll have to use the "exact view".

So that hybrid approach seems to be the best solution.

--- Quote from: eranif on August 21, 2006, 10:24:02 pm ---Anyways, I believe I will complete my work during next week.

--- End quote ---
Thank you for your efforts, really looking forward to it!

Regards,
Takeshi Miya

MortenMacFly:

--- Quote from: eranif on August 21, 2006, 10:24:02 pm ---In the zip you can find:
Visual studio workspace 7.1

--- End quote ---
I read this and would like to have a look at it. Unfortunately I can't get it to compile. Besides the fact that I have no VC7.1, I tried converting it into a C::B project but... failed! :shock:
Eran: Do you see any chance of providing me (us) with a C::B project file that e.g. uses the wxWidgets libs as they are produced for C::B (please look at: http://wiki.codeblocks.org/index.php?title=Installing_Code::Blocks_from_source_on_Windows#Building)? I ask because you may have this already - it may not be much work for you...?!
With regards, Morten.
