I enjoyed reading what you had to say. It was particularly well written and well thought out. I've put some more thought into the problem and written a few pages on it. Here's more or less what I've written. It's lacking in some places, but I'm still working on it:
Having put yet more thought into this (what can I say? I'm bored), I've come up with a list of what the structures should look like:
Namespace (of which the global namespace is a special case):
- list of all functions
- list of all variables
- list of all enumerations
- list of all typedefs
- list of all classes
- list of all namespaces (note that children of these can be moved up by using the using keyword)
- list of all preprocessor definitions (all defs go in the global object)
Class:
- list of base classes
- list of variables
- list of enumerations
- list of typedefs
- list of methods
- list of static variables
- list of static methods
- list of classes (always static)
Now we just need to pick the best container for these lists, and write the code to merge them into the main list. The main list would differ in that, instead of actually holding content, it would hold pointers to said content.
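As a minimal sketch of that idea: the per-scope lists own their entries, and the main list only stores pointers into them. The names Identifier, Scope, MainIndex and merge_into are made up for illustration, and std::multimap is just one candidate container (it keeps names sorted and allows duplicates), not a decision.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for any parsed identifier.
struct Identifier {
    std::string name;
    int decl_line;
};

// A per-scope set of lists that owns its content.
struct Scope {
    std::vector<Identifier> functions;
    std::vector<Identifier> variables;
};

// The main list owns nothing: it maps a name to pointers into the scopes.
using MainIndex = std::multimap<std::string, const Identifier*>;

// Add a pointer to every identifier in `scope` to the main index.
// The scope must outlive the index and its vectors must not be resized
// afterwards, otherwise the stored pointers dangle.
void merge_into(MainIndex& index, const Scope& scope) {
    for (const Identifier& id : scope.functions) index.emplace(id.name, &id);
    for (const Identifier& id : scope.variables) index.emplace(id.name, &id);
}
```

The dangling-pointer caveat in the comment is the main cost of this layout: anything that reparses a scope has to drop that scope's pointers from the main list first.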
There are a few more questions, of course (along with my own answers):
* What sort of information should be held about the various identifiers?
- type for variables
- function signature for functions/methods
- the other end of a typedef
- Classes should record their base classes (inheritance)
- Classes should record the access level (private, protected, or public) of every member
- line number of declaration/prototype/implementation
* What will the processor time be for all of this?
* How big will the memory footprint be?
* Are we missing anything?
- yes, but I can't put my finger on it...
* Are there any easily-foreseen problems/difficulties?
- Typedef walking (finding the members of an object whose type is a typedef)
- Array support
- Parameterized preprocessor definitions
- Unnamed namespaces. Sometimes I want to strangle the ANSI/ISO committee.
In addition, I just looked at the Visual Assist X website. They have some really good ideas there. So good, in fact, that I'd like to change number 2 from the main procedure to this:
2. Reduce the list to the most likely/probable solutions to the current identifier
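The simplest form of "reducing the list" is plain prefix matching against a sorted container. This is only a sketch of that baseline, not how Visual Assist X does it; the function name candidates and the int payload (a declaration line) are assumptions for the example.

```cpp
#include <map>
#include <string>
#include <vector>

// Return every known name that starts with `prefix`, in sorted order.
// `names` maps an identifier name to a hypothetical payload (here, the
// line number of its declaration).
std::vector<std::string> candidates(const std::map<std::string, int>& names,
                                    const std::string& prefix) {
    std::vector<std::string> out;
    // lower_bound jumps to the first name >= prefix, so only the matching
    // run of entries is visited rather than the whole list.
    for (auto it = names.lower_bound(prefix);
         it != names.end() &&
         it->first.compare(0, prefix.size(), prefix) == 0;
         ++it) {
        out.push_back(it->first);
    }
    return out;
}
```

With the names abs, print, printf and puts in the map, the prefix "pri" yields print and printf. Ranking by likelihood would be a separate step on top of this.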
I've written out more or less what we'll need to implement the structures. I figure there's one thing in common with all the elements, and since C++ lends itself to this so well, we should have a base class for all identifiers, which all of the various types can be derived classes of:
1. Identifiers (the base class)
class identifier
{
string name; // the identifier name
int decl_line; // line number of declaration (prototype for functions)
virtual string tooltip(void) = 0; // returns what a tooltip should display for the identifier
virtual string listname(void) = 0; // returns what the list name should look like
};
2. Variables
As much as it's tempting to add all sorts of flags about storage classes and modifiers, remember that a flag stored on every identifier increases the storage space by O(n). If we instead put the identifiers in a different list, the extra storage is O(1). Besides, the string is a descriptor, nothing more.
class variable : public identifier
{
string type; // the type of the variable
};
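For reference, here is roughly what the base class and the variable class could look like as real, compilable C++. The constructors, the public members, the virtual destructor and the "type name" tooltip format are all assumptions added for the sketch.

```cpp
#include <string>
#include <utility>

class identifier {
public:
    identifier(std::string n, int line)
        : name(std::move(n)), decl_line(line) {}
    // Needed because derived objects will be handled through base pointers.
    virtual ~identifier() = default;

    virtual std::string tooltip() const = 0;   // text for the tooltip
    virtual std::string listname() const = 0;  // text for the completion list

    std::string name;  // the identifier name
    int decl_line;     // line number of the declaration
};

class variable : public identifier {
public:
    variable(std::string n, int line, std::string t)
        : identifier(std::move(n), line), type(std::move(t)) {}

    // Assumed format: "<type> <name>".
    std::string tooltip() const override { return type + " " + name; }
    std::string listname() const override { return name; }

    std::string type;  // the type of the variable, kept as a plain string
};
```

A variable named count of type unsigned int would then show "unsigned int count" as its tooltip and "count" in the list.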
3. Enumerations
class enumeration : public identifier
{
};
4. Typedef
Yes, I'm aware typedef is a keyword. I don't particularly care (it's not real code)
class typedef : public identifier
{
string type; // the base type
};
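Typedef walking could start out as a simple chain lookup, sketched below. It assumes typedefs are stored as an alias-to-type map of plain names (so no pointers, qualifiers or templates yet), and resolve_typedef is a hypothetical name.

```cpp
#include <map>
#include <set>
#include <string>

// Follow a chain of typedefs until reaching a name that is not itself a
// typedef. `typedefs` maps an alias to the type it stands for.
std::string resolve_typedef(const std::map<std::string, std::string>& typedefs,
                            std::string type) {
    std::set<std::string> seen;  // stops the walk if the input is cyclic
    auto it = typedefs.find(type);
    while (it != typedefs.end() && seen.insert(type).second) {
        type = it->second;
        it = typedefs.find(type);
    }
    return type;
}
```

Once the walk ends on a class name, the member lists of that class can be offered for the typedef'd object.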
5. Method/Function
class function : public identifier
{
int impl_line; // line number of the definition
string returns; // the return type
string signature; // The parameter list
};
6. Preprocessor defs
I've got absolutely no clue how to handle parameterized macros
class preprocdef : public identifier
{
string macro; // the other side of the macro
};
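One possible starting point for parameterized macros is to store the parameter names next to the body. The sketch below relies on one rule I'm sure of: a macro is function-like only when the '(' follows the name with no whitespace in between. The names macro_def and parse_define are hypothetical, and line continuations and comments are ignored.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct macro_def {
    std::string name;
    bool function_like = false;
    std::vector<std::string> params;  // empty for object-like macros
    std::string body;                 // the other side of the macro
};

// Parse the text that follows "#define ", for example
// "MAX(a, b) ((a) > (b) ? (a) : (b))".
macro_def parse_define(const std::string& text) {
    macro_def m;
    std::size_t i = 0;
    // The name runs up to the first '(' or whitespace.
    while (i < text.size() && text[i] != '(' && text[i] != ' ' && text[i] != '\t') ++i;
    m.name = text.substr(0, i);
    if (i < text.size() && text[i] == '(') {
        m.function_like = true;
        std::size_t close = text.find(')', i);
        if (close == std::string::npos) close = text.size();
        std::string cur;
        for (std::size_t j = i + 1; j < close; ++j) {
            char c = text[j];
            if (c == ',') { m.params.push_back(cur); cur.clear(); }
            else if (c != ' ' && c != '\t') cur += c;
        }
        if (!cur.empty()) m.params.push_back(cur);
        i = close < text.size() ? close + 1 : close;
    }
    // Skip the whitespace between the name (or parameter list) and the body.
    while (i < text.size() && (text[i] == ' ' || text[i] == '\t')) ++i;
    m.body = text.substr(i);
    return m;
}
```

The tooltip for such a macro could then be built from name and params the same way a function signature would be.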
7. Namespaces
class namespace : public identifier
{
list variables;
list enumerations;
list typedefs;
list functions;
list classes;
list namespaces;
void using(identifier); // to support the using keyword (it brings something into the current namespace)
};
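The using support could be modelled by storing pointers to the imported entries rather than copies. This is a rough sketch under that assumption; ident, name_space and the three member functions are illustrative names, and treating a using-directive as "import everything" is a simplification that is probably good enough for completion lists but is not the real lookup rule.

```cpp
#include <map>
#include <string>

struct ident {
    std::string name;
    int decl_line;
};

struct name_space {
    std::map<std::string, ident> owned;            // declared in this namespace
    std::map<std::string, const ident*> imported;  // brought in by using

    // Models `using from::name;`. Returns false if `from` has no such name.
    bool using_declaration(const name_space& from, const std::string& name) {
        auto it = from.owned.find(name);
        if (it == from.owned.end()) return false;
        imported[name] = &it->second;
        return true;
    }

    // Models `using namespace from;` by importing every name it owns.
    void using_directive(const name_space& from) {
        for (const auto& entry : from.owned) imported[entry.first] = &entry.second;
    }

    // In this sketch, names declared here win over imported ones.
    const ident* lookup(const std::string& name) const {
        auto own = owned.find(name);
        if (own != owned.end()) return &own->second;
        auto imp = imported.find(name);
        return imp != imported.end() ? imp->second : nullptr;
    }
};
```

std::map is used for owned because its elements stay at a fixed address while other entries are added, which keeps the imported pointers valid.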
8. Classes
There's a slight problem here. Classes can be split into six sections: static or non-static, each crossed with public, protected, and private. And they're all relevant.
class class : public identifier
{
list base_classes;
list variables;
list enumerations;
list typedefs;
list functions;
list classes;
list static_variables;
list static_enumerations;
list static_typedefs;
list static_functions;
list static_classes;
// no namespace lists: a namespace cannot be declared inside a class
};
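One way to cover all six sections without a flag on every member is a small 2 x 3 grid of lists per member kind, indexed by static-ness and access level. This is only a sketch; the names access and member_lists are hypothetical, and members are reduced to plain name strings to keep it short.

```cpp
#include <array>
#include <string>
#include <vector>

enum access { access_public, access_protected, access_private };

// One instance of this per member kind (variables, functions, ...).
struct member_lists {
    // sections[is_static][access]: two rows (non-static, static) by
    // three columns (public, protected, private).
    std::array<std::array<std::vector<std::string>, 3>, 2> sections;

    void add(const std::string& name, bool is_static, access a) {
        sections[is_static][a].push_back(name);
    }

    const std::vector<std::string>& get(bool is_static, access a) const {
        return sections[is_static][a];
    }
};
```

The overhead is a fixed six lists per kind and per class, no matter how many members there are, and a lookup from outside the class only needs to read the two public columns.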
9. File
We only really want to cache external linkage (because the internal linkage of a file changes too quickly when we write code, and I'm not interested in telling the parser to reparse after every character typed).
class file
{
string filepath; // the filename + path (to open it quickly for reference use)
namespace global; // the global namespace
};
I'll check my books to see if there's anything I missed language-wise. So far the only things missing or unsupported are unnamed namespaces and parameterized macros.