Author Topic: Code completion doesnt follow #include in struct  (Read 31252 times)

Offline JGM

  • Lives here!
  • ****
  • Posts: 518
  • Got to practice :)
Re: Code completion doesnt follow #include in struct
« Reply #15 on: March 24, 2011, 06:14:00 pm »
So basically you created a definition file http://code.google.com/p/quexparser/source/browse/trunk/cppparser/cpp.qx and the quex library takes care of the rest?

Mmm, I'm going to read the manual ( http://voxel.dl.sourceforge.net/project/quex/HISTORY/Ancient/Documentation/quex-doc-09y08m01d.pdf ). It seems to be a great project.

I'm sending you my email in a private message to stop hijacking this thread  :).

Offline Ceniza

  • Developer
  • Lives here!
  • *****
  • Posts: 1441
    • CenizaSOFT
Re: Code completion doesnt follow #include in struct
« Reply #16 on: March 24, 2011, 06:54:42 pm »
Why do you keep insisting on creating an overcomplicated const expression evaluator? Evaluating a const expression after macro expansion is straightforward using the grammar from the standard. I have pointed to an implementation multiple times, but it is always ignored. All you have to do is feed it the tokens while skipping whitespace (comments belong here too, if retained).

Proper macro expansion is complicated, though, even more when handling concatenation, "stringification" and recursive replacements. Think of something like this:

Code
#include <cstdio>

#define STR_HELPER(x) # x
#define STR(x) STR_HELPER(x)
#define CONCAT2(x, y) x ## y
#define CONCAT3(x, y, z) CONCAT2(x ## y, z)

int main()
{
    std::printf(STR($%@!&*));
    std::printf(STR(CONCAT3(this, /* COMMENT */ is, a /* another comment */ test)));
}

How many of you can actually tell me what the program is supposed to show on screen without compiling and running it? What if I changed STR(x) to:

Code
#define STR(x) # x

Would it show the same?
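For reference, here is a minimal sketch of the difference between the two forms. It uses only plain identifiers in the stringified text, so the result is safe to compare. STR1 is a name made up here for the one-level variant, and two_level() and one_level() are hypothetical helpers, not part of the original program.

```cpp
#include <cassert>
#include <cstring>

#define STR_HELPER(x) # x
#define STR(x) STR_HELPER(x)      // two levels: the argument is expanded first
#define STR1(x) # x               // one level (hypothetical name): no expansion
#define CONCAT2(x, y) x ## y
#define CONCAT3(x, y, z) CONCAT2(x ## y, z)

// The argument is fully macro expanded before STR_HELPER stringifies it.
inline const char* two_level()
{
    return STR(CONCAT3(this, /* COMMENT */ is, a /* another comment */ test));
}

// The argument is an operand of #, so it is stringified without expansion.
// Each comment counts as whitespace, and whitespace between tokens
// collapses to a single space in the resulting string.
inline const char* one_level()
{
    return STR1(CONCAT3(this, /* COMMENT */ is, a /* another comment */ test));
}
```

With a conforming preprocessor, two_level() should yield "thisisa test" and one_level() should yield "CONCAT3(this, is, a test)", so the two definitions of STR do not behave the same.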

Offline Ceniza

« Reply #17 on: March 24, 2011, 08:56:43 pm »
There you go. The same code turned into something that should be easier to follow (it is, if you are not scared of pointers). All self-contained, no templates. 4 tests included. Most of the conversion from the original code was made through find-and-replace. The grammar is included as documentation of each method.

Offline JGM

« Reply #18 on: March 24, 2011, 09:17:59 pm »
So does this code work for evaluating macro expressions/conditions like these, for example?

Code
#if VERBOSE >= 2
  print("trace message");
#endif

#if !defined(WIN32) || defined(__MINGW32__)
...
#endif

I'm not sure if I'm using the correct terminology (sorry for that, I'm a dumb xD). It would be easier to talk to you in Spanish xD, there is a lot of parsing terminology I'm not familiar with :(

Offline JGM

« Reply #19 on: March 25, 2011, 02:55:16 am »
Well, I documented and cleaned up the code to some extent. Here are the changes:

http://www.mediafire.com/?17skj2g70c86u50

I also included a test case in the test folder. As my development environment is Ubuntu/Linux, I created a shell script, test/test.sh. This script uses the debug binary of the cpp_parser library and parses the test.hpp file, also in the test folder. (I took the example presented in this thread of an #include inside a typedef struct and made it part of the test case.)

The original code in test.hpp is this:

Code
#ifndef TEST_HPP
#define TEST_HPP

//Has multiply macro
#include "misc.h"

#ifndef MAX_VALUE
#define MAX_VALUE 10000
#endif

#ifndef MAX_VALUE
#define MAX_VALUE ShouldNotOccurre
#endif

typedef struct
{
//The content of the struct in another file
#include "object.h"
//Should only be included once
#include "object.h"
} Object;

namespace test
{
    int value = MAX_VALUE;
    int test = multiply;

    /**
     * Function one documentation
     * @return true otherwise false
     */
    bool function_one(const char &argument);

    /**
     * Function two documentation
     * @return The amount of characters found
     */
    unsigned int function_two(const char &argument);
};

//We undefine the MAX_VALUE macro
#undef MAX_VALUE

int value = MAX_VALUE;

#endif

And the cpp_parser binary returns this:

Code

 //Has multiply macro




typedef struct
{
  //The content of the struct in another file


unsigned value;
string test;


  //Should only be included once
} Object;

namespace test
{
    int value = 10000;
    int test = 4*5;
  
    /**
     * Function one documentation
     * @return true otherwise false
     */                                                                            
    bool function_one(const char &argument);

    /**
     * Function two documentation
     * @return The amount of characters found
     */                                                                                      
    unsigned int function_two(const char &argument);
};

 //We undefine the MAX_VALUE macro

int value = MAX_VALUE;

There are things left to implement and fix, but for now the basic functionality is working :D
« Last Edit: March 25, 2011, 03:00:51 am by JGM »

Offline Ceniza

« Reply #20 on: March 25, 2011, 06:04:05 pm »
So does this code work for evaluating macro expressions/conditions like these, for example?

Code
#if VERBOSE >= 2
  print("trace message");
#endif

#if !defined(WIN32) || defined(__MINGW32__)
...
#endif

I'm not sure if I'm using the correct terminology (sorry for that, I'm a dumb xD). It would be easier to talk to you in Spanish xD, there is a lot of parsing terminology I'm not familiar with :(

You need to tokenize and fully macro expand everything before feeding the evaluator.

For the first case you would need to expand VERBOSE to whatever its value is. Supposing it expands to '1', you would feed it:

Code
[ttNumber, "1"][ttWhiteSpace, " "][ttGreaterEqual, ">="][ttWhiteSpace, " "][ttNumber, "2"][ttEndOfTokens, ""]

For the second case you would need to expand defined(WIN32) and defined(__MINGW32__). Supposing both are defined, you would feed it:

Code
[ttNot, "!"][ttNumber, "1"][ttWhiteSpace, " "][ttOr, "||"][ttWhiteSpace, " "][ttNumber, "1"][ttEndOfTokens, ""]

Since it is a conditional (#if), all you care about is whether the result is 0 or not.

Instead of ttEndOfTokens as finalization, ttNewLine could also be added and handled just the same (it must be added to the list of token types too).

If you want to learn more about the preprocessor in order to know what really needs to be implemented, check the C++0x draft chapters 2 (Lexical conventions) and 16 (Preprocessing directives) here.
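To make the idea of "feeding tokens" concrete, here is a minimal sketch of a recursive-descent evaluator covering only the operators used in the two examples above (||, >= and unary !). It is not the implementation mentioned in this thread. The token type names come from the post, while the Token struct, the MiniEvaluator class and evaluate_condition are assumptions made for illustration. The token list is assumed to end with ttEndOfTokens.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Token type names follow the post; the struct layout is an assumption.
enum TokenType { ttNumber, ttWhiteSpace, ttNot, ttOr, ttGreaterEqual, ttEndOfTokens };

struct Token
{
    TokenType type;
    std::string text;
};

// Evaluates a tiny subset of the #if grammar: ||, >=, unary ! and numbers.
class MiniEvaluator
{
public:
    explicit MiniEvaluator(const std::vector<Token>& tokens) : m_tokens(tokens), m_pos(0) {}

    long evaluate() { return logical_or(); }

private:
    // Returns the current significant token, skipping whitespace.
    const Token& peek()
    {
        while (m_tokens[m_pos].type == ttWhiteSpace) ++m_pos;
        return m_tokens[m_pos];
    }

    // logical_or ::= relational ( "||" relational )*
    long logical_or()
    {
        long value = relational();
        while (peek().type == ttOr)
        {
            ++m_pos;
            const long rhs = relational();
            value = (value != 0 || rhs != 0) ? 1 : 0;
        }
        return value;
    }

    // relational ::= unary ( ">=" unary )*
    long relational()
    {
        long value = unary();
        while (peek().type == ttGreaterEqual)
        {
            ++m_pos;
            const long rhs = unary();
            value = (value >= rhs) ? 1 : 0;
        }
        return value;
    }

    // unary ::= "!" unary | number
    long unary()
    {
        if (peek().type == ttNot)
        {
            ++m_pos;
            return unary() == 0 ? 1 : 0;
        }
        if (peek().type == ttNumber)
        {
            return std::strtol(m_tokens[m_pos++].text.c_str(), nullptr, 10);
        }
        return 0; // malformed input or end of tokens: treated as 0 in this sketch
    }

    std::vector<Token> m_tokens;
    std::size_t m_pos;
};

// Convenience wrapper: true when the #if branch should be taken.
inline bool evaluate_condition(const std::vector<Token>& tokens)
{
    return MiniEvaluator(tokens).evaluate() != 0;
}
```

A full evaluator would add the remaining operators from the standard's grammar (arithmetic, bitwise, &&, ?:, parentheses) in the same style, one method per grammar rule.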

Offline JGM

« Reply #21 on: March 25, 2011, 11:12:31 pm »
Thanks for your guidance! I need to fix some things so that the preprocessed code is generated at the exact original line positions, but without the macros, for normal parsing (to get the same line numbers and columns as the original source code; I'm not sure it is really necessary). Also, I created a parse expression function that always returns true, since I didn't have code to do that (I was planning to write it from scratch xD). So is the code you wrote GPL (in other words, can I use it xD)? I also need to make the evaluation of macro values recursive (as you mentioned before) for cases like:

#define blah(x) x*2
#define blah2(y) blah(y)
#define blah3(z) blah2(z)

Another thing I need to implement is the conversion of expressions like x ## x or # x to a valid string or an empty string, since this data (I think) is not necessary for general parsing.

This is the code that actually handles the processing:

Code
for(unsigned int position=0; position<lines.size(); position++)
        {
            vector<preprocessor_token> tokens = lines[position];

            //Parse macro
            if(tokens[0].token == "#")
            {
                if(deepness == 0 || (deepness > 0 && last_condition_return[deepness]))
                {
                    if(tokens[1].token == "define")
                    {
                        define definition = parse_define(strip_macro_definition(tokens));
                        definition.file = file;
                        definition.line = tokens[2].line;
                        definition.column = tokens[2].column;
                        m_local_defines.push_back(definition);
                    }
                    else if(tokens[1].token == "include")
                    {
                        string include_enclosure = tokens[2].token;
                        string include_file = "";
                        file_scope header_scope;

                        if(include_enclosure == "<")
                        {
                            for(unsigned int i=3; i<tokens.size(); i++)
                            {
                                if(tokens[i].token == ">")
                                {
                                    break;
                                }
                                else
                                {
                                    include_file += tokens[i].token;
                                }
                            }

                            m_headers_scope[include_file] = global;
                            header_scope = global;
                        }
                        else
                        {
                            for(unsigned int i=1; i<tokens[2].token.size(); i++)
                            {
                                if(tokens[2].token.at(i) == '"')
                                {
                                    break;
                                }
                                else
                                {
                                    include_file += tokens[2].token.at(i);
                                }
                            }

                            m_headers_scope[include_file] = local;
                            header_scope = local;
                        }

                        if(!is_header_parsed(include_file))
                        {
                            output += parse_file(include_file, header_scope); //To output the processed headers code
                            //parse_file(include_file, header_scope); //parses header without outputting

                            m_headers.push_back(include_file);
                        }
                    }
                    else if(tokens[1].token == "undef")
                    {
                        remove_define(tokens[2].token);
                    }
                    else if(tokens[1].token == "ifdef")
                    {
                        deepness++;
                        if(is_defined(tokens[2].token))
                        {
                            last_condition_return[deepness] = true;
                        }
                        else
                        {
                            last_condition_return[deepness] = false;
                        }
                    }
                    else if(tokens[1].token == "ifndef")
                    {
                        deepness++;
                        if(!is_defined(tokens[2].token))
                        {
                            last_condition_return[deepness] = true;
                        }
                        else
                        {
                            last_condition_return[deepness] = false;
                        }
                    }
                    else if(tokens[1].token == "if")
                    {
                        deepness++;
                        last_condition_return[deepness] = parse_expression(strip_macro_definition(tokens));
                    }
                }

                if(deepness > 0 && (tokens[1].token == "elif" || tokens[1].token == "else" || tokens[1].token == "endif"))
                {
                    if(tokens[1].token == "elif" && last_condition_return[deepness] != true)
                    {
                        last_condition_return[deepness] = parse_expression(strip_macro_definition(tokens));
                    }
                    else if(tokens[1].token == "else" && last_condition_return[deepness] != true)
                    {
                        last_condition_return[deepness] = true;
                    }
                    else if(tokens[1].token == "endif")
                    {
                        last_condition_return.erase(last_condition_return.find(deepness));
                        deepness--;
                    }
                }
            }

            //Parse code
            else
            {
                if(deepness == 0 || (deepness > 0 && last_condition_return[deepness]))
                {
                    unsigned int column = 1;

                    for(unsigned int i=0; i<tokens.size(); i++)
                    {
                        unsigned int columns_to_jump = tokens[i].column - column;

                        if(tokens[i].column <= 0)
                        {
                            columns_to_jump = 0;
                        }
                        else if(tokens[i].column < column)
                        {
                            columns_to_jump = column - tokens[i].column;
                        }

                        for(unsigned int y=0; y<columns_to_jump; y++)
                        {
                            output += " ";
                        }

                        if(tokens[i].type == identifier && is_defined(tokens[i].token))
                        {
                            output += get_define(tokens[i].token).value;
                        }
                        else
                        {
                            output += tokens[i].token;
                        }

                        column = tokens[i].column + tokens[i].token.size();
                    }

                    output += "\n";
                }
            }
        }

return output;

As you can see, the code already handles nested macros correctly, as well as the basic ones (#define, #undef, #include, #ifdef, #ifndef), and with your code I would implement the parse_expression function (as I said, it just returns true for now, with no evaluation) for #if and #elif evaluation. The tricky part is going to be the recursive evaluation of macros.
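One detail a loop like this has to get right is #else and #elif after a branch that was already taken, and conditionals that open inside a skipped region. As far as I can tell, the loop above leaves the flag set when #else follows a true #if, and it does not count an #ifdef found inside a skipped block, so its #endif closes the wrong level. A minimal sketch of one way to track this is below. ConditionalFrame and ConditionalStack are hypothetical names, not part of the posted code.

```cpp
#include <cassert>
#include <vector>

// One entry per open #if/#ifdef/#ifndef (hypothetical helper).
struct ConditionalFrame
{
    bool parent_active; // was the enclosing region being emitted?
    bool branch_taken;  // has any branch of this conditional been selected yet?
    bool active;        // is the current branch being emitted?
};

class ConditionalStack
{
public:
    // True when code at the current position should be emitted.
    bool active() const { return m_frames.empty() || m_frames.back().active; }

    // Call for #if, #ifdef and #ifndef, even inside a skipped region,
    // so that the matching #endif pops the right level.
    void on_if(bool condition)
    {
        const bool parent = active();
        const bool taken = parent && condition;
        m_frames.push_back(ConditionalFrame{parent, taken, taken});
    }

    // A branch is selected only if no earlier branch was taken.
    void on_elif(bool condition)
    {
        if (m_frames.empty()) return; // stray #elif: ignored in this sketch
        ConditionalFrame& frame = m_frames.back();
        frame.active = frame.parent_active && !frame.branch_taken && condition;
        frame.branch_taken = frame.branch_taken || frame.active;
    }

    // #else behaves like an #elif whose condition is always true.
    void on_else() { on_elif(true); }

    void on_endif()
    {
        if (!m_frames.empty()) m_frames.pop_back();
    }

private:
    std::vector<ConditionalFrame> m_frames;
};
```

In the loop, `deepness` and `last_condition_return` would be replaced by one such stack, and the "should I emit this line" test becomes a call to active(). A real preprocessor would also avoid evaluating the #elif expression at all when the group is being skipped.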

I think I'm worrying too much about printing the parsed code on the same lines, since that is impossible in the case of multi-line macros like:

#define declare_table() class table{ \
int blah;\
};

These kinds of macros are going to affect the line numbering of the output code.

My worry about keeping the same line positions came from wanting to use the library for refactoring as well, but a solution could be thought of later, I guess.

I will try to read the pages you mentioned of the standard draft xD (I bought an e-reader to accompany me on the nights xD)

Thanks again for your feedback!

Edit: I just had a quick look at the C++ draft and I had completely forgotten about #line, #pragma and #error  :shock: but, well, I think these directives can be safely skipped, except for #pragma, which may include headers or things like that. What a pain  :lol:

Edit: Trigraph sequences - I knew about them, but who would use that???  xD Mmm, I also forgot about alternative tokens :P,
uhhh, and I also didn't think about #include MACRO :S. Well, this post will remind me of the things to do :D
« Last Edit: March 26, 2011, 12:19:05 am by JGM »

Offline Ceniza

« Reply #22 on: March 26, 2011, 10:53:55 am »
... so does the code you did is GPL (in other words can I use it xD)?

It is not GPL, it is more like "do as you please, but if it kills your dog do not blame it on me" kind of license. I think it can be mixed with code under the GPL, but you better ask at least 3 lawyers to be sure :P

I would not recommend skipping #line as certain tools make use of it (mostly those that produce code, like lex/yacc or preprocessors themselves), and handling #error would be neat because you could inform the user about it way before hitting the 'build' button (as long as the files are always properly parsed to avoid false positives).

Keeping track of line numbers and files is, of course, extremely important. After all, the idea is for Code::Blocks to make use of it, and that information is vital. I think that making a list of the whole set of tools that are to be implemented, and of what is needed for each one of them, is the way to go to really know how fine-grained the stored line numbering needs to be.

Offline JGM

« Reply #23 on: March 26, 2011, 08:20:46 pm »
It is not GPL, it is more like "do as you please, but if it kills your dog do not blame it on me" kind of license. I think it can be mixed with code under the GPL, but you better ask at least 3 lawyers to be sure :P

that's scary xD

I would not recommend skipping #line as certain tools make use of it (mostly those that produce code, like lex/yacc or preprocessors themselves), and handling #error would be neat because you could inform the user about it way before hitting the 'build' button (as long as the files are always properly parsed to avoid false positives).

Mmm so with #error the library should throw an exception.

Keeping track of line numbers and files is, of course, extremely important. After all, the idea is for Code::Blocks to make use of it, and that information is vital. I think that making a list of the whole set of tools that are to be implemented, and of what is needed for each one of them, is the way to go to really know how fine-grained the stored line numbering needs to be.

Well, for now, when the code is first tokenized, columns and line numbers are stored correctly. What I mean is the point where the preprocessed code is output for full parsing (lexical analysis?).

The output code would need to be re-parsed with the issue of line numbers modified from original source, unless associations are made to previously tokenized original source.

Let's say we have this original code:

Code
#include <something.h>
#define class_blah class test {\
char variable[50];\
};

class_blah

But the output of this would look different:
Code
class something{
float test;
};

class test {
char variable[50];
};

It would parse as it should, but losing the original positions. We would still know in which files the class definitions were found, but with incorrect line numbers and probably columns. My tiny brain can't think of a solution xD

Offline Ceniza

« Reply #24 on: March 26, 2011, 10:23:40 pm »
Mmm so with #error the library should throw an exception.

Not necessarily. It could just store it somewhere for later retrieval. The parsing should continue in case it is a false positive.
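A minimal sketch of the "store it for later retrieval" approach, assuming a simple list of diagnostics that the caller inspects after parsing has finished. Diagnostic and DiagnosticLog are hypothetical names, not part of the library discussed here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One recorded #error directive (hypothetical layout).
struct Diagnostic
{
    std::string file;
    unsigned line;
    std::string message;
};

// Collects diagnostics instead of throwing, so that parsing can continue.
class DiagnosticLog
{
public:
    void report(const std::string& file, unsigned line, const std::string& message)
    {
        m_items.push_back(Diagnostic{file, line, message});
    }

    const std::vector<Diagnostic>& items() const { return m_items; }

private:
    std::vector<Diagnostic> m_items;
};
```

The preprocessor would call report() when it reaches an #error in an active region and then carry on with the next line. The IDE can show or discard the collected items later.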

The preprocessing stage should output tokens, not text. The C++ parser's lexer job would be extremely simple: concatenate string literals into a single string literal token and turn numbers into either integral or floating-point tokens (flags may be needed to specify full type: unsigned, short, int, long, long long, float, double, long double). Identifiers could be turned into keywords here as well if not done before. Every other token would just pass through to the syntax analysis stage.

This is what the whole thing would, roughly, look like:

Preprocessor's Lexer -> Preprocessor -> Lexer -> Syntax analysis + symtab generation -> Semantic analysis.

Preprocessor's Lexer: Turns text into preprocessor tokens. Integral and floating-point values would be just "numbers". Keywords should be read as plain identifiers since the preprocessor does not care about them being a separate thing. File and line information is retrieved here.
Preprocessor: Resolves directives (#include, #if*, ...), discards tokens and builds new tokens when necessary (## and # operations). White spaces (space, newline, comments, ...) are, in theory, discarded as well.
Lexer: Converts "numbers" into proper tokens, concatenates contiguous string literals into a single string literal token and turns identifiers into keywords (the ones that are actually keywords, of course).
Syntax analysis: Checks that everything is properly "written" (class_decl ::= ttClass ttIdentifier ttSemiColon). An Abstract Syntax Tree can be built here, plus a symbols table.
Semantic analysis: Checks that everything makes sense: x = 3; // Is x a symbol in the current or a parent scope? Can it be assigned an integral type in any way (x is not const, x is integral, x has an overload of operator = that can be used, 3 can be turned into x's type and assigned, ...)?

That means some token types would not be seen by the preprocessor because its lexer would not produce them, most token types specifically for the preprocessor would have been consumed before reaching the lexer (at the next stage), and those few ones reaching it would be converted before being fed to the syntax analysis stage.

I hope it is clear enough, despite its "roughness".
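A minimal sketch of the "Lexer" stage described above, assuming preprocessor tokens that already carry their file and line. It turns identifiers into keywords and merges adjacent string literals, and it keeps the position of the first token of each result. PPToken, LexToken, lex and the short keyword list are assumptions made for illustration. Classifying numbers as integral or floating-point is left out.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

enum PPKind { ppIdentifier, ppNumber, ppString, ppPunctuator };

// Output of the preprocessor (hypothetical layout).
struct PPToken
{
    PPKind kind;
    std::string text; // for ppString: the contents without the quotes (assumption)
    std::string file;
    unsigned line;
};

enum TokenKind { tkKeyword, tkIdentifier, tkNumber, tkString, tkPunctuator };

// Input of the syntax analysis stage (hypothetical layout).
struct LexToken
{
    TokenKind kind;
    std::string text;
    std::string file;
    unsigned line;
};

inline std::vector<LexToken> lex(const std::vector<PPToken>& input)
{
    // Partial keyword list, for illustration only.
    static const std::set<std::string> keywords = {"class", "float", "int", "return"};

    std::vector<LexToken> output;
    for (const PPToken& pp : input)
    {
        // Adjacent string literals are merged into the previous token,
        // which keeps the file and line of the first literal.
        if (pp.kind == ppString && !output.empty() && output.back().kind == tkString)
        {
            output.back().text += pp.text;
            continue;
        }

        TokenKind kind = tkPunctuator;
        if (pp.kind == ppIdentifier)
            kind = keywords.count(pp.text) ? tkKeyword : tkIdentifier;
        else if (pp.kind == ppNumber)
            kind = tkNumber;
        else if (pp.kind == ppString)
            kind = tkString;

        output.push_back(LexToken{kind, pp.text, pp.file, pp.line});
    }
    return output;
}
```

Because every token keeps its original file and line, expanding a multi-line macro or an #include no longer shifts the positions seen by later stages.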

Offline JGM

« Reply #25 on: March 27, 2011, 12:00:13 am »
Not necessarily. It could just store it somewhere for later retrieval. The parsing should continue in case it is a false positive.

Yep, I thought that after implementing the ErrorException class :D

The preprocessing stage should output tokens, not text.

Ahhh true, and I kept thinking how to do it lol

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5915
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: Code completion doesnt follow #include in struct
« Reply #26 on: March 27, 2011, 07:57:42 am »
Very nice info.
But I think things get more complex when parsing C++, because the C++ language is not context-free, so syntax analysis cannot build the correct tree on its own; it needs semantic information. So we cannot just write a Bison grammar to parse C++ code, because syntax and semantics have to be combined.

And on the preprocessor side, checking an identifier (to see whether it is a keyword or a general variable or function name) is really time-consuming and context-sensitive. If we skip expanding the #include directives, we always get a partial preprocessing result, so macros may be lost, and any further #error messages will not be correct either.

How do we avoid parsing a header file time after time? Do we have a PCH-like mechanism?
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Offline Ceniza

« Reply #27 on: March 27, 2011, 10:53:03 am »
But I think things get more complex when parsing C++, because the C++ language is not context-free, so syntax analysis cannot build the correct tree on its own; it needs semantic information. So we cannot just write a Bison grammar to parse C++ code, because syntax and semantics have to be combined.

I think you are getting the job of the semantic analysis wrong. Let us say we have this code:

Code
float x = "a value";
++x;

You can build an AST from that, and the symtab will have that x is of type float. When you run the semantic analysis on that is when you will find that both lines have problems: assigning string literal to float, and pre-incrementing a float.

And on the preprocessor side, checking an identifier (to see whether it is a keyword or a general variable or function name) is really time-consuming and context-sensitive. If we skip expanding the #include directives, we always get a partial preprocessing result, so macros may be lost, and any further #error messages will not be correct either.

Right, I totally forgot to specify that.

During the preprocessing stage you need to build a macro replacements map, or whatever you want to call it. It would register the identifier that follows every #define as an "identifier to be replaced afterwards" or, simply, a "macro". That map would be indexed by the identifier (the macro's name), and store the macro's type (plain macro or function-like macro), its parameters (for function-like macros), and the plain sequence of tokens that follow (the replacement). Keep in mind that that sequence of tokens must NOT be macro expanded when stored.

When the preprocessor finds an identifier, it will search for it in the map. If it is found, build the list of parameters (each parameter being a sequence of non-expanded tokens) (in case it is a function-like macro), and proceed to do the replacement (expand it) over and over again until no more replacements are made. During this stage you need to keep a sort of call stack to properly handle what could otherwise become a recursive replacement (probably leading to an endless loop). Recursion is something the preprocessor must not do (check the standard).
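A minimal sketch of such a map and of the "call stack" guard, limited to plain (object-like) macros and with tokens reduced to their spelling. Macro, MacroMap, expand and expand_all are hypothetical names. The standard's exact rescanning rules, function-like macros, # and ## are deliberately left out.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

typedef std::vector<std::string> TokenList; // tokens reduced to their spelling

struct Macro
{
    bool function_like;                  // stored only; not expanded by this sketch
    std::vector<std::string> parameters; // for function-like macros
    TokenList replacement;               // stored unexpanded
};

typedef std::map<std::string, Macro> MacroMap; // indexed by the macro's name

// Expands plain macros in 'input', appending the result to 'output'.
// 'expanding' holds the macros currently being replaced, so a macro that
// refers to itself (directly or indirectly) is left alone instead of looping.
inline void expand(const TokenList& input, const MacroMap& macros,
                   std::set<std::string>& expanding, TokenList& output)
{
    for (const std::string& token : input)
    {
        MacroMap::const_iterator found = macros.find(token);
        const bool replaceable = found != macros.end()
            && !found->second.function_like
            && expanding.count(token) == 0;
        if (!replaceable)
        {
            output.push_back(token);
            continue;
        }
        expanding.insert(token);
        expand(found->second.replacement, macros, expanding, output);
        expanding.erase(token);
    }
}

inline TokenList expand_all(const TokenList& input, const MacroMap& macros)
{
    std::set<std::string> expanding;
    TokenList output;
    expand(input, macros, expanding, output);
    return output;
}
```

With `#define A B` and `#define B A 1`, expanding `A` in this sketch stops at `A 1` because A is still on the stack when it reappears.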

How do we avoid parsing a header file time after time? Do we have a PCH-like mechanism?

Well, you could store the result of preprocessing any file found through a #include. It would be indexed by the full file location, "sub-indexed" by the context plus dependencies, store the macro replacements map and the output (the final list of tokens).

The "sub-indexing" is important for a proper handling. The 'context plus dependencies' refers to all macros that were defined just before the file was #include'd, and their values. It is also important to know which other macros would cause the header to produce a different output (due to #if*). It is rather tricky to get it right, and it may cause the parsing to be a lot slower, although quite accurate. That is why, when programming, preprocessed headers should always be the first ones to be included (so they carry in as little context as possible).

In order to improve speed, as well as to simplify the implementation, the "sub-indexing" could be discarded. In other words: parse it once, store it like that, do not care about context. Handling multiple inclusion turns into the annoying part, though (as per this topic, you may want it, but, most of the time, you will not). [We are still on topic :P]
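The two strategies can be sketched with one cache whose key either includes the context or ignores it. This is only an illustration under simplifying assumptions: the context is represented as a map from macro name to replacement text, and HeaderCache, PreprocessedHeader and MacroValues are hypothetical names. A careful implementation would key only on the macros the header actually depends on.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::map<std::string, std::string> MacroValues; // macro name -> replacement text

// Stored result of preprocessing one header (hypothetical layout).
struct PreprocessedHeader
{
    MacroValues defines;             // macros defined by the header
    std::vector<std::string> tokens; // final list of tokens, as spellings
};

class HeaderCache
{
public:
    // use_context == true:  index by file and context ("sub-indexing").
    // use_context == false: parse once, store it, do not care about context.
    explicit HeaderCache(bool use_context) : m_use_context(use_context) {}

    // Returns the stored result, or nullptr when the header must be parsed.
    const PreprocessedHeader* find(const std::string& path, const MacroValues& context) const
    {
        Entries::const_iterator it = m_entries.find(make_key(path, context));
        return it == m_entries.end() ? nullptr : &it->second;
    }

    void store(const std::string& path, const MacroValues& context,
               const PreprocessedHeader& result)
    {
        m_entries[make_key(path, context)] = result;
    }

private:
    typedef std::pair<std::string, MacroValues> Key;
    typedef std::map<Key, PreprocessedHeader> Entries;

    Key make_key(const std::string& path, const MacroValues& context) const
    {
        return Key(path, m_use_context ? context : MacroValues());
    }

    bool m_use_context;
    Entries m_entries;
};
```

With use_context set to false, a second inclusion under different macros returns the first result, which is the speed-for-accuracy trade described above.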

"Stable" header files (like those that come with the compiler) should be parsed once and stored. You do not want to parse them every single time.

The last two paragraphs, as far as I know, describe how the guys at Whole Tomato do it for Visual Assist X.

Offline ollydbg

« Reply #28 on: March 27, 2011, 02:36:37 pm »
I think you are getting the job of the semantic analysis wrong. Let us say we have this code:

Code
float x = "a value";
++x;

You can build an AST from that, and the symtab will have that x is of type float. When you run the semantic analysis on that is when you will find that both lines have problems: assigning string literal to float, and pre-incrementing a float.

No, I have read some posts and threads on the web about this. Look here:
7 Dealing with Ambiguities
There is also much more about parsing template instantiation code.
This always needs semantic (type) information about the current identifier to support the syntax analysis.

Offline Ceniza

« Reply #29 on: March 27, 2011, 04:48:29 pm »
Quote
The CDT parsers do not compute type information, the result being that some language constructs can be ambiguous, ...

According to that, they delay extracting type information until the semantic analysis stage, which is not practical for C/C++. That is completely unnecessary as the syntax analysis stage knows well what introduces new types. Since the symtab is populated as you build the AST, and you can also populate a "typetab", you can query information right away. It is, therefore, possible to know if x * y is a statement or an expression depending on whether or not x is in the symtab/"typetab". Otherwise, you will have to do the kind of trickery (ambiguity nodes) the CDT guys did.
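A minimal sketch of that lookup, assuming the parser records every name that introduces a type as soon as it sees the declaration. TypeTable and classify_star are hypothetical names, and scopes are ignored.

```cpp
#include <cassert>
#include <set>
#include <string>

enum StatementKind { skDeclaration, skExpression };

// Names known to denote types at the current point of the parse ("typetab").
class TypeTable
{
public:
    void add_type(const std::string& name) { m_types.insert(name); }
    bool is_type(const std::string& name) const { return m_types.count(name) != 0; }

private:
    std::set<std::string> m_types;
};

// Classifies "lhs * rhs;": a pointer declaration when lhs names a type,
// a multiplication expression otherwise.
inline StatementKind classify_star(const TypeTable& types, const std::string& lhs)
{
    return types.is_type(lhs) ? skDeclaration : skExpression;
}
```

The syntax analysis stage would call add_type() when it parses a class, struct, enum or typedef, and query the table whenever it meets an identifier in an ambiguous position.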

Templates, on the other hand, require you to use the keyword typename to solve the ambiguity in favor of a statement, otherwise it is an expression.