Author Topic: Check macro usage on every identifier like token: issues and discussions (Read 50039 times)

ollydbg · « **on:** November 10, 2014, 07:41:56 am »

We have already enabling some kind of macro usage check in our parser(see rev9932), but that's not fully.
My idea was try to enable the macro checking on each identifier like tokens, I did some tests, and it won't slow down much of the parsing speed, the result is quite good.
But we still have some issues to solve
1, we have different build target, and different build target have different C Preprocessor definitions, if we still parse the whole C::B project, we can get bad result.
E.g. In Codeblocks.cbp, we have one macro definition in one target

Code

#define type(obj) ((obj)._type)

or even bad

Code

#define type xxx

And we have some other target which can have some code like:

Code

        GdbCmd_SetCatch(DebuggerDriver *driver, const wxString &type, int *resultIndex)

In this case, the "type" was replaced to un-wanted tokens.

2, we have some code like:

Code

// Assumes the buffer bitmap only covers the client area;
// does not prepare the window DC
#define wxBUFFER_CLIENT_AREA        0x02

class WXDLLEXPORT wxBufferedDC : public wxMemoryDC
{
public:
    // Default ctor, must subsequently call Init for two stage construction.
    wxBufferedDC()
        : m_dc(NULL),
          m_buffer(NULL),
          m_style(0)
    {
    }

    // Construct a wxBufferedDC using a user supplied buffer.
    wxBufferedDC(wxDC *dc,
                 wxBitmap& buffer = wxNullBitmap,
                 int style = wxBUFFER_CLIENT_AREA)
        : m_dc(NULL), m_buffer(NULL)
    {
        Init(dc, buffer, style);
    }

When parsing the function arguments(parameters), if macro expansion is enabled on every tokens, we will get

Code

(wxDC *dc, wxBitmap& buffer = wxNullBitmap, int style = 0x02)

instead of

Code

(wxDC *dc, wxBitmap& buffer = wxNullBitmap, int style = wxBUFFER_CLIENT_ARE)

What would you like to see? The former or the later?

3, In the further, I would like to see that all the macro handling was put in the Tokenizer class. (Currently, we have macro handling both in Parserthread and Tokenizer class).

4, Too speed up the macro definition search, I think a hash table (e.g. wxWidgets: wxHashMap Class Reference) is preferred. (Currently, the macro definition is stored in the TokenTree, which is actually a Trie - Wikipedia, the free encyclopedia)

EDIT: Even in the same build target, we can still have different C preprocessor definitions for different translation file, so this seems not easy to solve. One method is that we try to maintain macro definition for each file, or create a include file hierarchy.

EDIT2:
Patches (git patches serial) attached, comments were welcome.

EDIT3:
About issue 1, I think we have discussed here, see Huki's comments Re: Several improvements to Code Completion plugin and Re: Several improvements to Code Completion plugin

Huki · « **Reply #1 on:** April 09, 2015, 07:11:18 pm »

Quote from: ollydbg on April 06, 2015, 02:51:59 am

Hi, Huki, I attach the git patch serials against the latest trunk rev 10182.
Comments are welcome

I've attached some bug fix patches against your last cc_macro serials. Some description:

Fixed = [] , handling when parsing variable declaration. i.e.,

Code: assignment parsing

int var1 = 0; // this var has an assignment
int var2 = 0; // mouse on var2 -> we get "intint var2"
int var3 = 0; // mouse on var3 -> we get "intintint var3"

Code: comma parsing

int num=0, num2=0, num3=0;
// mouse on each variable and we get:
// num -> "int num"
// num2 -> "int, num2"
// num3 -> "int,, num3"

Code: arrays parsing

int array[5]; // vars with [] are not parsed

The rest of the patches should already be clear.

ollydbg · « **Reply #2 on:** April 11, 2015, 04:44:20 pm »

Quote from: Huki on April 09, 2015, 07:11:18 pm

Quote from: ollydbg on April 06, 2015, 02:51:59 am
Hi, Huki, I attach the git patch serials against the latest trunk rev 10182.
Comments are welcome
I've attached some bug fix patches against your last cc_macro serials. Some description:
...

Very nice work, Huki, thanks. I have them applied in my local copy, it fixes some errors of my patch serials.
BTW: I see a trailing spaces in a patch when I run the "git am" command

Code

Applying: CC: fix variable parsing with "=" or "[]"
f:/cb_sf_git/trunk/.git/rebase-apply/patch:52: trailing whitespace.
                         || peek==ParserConsts::oparray_chr
warning: 1 line adds whitespace errors.

Do you have the "remove trailing space" option enabled in C::B?

Any suggestion is that the commit message should have a "*" if it is a major function change or "-" if it is a minor change.
So, it should be:

Code

* CC: xxxxxxx

detailed descriptions

You can wrote as many detailed text as possible in the commit log messages, E.g. see my commit r10157 as an example. I just follow Linux's commit message style, which means a simple line as a subject, and after an empty line, write details, like this: Git Commit Messages : 50/72 Formatting.

EDIT: I have fixed all the mentioned issue above in my local copy, so you don't need to post the patches again.

MortenMacFly · « **Reply #3 on:** April 11, 2015, 10:13:48 pm »

Well guys it would be nice to follow but the patches don't apply in svn. Any chance you make svn compatible patches or a svn branch or work together on a git repo? The repo that I was referring to is terribly old... Or am I missing something?

ollydbg · « **Reply #4 on:** April 14, 2015, 06:25:23 am »

Quote from: MortenMacFly on April 11, 2015, 10:13:48 pm

Well guys it would be nice to follow but the patches don't apply in svn. Any chance you make svn compatible patches or a svn branch or work together on a git repo? The repo that I was referring to is terribly old... Or am I missing something?

Hi, Morten, I attach the svn style patch which is created from this script.

MortenMacFly · « **Reply #5 on:** April 14, 2015, 08:22:39 am »

Quote from: ollydbg on April 14, 2015, 06:25:23 am

Hi, Morten, I attach the svn style patch which is created from this script.

That is nice, but why don't you (two) just work on a GIT repo? I could check this out via SVN and create patches myself. However, it should need to stay in sync with trunk though - so a bit more work is needed. On the pro side, Huki can commit directly and you can sync yourself with him easily.

It thought thats what the branch in the "OBF repo" was for...?!

ollydbg · « **Reply #6 on:** June 19, 2015, 04:23:23 pm »

Hi, Morten, this is a reply after two months... How time flies.

I think not much people are testing my patch serials. I'm considering what is the time to let those patches into our SVN repo. I don't have a clean plan, maybe, we can let them in, and we may have more testers through nightly build versions.

What's your or other guy's idea?

oBFusCATed · « **Reply #7 on:** June 19, 2015, 08:30:18 pm »

If you add some tests to the cctest project I am totally fine if you push these in trunk.
Then you can fix bugs and add more tests. So you won't re-introduce old problems by fixing the new ones.

If you're adding tests it should be relatively safe. Also you should be around to be able to fix if something comes up.

ollydbg · « **Reply #8 on:** June 20, 2015, 08:13:24 am »

Quote from: oBFusCATed on June 19, 2015, 08:30:18 pm

If you add some tests to the cctest project I am totally fine if you push these in trunk.
Then you can fix bugs and add more tests. So you won't re-introduce old problems by fixing the new ones.

If you're adding tests it should be relatively safe. Also you should be around to be able to fix if something comes up.

OK, thanks for the suggestion, I will take two steps.
1, add more tests these days.
2, commit my patch serials only if they don't cause any regression.

ollydbg · « **Reply #9 on:** June 20, 2015, 08:40:54 am »

Currently, I see only one regression (as of revision 10341, before my patch and after applying my patch) is here:

Code

--- C:/before.txt	Sat Jun 20 14:14:40 2015
+++ C:/after.txt	Sat Jun 20 14:09:01 2015
@@ -56,14 +56,14 @@
 Total 9 tests, 9 PASS, 0 FAIL
 --------------------------------------------------------
 ********************************************************
   Testing in file: F:\cb_sf_git\trunk\src\plugins\codecompletion\testing\cc_function_ptr.cpp
 ********************************************************
--PASS: Fun  FuncArray
+*FAIL: Fun  FuncArray
 -PASS: foo  foo
 --------------------------------------------------------
-Total 2 tests, 2 PASS, 0 FAIL
+Total 2 tests, 1 PASS, 1 FAIL
 --------------------------------------------------------
 ********************************************************
   Testing in file: F:\cb_sf_git\trunk\src\plugins\codecompletion\testing\cc_function_ptr_com_interface.cpp
 ********************************************************
 -PASS: factory->  QueryInterface

Simplify the test case, I get such test file: ccc_bug.cpp

Code


int (*FuncArray[10][20]) ();

//Fun  //FuncArray

I will look into it.

ollydbg · « **Reply #10 on:** June 20, 2015, 09:13:29 am »

OK, I found the cause of this regression, I get the parenthesis is L"(* FuncArray [ 10 ] [ 20 ])", and if without my patch, it don't have a space after the "FuncArray", this is because in my patch serials, I have change the function of reading the parenthesis. (The old code is hard to understand for me, and using raw wxChar pointers).

Then in the function:

Code

        // pattern: m_Str AAA (*BBB) (...);
        if (peek == ParserConsts::semicolon && pos != wxNOT_FOUND)
        {
            name.RemoveLast();  // remove ")"
            name.Remove(0, pos+1).Trim(false); // remove "(* "

            // pattern: m_Str AAA (*BBB[X][Y]) (...);
            pos = name.find(ParserConsts::oparray_chr);
            if (pos != wxNOT_FOUND)
                name.Remove(pos);

            TRACE(_T("HandleFunction() : Add token name='")+name+_T("', args='")+args+_T("', return type='") + m_Str+ _T("'"));
            Token* newToken =  DoAddToken(tkFunction, name, lineNr, 0, 0, args);

We pass the name as "FuncArray " (note a space after "FuncArray".

Now, I think I need to return a more compact form like "(*FuncArray[10][20])".

EDIT: the old code for reading parenthesis dose not consider the macro expansion in its content, it just find the "(" and ")" pair in the raw wxString, while in my patch serials, I do macro expansion on every token if possible.

ollydbg · « **Reply #11 on:** June 20, 2015, 11:28:25 am »

Now, here comes the function I need to implement, it looks like "space" are added only in some very special cases.

Code

void Tokenizer::ReadParentheses(wxString& str)
{
    // we create a local buffer here, so the data is copied from m_Buffer to buffer
    // once the local buffer is full, we append the content in local buffer to the result str
    static const size_t maxBufferLen = 4093;
    // local buffer
    wxChar buffer[maxBufferLen + 3];
    buffer[0] = _T('$'); // avoid segfault error, because we have some *(p - 1) = x in the code
    wxChar* realBuffer = buffer + 1;
    // p points to the local buffer
    wxChar* p = realBuffer;

    int level = 0; // brace level of '(' and ')'

    while (NotEOF())
    {
        while (SkipComment())
            ;
        wxChar ch = CurrentChar();

        while (ch == _T('#')) // do not use if
        {
            const PreprocessorType type = GetPreprocessorType();
            if (type == ptOthers)
                break;
            HandleConditionPreprocessor(type);
            ch = CurrentChar();
        }

        const unsigned int startIndex = m_TokenIndex;

        switch(ch)
        {
        case _T('('):
            {
                ++level;
                *p = ch;
                ++p;
            }
            break;

        case _T(')'):
            {
                if (*(p - 1) <= _T(' '))
                    --p; // if previous char is a space, we overwrite the space with a ')', so we save one char
                --level;
                *p = ch;
                ++p;
            }
            break;

        case _T('\''):
        case _T('"'):
            {
                MoveToNextChar();     // skip the leading '"' or '\''
                SkipToStringEnd(ch);  // m_TokenIndex point to the trailing '"' or '\'' now
                MoveToNextChar();     // move to the char after trailing char
                const size_t writeLen = m_TokenIndex - startIndex; // the string length
                const size_t usedLen = p - realBuffer;
                if (usedLen + writeLen > maxBufferLen) // there is not enough space left to write the string
                {
                    if (writeLen > maxBufferLen) // to big string?
                    {
                        TRACE(_T("ReadParentheses(): Catch exception 1: %lu"), static_cast<unsigned long>(writeLen));
                        return;
                    }

                    if (p != realBuffer) // add the used string to the result str
                    {
                        str.Append(realBuffer, usedLen);
                        p = realBuffer;  // adjust the p, as it was an empty buffer
                    }
                    // append the string to str (not write the string to the buffer)
                    str.Append((const wxChar*)m_Buffer + startIndex, writeLen);
                }
                else
                {   // the buffer can hold the string, so copy it to the buffer
                    memcpy(p, (const wxChar*)m_Buffer + startIndex, writeLen * sizeof(wxChar));
                    p += writeLen;
                }

                continue;
            }
            break;

        case _T(','): // if there is a space before the comma, we just remove it, but we should add a space after the comma
        case _T('*'):
        case _T('&'):
            {
                if (*(p - 1) <= _T(' '))
                    --p;

                *p = ch;
                *++p = _T(' ');
                ++p;
            }
            break;

        case _T('='):
            {
                if (*(p - 1) <= _T(' '))
                {
                    *p = _T('=');
                    // Don't add a space after '=' sign, in case another '=' follows it
                    // (see how the 'else' block below works).
                    //*++p = _T(' ');
                    ++p;
                }
                else // special handling of "==" "=!" "=>" and "=<"
                {
                    switch (*(p - 1))
                    {
                    case _T('='):
                    case _T('!'):
                    case _T('>'):
                    case _T('<'):
                        {
                            *p = _T('=');
                            *++p = _T(' ');
                            ++p;
                        }
                        break;

                    default:
                        {
                            *p = _T(' ');
                            *++p = _T('=');
                            *++p = _T(' ');
                            ++p;
                        }
                        break;
                    }
                }
            }
            break;

        case _T(' '): // we only add a space if the former char is not a space, also not a '('
        case _T('\r'): // the tab char are regard as space
        case _T('\t'):
            {
                if (*(p - 1) != _T(' ') && *(p - 1) != _T('('))
                {
                    *p = _T(' ');
                    ++p;
                }
            }
            break;

        case _T('\n'): // we need keep the \n for records paras correct position
            if (*(p - 1) == _T(' '))
                --p;
            if (*(p - 1) != _T('('))
            {
                *p = ch;
                ++p;
            }
            break;

        default: // any other case, we can directly copy the char
            {
                *p = ch;
                ++p;
            }
            break;
        }

        if (p >= realBuffer + maxBufferLen)
        {
            str.Append(realBuffer, p - realBuffer);
            p = realBuffer;
        }

        MoveToNextChar();

        if (level == 0)
            break;
    }//while (NotEOF())

    if (p > realBuffer)
        str.Append(realBuffer, p - realBuffer);
    TRACE(_T("ReadParentheses(): %s, line=%u"), str.wx_str(), m_LineNumber);
    if (str.Len() > 512)
    {   TRACE(_T("ReadParentheses(): Catch exception 2: %lu"), static_cast<unsigned long>(str.Len())); }
}

So, I need to follow this way.

ollydbg · « **Reply #12 on:** June 20, 2015, 04:04:08 pm »

OK, the only regression is fixed now.
I have attach the full patches (include git style patch serials, or a single svn patch)

ollydbg · « **Reply #13 on:** July 01, 2015, 05:11:52 pm »

One issue for macro replacement parsing is that if you have such code:

Code

#define uint8 unsigned char
void f(uint8 aaa);

Then the parser will see this function declaration:

Code

void f(unsigned char aaa);

This is because we try to expand every macro usage.

Does this cause some issue?

oBFusCATed · « **Reply #14 on:** July 01, 2015, 09:41:14 pm »

Would you this problem affect the autocompletion, calltip and symbol browser?

ollydbg · « **Reply #15 on:** July 02, 2015, 02:49:19 am »

Quote from: oBFusCATed on July 01, 2015, 09:41:14 pm

Would you this problem affect the autocompletion, calltip and symbol browser?

Sure, yes. because the macro expansion happens in the lower tokenizer level, so that the high level parserthread only see the expanded tokens.

ollydbg · « **Reply #16 on:** July 11, 2015, 04:43:41 pm »

Quote from: ollydbg on July 02, 2015, 02:49:19 am

Quote from: oBFusCATed on July 01, 2015, 09:41:14 pm
Would you this problem affect the autocompletion, calltip and symbol browser?
Sure, yes. because the macro expansion happens in the lower tokenizer level, so that the high level parserthread only see the expanded tokens.

I take some time to do a simple research, I think macro expansion shouldn't be a problem.
Let me put the example code here again.

Here, I think a good code style is not use the macro definition to give a type a new name, the best way here is using typedef.
So, if we have the code:

Code

typedef unsigned char uint8;
void f(uint8 aaa);

Then, the parser still see the "void f(uint8 aaa);"

I see that under MSVC, many types are actually typedef, such as "BOOL", "LPCTSTR".

BTW: I don't see much response about this patches serials

, so this is a "ping" for further comments, thanks.

MortenMacFly · « **Reply #17 on:** July 28, 2015, 01:28:48 pm »

Quote from: ollydbg on July 11, 2015, 04:43:41 pm

BTW: I don't see much response about this patches serials , so this is a "ping" for further comments, thanks.

I didn't see there was an update. Unfortunately, all the time you apply patches in trunk this breaks your patch series here. This makes it hard to test... I'll try again with v6 from this thread (hoping its that last version and it applies well on trunk).

MortenMacFly · « **Reply #18 on:** July 28, 2015, 01:37:47 pm »

Quote from: MortenMacFly on July 28, 2015, 01:28:48 pm

I'll try again with v6 from this thread (hoping its that last version and it applies well on trunk).

Well... unfortunately:

Quote

C:\Devel\CodeBlocks\src\plugins\codecompletion\parser\parserthread.cpp: In member function 'void ParserThread::DoParse()':
C:\Devel\CodeBlocks\src\plugins\codecompletion\parser\parserthread.cpp:803:38: error: 'tsSkipNone' was not declared in this scope
m_Tokenizer.SetState(tsSkipNone); // don't want to skip equals

...what to do now?

ollydbg · « **Reply #19 on:** July 29, 2015, 03:20:18 am »

Quote from: MortenMacFly on July 28, 2015, 01:37:47 pm

Quote from: MortenMacFly on July 28, 2015, 01:28:48 pm
I'll try again with v6 from this thread (hoping its that last version and it applies well on trunk).
Well... unfortunately:
Quote
C:\Devel\CodeBlocks\src\plugins\codecompletion\parser\parserthread.cpp: In member function 'void ParserThread::DoParse()':
C:\Devel\CodeBlocks\src\plugins\codecompletion\parser\parserthread.cpp:803:38: error: 'tsSkipNone' was not declared in this scope
m_Tokenizer.SetState(tsSkipNone); // don't want to skip equals
...what to do now?

I can simply rebase patches on the trunk and release the new patches, but currently I have build errors, see: Re: exchndl related patch because win32/lib folder is empty in SVN

ollydbg · « **Reply #20 on:** July 29, 2015, 07:33:08 am »

OK, this is updated git and svn style patches against SVN revision 10373.

MortenMacFly · « **Reply #21 on:** July 29, 2015, 07:37:11 am »

Quote from: ollydbg on July 29, 2015, 07:33:08 am

OK, this is updated git and svn style patches against SVN revision 10373.

Yes, these one seem to work... still compiling...

MortenMacFly · « **Reply #22 on:** August 30, 2015, 09:49:58 am »

Quote from: MortenMacFly on July 29, 2015, 07:37:11 am

Yes, these one seem to work... still compiling...

Its broken again after the last commits. I wonder why you don't simply commit these, too? It seems to work for me - I have no hangs whats-o-ever...

Otherwise, mind posting an update here? :-)

ollydbg · « **Reply #23 on:** August 30, 2015, 10:00:52 am »

Quote from: MortenMacFly on August 30, 2015, 09:49:58 am

Quote from: MortenMacFly on July 29, 2015, 07:37:11 am
Yes, these one seem to work... still compiling...
Its broken again after the last commits. I wonder why you don't simply commit these, too? It seems to work for me - I have no hangs whats-o-ever...

OK, I will commit them, I just want to see it has not regressions.

ollydbg · « **Reply #24 on:** August 31, 2015, 02:18:09 am »

Quote from: ollydbg on August 30, 2015, 10:00:52 am

Quote from: MortenMacFly on August 30, 2015, 09:49:58 am
Quote from: MortenMacFly on July 29, 2015, 07:37:11 am
Yes, these one seem to work... still compiling...
Its broken again after the last commits. I wonder why you don't simply commit these, too? It seems to work for me - I have no hangs whats-o-ever...
OK, I will commit them, I just want to see it has not regressions.

Done. Thanks for everyone!

Huki · « **Reply #25 on:** October 25, 2015, 07:58:25 pm »

Hi, I'm back...

Quote from: ollydbg on June 19, 2015, 04:23:23 pm

Hi, Morten, this is a reply after two months... How time flies.
I think not much people are testing my patch serials. I'm considering what is the time to let those patches into our SVN repo. I don't have a clean plan, maybe, we can let them in, and we may have more testers through nightly build versions. What's your or other guy's idea?

So far it has been very stable for me. I just checked out the latest revision and found these patches have already made into master.. that's good.

I found only one regression, due to commit r10230, but this is not related to your cc_macro patches.

Well, I tried to follow Morten's earlier advice. I have now forked Obfuscated's github repo. All my commits are available here, in branch "huki". On the top you can see 3 new bugfixes against the latest master. Now, do you want me to continue posting them in the forum as patches, or should I issue a pull request? (I can create a new branch with those 3 commits.)

ollydbg · « **Reply #26 on:** October 26, 2015, 06:05:13 am »

Hi, huki, thanks, and I will clone your branch, and see/test your three commits.

Huki · « **Reply #27 on:** October 26, 2015, 07:54:15 pm »

I have added test cases for my fixes (this commit).

ollydbg · « **Reply #28 on:** October 29, 2015, 02:59:14 pm »

Quote from: Huki on October 26, 2015, 07:54:15 pm

I have added test cases for my fixes (this commit).

Thanks, but I think the file "cc_loop_variables.cpp" currently doesn't work under our cctest framework, since the cctest can only test codecompletion in the global scope. In our normal CodeCompletion plugin, we have some other methods to retrieve the scope information.

ollydbg · « **Reply #29 on:** October 29, 2015, 03:22:28 pm »

I see other works on your github branch, which use CC to support the preprocessor of Scintilla's gray out of inactive conditional preprocessor branch. I don't know the details, but it looks like you are explicitly load a dll(shared library) from the sdk or src. My idea is that currently, the SDK.dll is a bridge, and all plugin should implement the interface, I have some simple diagram added days ago in the wiki, see Plugin structure of C::B - CodeBlocks.

ollydbg · « **Reply #30 on:** October 29, 2015, 04:01:56 pm »

Code

void Function1(void)
{
    int some_int;
    for (int for_int = 0; for_int < 5; ++for_int)
    {
        //  //some_int
        //for   //for_int
    }
}

void Function2(void)
{
    int value1 = 0;
    int value2 = 0;

    // hover on "value1"
    if (!value1) {}
    // hover on "value1" and "value2"
    if (value1 && value2) {}

    // Above tooltips will be shown correctly because the statements
    // within if(...), do(...), etc are NOT parsed entirely.
    // Revision 10230 "fixes" this problem, but now, parsing these statements
    // introduces the following regression.
    //if (!value1) {}   // adds a new token "! value1"
    //if (value1 && value2){}   // adds a new token "value1&& value2"
}

It looks like if (xxx), where xxx is an expression.
and the Code::Blocks / Tickets / #145 Code completion ignores parameters of catch-clauses, the test code I use:

Code

class E
{
    int a;
};

int main()
{
    try
    {
    }
    catch(E exp123)
    {
        exp|
    }
    return 0;
}

Where catch(yyy), here yyy is much like function arguments, which is not an expression.

And the void ParserThread::HandleConditionalArguments() is mainly handling the "xxx", not "yyy".

ollydbg · « **Reply #31 on:** November 06, 2015, 03:27:25 pm »

Quote

* CC: [huki] parser - fix for function pointer parsing with assignment.

I think remove the ';' check is not quite good, you just want to support some patters like:

Code

m_Str AAA (*BBB) (...);
m_Str AAA (*BBB) (...) = some_function;

Why not just check if the peek is either ';' or '='?
I think there are other cases which could not lead to a function pointer, such as C's Type Casting.

Code

m_Str AAA (*typeA) (*typeB)(*typeC)....

Or maybe, there are other cases.

Huki · « **Reply #32 on:** November 06, 2015, 10:26:03 pm »

Quote from: ollydbg on November 06, 2015, 03:27:25 pm

Quote
* CC: [huki] parser - fix for function pointer parsing with assignment.

I think remove the ';' check is not quite good, you just want to support some patters like:

Code
m_Str AAA (*BBB) (...);
m_Str AAA (*BBB) (...) = some_function;
Why not just check if the peek is either ';' or '='?
I think there are other cases which could not lead to a function pointer, such as C's Type Casting.
Code
m_Str AAA (*typeA) (*typeB)(*typeC)....
Or maybe, there are other cases.

I agree, checking for both ';' and '=' should be ok. Modified patch:

Code

From f01ada3e356006475787c481316f405ab47359f9 Mon Sep 17 00:00:00 2001
From: huki <gk7huki@gmail.com>
Date: Wed, 8 Apr 2015 08:12:47 +0530
Subject: [PATCH 15/17] CC: [huki] parser - fix for function pointer parsing
 with assignment.

---
 src/plugins/codecompletion/parser/parserthread.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/plugins/codecompletion/parser/parserthread.cpp b/src/plugins/codecompletion/parser/parserthread.cpp
index 813bdd5..d3e6cdd 100644
--- a/src/plugins/codecompletion/parser/parserthread.cpp
+++ b/src/plugins/codecompletion/parser/parserthread.cpp
@@ -2227,7 +2227,7 @@ void ParserThread::HandleFunction(wxString& name, bool isOperator, bool isPointe
         int pos = name.find(ParserConsts::ptr);
 
         // pattern: m_Str AAA (*BBB) (...);
-        if (peek == ParserConsts::semicolon && pos != wxNOT_FOUND)
+        if (pos != wxNOT_FOUND && (peek == ParserConsts::semicolon || peek == ParserConsts::equals))
         {
             name.RemoveLast();  // remove ")"
             name.Remove(0, pos+1).Trim(false); // remove "(* "
@@ -2235,7 +2235,7 @@ void ParserThread::HandleFunction(wxString& name, bool isOperator, bool isPointe
             // pattern: m_Str AAA (*BBB[X][Y]) (...);
             pos = name.find(ParserConsts::oparray_chr);
             if (pos != wxNOT_FOUND)
-                name.Remove(pos);
+                name.Remove(pos).Trim(true);
 
             TRACE(_T("HandleFunction() : Add token name='")+name+_T("', args='")+args+_T("', return type='") + m_Str+ _T("'"));
             Token* newToken =  DoAddToken(tkFunction, name, lineNr, 0, 0, args);
-- 
2.1.4

I added the Trim(true) for safety, in case the name contains a trailing space. But I see you already handled that case in the tokenizer's ReadParentheses (see your commit). So feel free to ignore this part.
Then, the only change is to test for both ";" and "=".

Huki · « **Reply #33 on:** November 06, 2015, 10:48:40 pm »

Quote from: ollydbg on October 29, 2015, 04:01:56 pm

Code

void Function2(void)
{
    int value1 = 0;
    int value2 = 0;

    // hover on "value1"
    if (!value1) {}
    // hover on "value1" and "value2"
    if (value1 && value2) {}

    // Above tooltips will be shown correctly because the statements
    // within if(...), do(...), etc are NOT parsed entirely.
    // Revision 10230 "fixes" this problem, but now, parsing these statements
    // introduces the following regression.
    //if (!value1) {}   // adds a new token "! value1"
    //if (value1 && value2){}   // adds a new token "value1&& value2"
}

It looks like if (xxx), where xxx is an expression.
and the Code::Blocks / Tickets / #145 Code completion ignores parameters of catch-clauses, the test code I use:

Code

class E
{
    int a;
};

int main()
{
    try
    {
    }
    catch(E exp123)
    {
        exp|
    }
    return 0;
}

Where catch(yyy), here yyy is much like function arguments, which is not an expression.

And the void ParserThread::HandleConditionalArguments() is mainly handling the "xxx", not "yyy".

Then we can add a parameter "bool parseArguments" to HandleConditionalArguments(), which will be false by default for 'if', 'do', 'while' and can be set to true to handle catch expressions. Is that ok with you?

ollydbg · « **Reply #34 on:** November 07, 2015, 06:22:00 am »

Quote from: Huki on November 06, 2015, 10:48:40 pm

...
Then we can add a parameter "bool parseArguments" to HandleConditionalArguments(), which will be false by default for 'if', 'do', 'while' and can be set to true to handle catch expressions. Is that ok with you?

I totally agree on this idea. Thanks.

Quote from: Huki on November 06, 2015, 10:26:03 pm

...
I agree, checking for both ';' and '=' should be ok. Modified patch:
...
I added the Trim(true) for safety, in case the name contains a trailing space. But I see you already handled that case in the tokenizer's ReadParentheses (see your commit). So feel free to ignore this part.
Then, the only change is to test for both ";" and "=".

Thanks for the modified patch, I'm OK with this change. This is what I'm going to commit:

Code

From 23346fed9a26273fbde0f9f859cac9da0e31ec2d Mon Sep 17 00:00:00 2001
From: huki <gk7huki@gmail.com>
Date: Wed, 8 Apr 2015 08:12:47 +0530
Subject: * CC: parser - fix for function pointer parsing with assignment.
 (Thanks Huki)

Parsing code pattern:
m_Str AAA (*BBB) (...) = some_function;
See discussion here:
http://forums.codeblocks.org/index.php/topic,19769.msg140852.html#msg140852

diff --git a/src/plugins/codecompletion/parser/parserthread.cpp b/src/plugins/codecompletion/parser/parserthread.cpp
index 813bdd5..d86cfee 100644
--- a/src/plugins/codecompletion/parser/parserthread.cpp
+++ b/src/plugins/codecompletion/parser/parserthread.cpp
@@ -2227,15 +2227,17 @@ void ParserThread::HandleFunction(wxString& name, bool isOperator, bool isPointe
         int pos = name.find(ParserConsts::ptr);
 
         // pattern: m_Str AAA (*BBB) (...);
-        if (peek == ParserConsts::semicolon && pos != wxNOT_FOUND)
+        // pattern: m_Str AAA (*BBB) (...) = some_function;
+        if (pos != wxNOT_FOUND && (peek == ParserConsts::semicolon || peek == ParserConsts::equals))
         {
             name.RemoveLast();  // remove ")"
             name.Remove(0, pos+1).Trim(false); // remove "(* "
 
             // pattern: m_Str AAA (*BBB[X][Y]) (...);
+            // Trim(true) for safety, in case the name contains a trailing space
             pos = name.find(ParserConsts::oparray_chr);
             if (pos != wxNOT_FOUND)
-                name.Remove(pos);
+                name.Remove(pos).Trim(true);
 
             TRACE(_T("HandleFunction() : Add token name='")+name+_T("', args='")+args+_T("', return type='") + m_Str+ _T("'"));
             Token* newToken =  DoAddToken(tkFunction, name, lineNr, 0, 0, args);

ollydbg · « **Reply #35 on:** November 07, 2015, 07:33:44 am »

Quote

CC: [huki] parser - fixed handling of assignment within for loop.

I'm OK with this commit.
With this patch, we can handle the pattern:

Code

for(int i = 0; ...; ...)...

And correctly recognize "i" as a variable.
I will commit soon. This is an enhancement on the original patch, see revision 8700, and the related patch page: Patch 3345 - Code::Blocks History

But I see in the function void ParserThread::HandleForLoopArguments()'s body:

Code

...
        if (peek == ParserConsts::comma)
        {
            smallTokenizer.GetToken(); // eat comma
            createNewToken = true;
        }
        else if (peek == ParserConsts::colon
                 || peek == ParserConsts::semicolon
                 || peek.empty())
        {
            createNewToken = true;
            finished = true; // after this point there will be no further declarations
        }
...

It looks like handling colon is not quite correct. For example:

Code

for ( SomeNamespace::SomeClass i = 0; ...)

In this case, it should add a token "i", which the type "SomeNamespace::SomeClass".
The current method is try to construct a small buffer(smallTokenizer), which contains:

Code

( SomeNamespace::SomeClass i = 0; ...)

and run some parsing method on it.

My idea could be:
1, Make the buffer "( SomeNamespace::SomeClass i = 0; ...)", and construct a new Parserthread object
2, call DoParse() function recursively on the buffer, which is quite similar with how we parse the class definitions

Code

class AAA
{
     // body of the class definition, we recursively to call the DoParse() function.
     // So that all tokens are marked as the child token of the Token "AAA".
     // once the "}" is met, the Doparse() will returned to its high level DoParse().
}

3, if possible, we can let the Token "i" be a child Token of the current Token(ideally the Token "for"), although all the tokens are temporary tokens.

In the main loop of the DoParse(), the "SomeNamespace::SomeClass i = 0;" will be correctly recognized as a Token "i".
This is the way we can handle function parameters, or the catch(xxx) clause, as we have already discussed.

ollydbg · « **Reply #36 on:** November 07, 2015, 11:00:05 am »

@Huki, your patches are in trunk now, thanks.

News:

Author Topic: Check macro usage on every identifier like token: issues and discussions (Read 50039 times)