Is it possible for the parser to support newlib prototypes?


Huki:

--- Quote from: ollydbg on September 07, 2014, 04:38:10 pm ---Hi, Huki, thanks for the explanation, when testing and reviewing your patch, I see this comment which I'm not clear:

--- Code: ---// NOTE: ReplaceFunctionLikeMacro will recursively expand all defines and macro calls,
// but will not expand macro names, which is what we want
smallTokenizer.ReplaceFunctionLikeMacro(tk);
tk = tree->at(tree->TokenExists(smallTokenizer.GetToken(), -1, tkFunction|tkMacroDef|tkVariable));
if (tk && smallTokenizer.PeekToken().empty()) // only if the expanded result is a single token
    token = tk;

--- End code ---
What does the "will not expand macro names" in comments means? as I see, ReplaceFunctionLikeMacro() did a full expansion until nothing is expanded.

--- End quote ---
Hi,
Consider a macro definition like this:

--- Code: ---#define MACRO(_x) ...
--- End code ---
It must be used like this (this is a macro call):

--- Code: ---MACRO(5);
--- End code ---

Now we have a define pointing to this macro:

--- Code: ---#define DEFINE MACRO
--- End code ---
Note that there is no "()" after MACRO, so this is not a valid "macro call"; just the macro name is used alone. In normal code this is illegal and probably an error, but when used like this in the define, it means the preprocessor token DEFINE should be used like a macro:

--- Code: ---DEFINE(5); // same as MACRO(5);
--- End code ---
So to qualify as a macro name, the macro definition should have arguments (or empty parentheses), but the macro usage should be without any parentheses.
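
The difference between a macro name and a macro call can be checked with the real preprocessor. The following is only a small sketch: the macro body `((_x) + 1)` and the helper functions `viaDefine` and `viaMacro` are made up for illustration and do not come from the patch.

```cpp
// Function-like macro: its definition has a parameter list.
#define MACRO(_x) ((_x) + 1)

// Variable-like macro whose replacement text is only the *name* of MACRO.
// The name is not followed by '(', so it is not a macro call here.
#define DEFINE MACRO

// DEFINE is replaced by MACRO; the result is then followed by "(5)",
// so the preprocessor treats it as the macro call MACRO(5).
int viaDefine() { return DEFINE(5); }

// The direct macro call, for comparison.
int viaMacro() { return MACRO(5); }
```

Both functions return the same value, which is the behaviour "DEFINE(5); // same as MACRO(5);" describes.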

In ReplaceFunctionLikeMacro() I have added this test to make sure such macro names don't get expanded (expanding them is illegal anyway).

--- Code: ---@@ -1954,16 +1962,26 @@ bool Tokenizer::GetMacroExpendedText(const Token* tk, wxString& expandedText)
[...]
+    // don't replace anything if the arguments are missing
+    if (!SplitArguments(actualArgs))
         return false;
--- End code ---

BTW: valid preprocessor tokens will still be expanded (i.e., ordinary preprocessor tokens that have no parentheses in the definition itself, like #define DEFINE ...).
I added that feature too in the patch:

--- Code: ---@@ -1937,6 +1938,13 @@ bool Tokenizer::GetMacroExpendedText(const Token* tk, wxString& expandedText)
     if (tk->m_FullType.Find(tk->m_Name) != wxNOT_FOUND)
         return false;

+    // if not even "()" is found [in the definition] then it's a normal preprocessor define, so handle as such
+    if (tk->m_Args.IsEmpty())
+    {
+        expandedText = tk->m_FullType;
+        return true; // return true for ReplaceBufferText()
+    }
+

--- End code ---

Also valid macro calls will get expanded:

--- Code: ---#define GLEW_GET_FUN(x) x

// now
#define glMultiDrawElements GLEW_GET_FUN(__glewMultiDrawElements)
// becomes
#define glMultiDrawElements __glewMultiDrawElements

// show calltip for __glewMultiDrawElements(...), which can be
// a function, macro, function ptr or typedef'd function ptr
// (currently function ptr are not parsed properly in DoParse() so
// it's not supported for now. But typedef'd func ptrs are supported)
glMultiDrawElements(

--- End code ---

Likewise we have function calls and function names, see some example code:

--- Code: ---// MessageBoxA is a function name
// it means we can call MessageBox() and get calltip for MessageBoxA()
#define MessageBox MessageBoxA

// MessageBoxA is a function CALL
// it means we cannot use MessageBox like a call, it's just a define.
#define MessageBox MessageBoxA(...)

--- End code ---

ollydbg:

--- Quote from: Huki on September 07, 2014, 06:06:51 pm ---Hi,
Consider a macro definition like this:

--- Code: ---#define MACRO(_x) ...
--- End code ---
It must be used like this (this is a macro call):

--- Code: ---MACRO(5);
--- End code ---

Now we have a define pointing to this macro:

--- Code: ---#define DEFINE MACRO
--- End code ---
Note that there is no "()" after MACRO, so this is not a valid "macro call"; just the macro name is used alone. In normal code this is illegal and probably an error,
--- End quote ---
Thanks for the explanation.
The code above looks valid to me: I wrote a simple test, and it builds fine under MinGW G++:

--- Code: ---#define MACRO(_x) _x
#define DEFINE MACRO

int main()
{
   int x;
   DEFINE(x);
   return 0;
}

--- End code ---

--- Quote ---but when used like this in the define, it means the preprocessor token DEFINE should be used like a macro:

--- Code: ---DEFINE(5); // same as MACRO(5);
--- End code ---
So to qualify as a macro name, the macro definition should have arguments (or empty parentheses), but the macro usage should be without any parentheses.

--- End quote ---
What does the last sentence mean? I don't get the idea.

--- Quote ---In ReplaceFunctionLikeMacro() I have added this test to make sure such macro names don't get expanded (expanding them is illegal anyway).

--- End quote ---
Sorry, I still don't quite understand this sentence. (Maybe due to my poor English...)

--- Quote ---
--- Code: ---@@ -1954,16 +1962,26 @@ bool Tokenizer::GetMacroExpendedText(const Token* tk, wxString& expandedText)
[...]
+    // don't replace anything if the arguments are missing
+    if (!SplitArguments(actualArgs))
         return false;
--- End code ---

--- End quote ---
I see that the only usage of GetMacroExpendedText() is here (BTW: is this a typo? Should it be "expanded text"?):

--- Code: ---bool Tokenizer::ReplaceFunctionLikeMacro(const Token* tk, bool updatePeekToken)
{
   wxString macroExpandedText;
    if ( GetMacroExpendedText(tk, macroExpandedText) )
        return ReplaceBufferText(macroExpandedText, updatePeekToken);
    return false;
}

--- End code ---
But as far as I can see, ReplaceFunctionLikeMacro() is only called under the condition that tk->m_Args is not empty. It looks like m_Args is used to distinguish whether a macro definition is variable-like or function-like, but since even function-like macro definitions are allowed to have an empty formal argument list, that condition no longer holds.

GetMacroExpendedText() uses a trick: it puts the formal arguments before the actual arguments, so the buffer becomes

--- Code: ---..... ( formal arguments ) ( actual arguments ) ....
   ^ --- m_TokenIndex

--- End code ---

Oh, so for a function-like macro definition with an empty argument list, m_Args is "()". SplitArguments(actualArgs) will still return true, but both actualArgs and formalArgs are empty; in this case, we can directly return the macro's definition string without any replacement.

BTW: it looks like ReplaceFunctionLikeMacro() performs only one level of replacement. If you have several replacement rules (A -> B -> C), calling this function only does A -> B, but I think B -> C will happen in a later GetToken() call. So it looks like our macro expansion does not work the same way as the C language standard describes.

--- Quote ---BTW: valid preprocessor tokens will still be expanded (i.e., ordinary preprocessor tokens that have no parentheses in the definition itself, like #define DEFINE ...).
I added that feature too in the patch:

--- Code: ---@@ -1937,6 +1938,13 @@ bool Tokenizer::GetMacroExpendedText(const Token* tk, wxString& expandedText)
     if (tk->m_FullType.Find(tk->m_Name) != wxNOT_FOUND)
         return false;

+    // if not even "()" is found [in the definition] then it's a normal preprocessor define, so handle as such
+    if (tk->m_Args.IsEmpty())
+    {
+        expandedText = tk->m_FullType;
+        return true; // return true for ReplaceBufferText()
+    }
+

--- End code ---

--- End quote ---
Yes, this is a kind of variable-like macro expansion.

Edit: I think I still need time to review the CC code and your patches.

ollydbg:
Some test:
Here is the test code

--- Code: ---#define AAA BBB
#define BBB CCC
#define CCC DDD
#define DDD EEE

int AAA;

--- End code ---

Normally, when this is parsed, you get a variable Token named "AAA": CC doesn't check every token for macros, so AAA does not trigger macro replacement.

If you create a user replacement rule by adding AAA -> @ in CC's settings (this just moves m_TokenIndex back and puts the Tokenizer in macro replacement mode), you will get a variable Token named "EEE", which is correct.

I see that DoGetToken() can be called recursively, so recursive macro replacement happens when DoGetToken() is called.

EDIT:
In your patch

--- Code: --- //NOTE: ReplaceFunctionLikeMacro will recursively expand all defines and macro calls,
// but will not expand macro names, which is what we want
smallTokenizer.ReplaceFunctionLikeMacro(tk);

--- End code ---
I think the comment is not correct, right? ReplaceFunctionLikeMacro() does the text replacement only once.

EDIT2:
If I change to this code

--- Code: ---#define AAA BBB
#define BBB CCC
#define CCC DDD
#define DDD EEE
#define EEE BBB

int AAA;

--- End code ---
You will quickly get an infinite loop, which runs until we hit the limit:

--- Code: ---if (m_RepeatReplaceCount > 0)
{
    if (m_RepeatReplaceCount >= s_MaxRepeatReplaceCount)
    {
        m_TokenIndex = m_BufferLen - m_FirstRemainingLength;
        m_PeekAvailable = false;

--- End code ---
To solve this issue, I think we could:
1, collect the macros already used, and not expand them again
2, expand all the macros at once, not through recursive calls of DoGetToken()
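
The first idea (remember which macros were already used) could look roughly like the sketch below. It is not CC code: `MacroTable` and `ExpandObjectLike` are hypothetical names, and only variable-like macros whose replacement text is a single identifier are modelled.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for the parser's macro table: name -> replacement text.
using MacroTable = std::map<std::string, std::string>;

// Replace `name` repeatedly until it is no longer a macro, or until a macro
// that was already replaced in this expansion shows up again.
std::string ExpandObjectLike(const MacroTable& macros, std::string name)
{
    std::set<std::string> used; // macros already replaced in this expansion
    while (true)
    {
        auto it = macros.find(name);
        // Stop if `name` is not a macro, or if replacing it again would loop.
        if (it == macros.end() || !used.insert(name).second)
            return name;
        name = it->second;
    }
}
```

With the four defines of the first test, ExpandObjectLike(macros, "AAA") returns "EEE". With the extra "#define EEE BBB" it returns "BBB" and terminates without relying on a repeat limit.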

ollydbg:

--- Quote from: ollydbg on September 08, 2014, 08:55:18 am ---
EDIT:
In your patch

--- Code: --- //NOTE: ReplaceFunctionLikeMacro will recursively expand all defines and macro calls,
// but will not expand macro names, which is what we want
smallTokenizer.ReplaceFunctionLikeMacro(tk);

--- End code ---

--- End quote ---
I see that when you call ReplaceFunctionLikeMacro(), the Tokenizer goes into macro replacement mode, so the next time you call smallTokenizer.GetToken(), macro replacement will recursively expand everything.

Huki:
Oh, sorry about the confusion. Regarding macro expansion: for any macro, we have 1) the macro definition, and 2) the macro usage. Each of them can be of two kinds: 1) having parentheses (...), or 2) not having parentheses. So there are 4 cases in total to handle. On top of that, the macro can either be simple, or nested (needs recursive expansion). I will try to explain the different cases one by one.

Case 1: Both macro definition and macro usage do not have (). Eg,

--- Code: ---#define FIVE 5
int a = FIVE;
--- End code ---
We can call this a variable-like macro define / usage.

Case 2: Both macro definition and macro usage have (). Eg,

--- Code: ---#define MUL(_x) (_x * 5)
int a = MUL(1);
--- End code ---
We can call this a function-like macro define / usage.

Case 3: Macro definition does not have () but macro usage does have it. Eg,

--- Code: ---#define MUL(_x) (_x * 5)

#define OPER MUL

// OPER(1) is an example of Case 3 macro usage
int a = OPER(1);
--- End code ---
Here, OPER is defined as a variable-macro, but used as a function-macro. We should expand it like a variable-macro using plain ReplaceBufferText(). In my patch I have added support for this case (the test added at the beginning of GetMacroExpendedText()).

Case 4: Macro definition has the () but macro usage does not have it. Eg,

--- Code: ---#define MUL(_x) (_x * 5)

// MUL is an example of Case 4 macro usage
#define OPER MUL // this line is valid (but CC should not expand MUL)

// these two lines are valid
int a = OPER(1);
int b = MUL(1);

int c = MUL; // this line is invalid (CC should not expand MUL)
--- End code ---

Here, MUL is defined as a function-like macro. But in the line "#define OPER MUL" it is used without parentheses. This means our CC engine should treat it like an error and leave it alone (do not expand it). We can use this behavior to stop our recursive expansion at this point, and use the final result "MUL" for the calltips.

--- Quote from: ollydbg ---It looks like this code above is valid, and I just create a simple test code, and it build fine under MinGW G++
[...]
--- End quote ---
Yes, this is valid, but still we should not expand it (we need to stop there and use the MUL token for our calltips). Imagine if we expand it:

--- Code: ---#define OPER MUL
// becomes
#define OPER (_x * 5) // invalid because OPER cannot accept any arguments!
--- End code ---
So generally our macro expansion code should always treat Case 4 like an error and not expand it.

That is the reason this test was added:

--- Code: ---@@ -1954,16 +1962,26 @@ bool Tokenizer::GetMacroExpendedText(const Token* tk, wxString& expandedText)
[...]
+    // don't replace anything if the [actual (or usage)] arguments are missing
+    if (!SplitArguments(actualArgs))
         return false;
--- End code ---

I hope that is clear now.
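
The four cases can also be summarised as a small decision function. This is only a sketch of the rule described above, not the patch itself: `MacroDef`, `Action` and `Decide` are hypothetical names, and `args` merely imitates m_Args (empty for a variable-like macro, "()" or "(_x)" for a function-like one).

```cpp
#include <string>

// Hypothetical, simplified view of a macro definition.
struct MacroDef
{
    std::string args; // "" for variable-like, "()" or "(_x)" for function-like
    std::string body; // replacement text
};

enum class Action { ReplaceAsText, SubstituteArgs, LeaveAlone };

// Decide what to do with one macro usage.
Action Decide(const MacroDef& def, bool usageHasParens)
{
    if (def.args.empty())
        return Action::ReplaceAsText;  // Case 1 and Case 3: plain text replacement
    if (usageHasParens)
        return Action::SubstituteArgs; // Case 2: normal function-like expansion
    return Action::LeaveAlone;         // Case 4: macro name only, do not expand
}
```

For example, with MUL defined as {"(_x)", "(_x * 5)"} and OPER as {"", "MUL"}, the usage OPER(1) gives ReplaceAsText, MUL(1) gives SubstituteArgs, and a bare MUL gives LeaveAlone.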

--- Quote from: ollydbg ---I see that the only usage of the GetMacroExpendedText() is here: (BTW: typo here? should be "expanded text"?)
--- End quote ---
Yes, this is the only usage of this function. GetMacroExpendedText() should return "true" if we want ReplaceBufferText() to run, and "false" if we want to leave the macro alone and not replace it. But even if the macro definition has empty (), we have to handle it in GetMacroExpendedText() before returning true.
And yes, it should be "expanded text". :)

--- Quote from: ollydbg ---It looks like m_Args is used to distinguish whether a macro definition is variable-like or function-like, but if I realize that even function-like macro definitions are allowed to use empty formal argument list, so the condition is not true any more.
--- End quote ---
Yes, but m_Args is not empty even if the parentheses are empty. So we can safely check for tk->m_Args.IsEmpty(). A function-macro like this: "#define MACRO() ..." will have "()" in m_Args. Only for a variable-macro is m_Args really empty.

--- Quote from: ollydbg ---BTW: it looks like ReplaceFunctionLikeMacro() only runs one level replacement, if you have several replacement rules (A -> B ->C), call this function only make a (A->B), but I think (B->C) will be happens after some GetToken() call.
--- End quote ---

--- Quote from: ollydbg ---I think the comment is not correct right? Since ReplacefunctionLikeMacro just do text replacement once.
--- End quote ---
Well, yes, the actual recursive expansion happens in the next GetToken() call, but I wanted to keep the comments simple and easy to read. Technically, ReplaceFunctionLikeMacro() expands one level, and turns on the flag for further expansion in the next GetToken() or PeekToken().
In my patch there is:

--- Code: ---+ smallTokenizer.ReplaceFunctionLikeMacro(tk);
+ tk = tree->at(tree->TokenExists(smallTokenizer.GetToken(), ...
--- End code ---
So, full recursive expansion occurs in the GetToken(), but when we hit Case 4, GetMacroExpendedText() will return false, and it won't be expanded further. We use that result for the calltips.
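
To illustrate where the expansion stops, here is a rough sketch that follows variable-like defines until it reaches a function-like macro name (Case 4) or something that is not a macro at all. `Macro`, `Macros`, `ResolveCalltipName` and the `maxDepth` guard are assumptions made for this example; the real Tokenizer works on a text buffer, not on a map.

```cpp
#include <map>
#include <string>

// Hypothetical, simplified macro record: `args` imitates m_Args.
struct Macro
{
    std::string args; // "" for variable-like, "(...)" for function-like
    std::string body; // replacement text (a single identifier in this sketch)
};

using Macros = std::map<std::string, Macro>;

// Follow variable-like defines starting at `name`. Stop at the first name that
// is not a macro, or that is a function-like macro used without parentheses.
std::string ResolveCalltipName(const Macros& macros, std::string name, int maxDepth = 16)
{
    for (int depth = 0; depth < maxDepth; ++depth)
    {
        auto it = macros.find(name);
        if (it == macros.end() || !it->second.args.empty())
            break; // plain identifier, or Case 4: keep this name for the calltip
        name = it->second.body;
    }
    return name;
}
```

With "#define MUL(_x) (_x * 5)" and "#define OPER MUL", resolving "OPER" yields "MUL", which is the token the calltip should be shown for.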
