Developer forums (C::B DEVELOPMENT STRICTLY!) > CodeCompletion redesign
Is it possible for the parser to support newlib prototypes?
Huki:
--- Quote from: Huki on September 17, 2014, 12:10:32 am ---
--- Quote from: White-Tiger on September 14, 2014, 04:30:30 pm ---yes and no.. the reported case works, same for the real world code I based it on... (at least it looks like that, didn't miss any function)
But another case I've used to work with, does not work.
I'll post the almost non-stripped code just in case there's another "bug" (had to strip it a bit because of 20k character limit for posts^^)
...
Basically the problem already appears at the first function... at "getClientLibVersion"... it's detected... but only as
public int(* getClientLibVersion): unsigned
So there's a problem with unsigned int as a return because it only picks up the first identifier...
--- End quote ---
--- Quote from: ollydbg on September 15, 2014, 08:29:24 am ---Candidate patch to fix your problem
[...]
--- End quote ---
Hi, yes, we need to handle return types with more than one token, like "unsigned int". Thanks for the patch, I'll review it and the macros handling part tomorrow.
--- End quote ---
I think your patch will fix the problem, but maybe it's better to have all function pointers checking in one place (in DoParse()), then we just send the result to HandleFunction(). The problem is that we have a pattern like: AAA BBB (*name) (arg), where m_Str = AAA, token = BBB, peek = (*name), and we can't know if this is a function declaration or function ptr without reading the next token after peek.
But I think we can use another trick: strip the '(' in peek, and see if the next char is '*'. If it is, then it should be a function pointer. See this code:
--- Code: ---// pattern unsigned int (*getClientLibVersion)(char** result);
// currently, m_Str = unsigned, token = int, peek = (*getClientLibVersion)
// this may be a function pointer declaration, we can guess without
// reading the next token, if "peek" has a ptr char and only 1 argument
// in it.
// see what is inside the (...)
// try to see whether the peek pattern is (* BBB)
wxString arg = peek;
arg.Remove(0,1); // remove '('
if (arg.GetChar(0) == ParserConsts::ptr)
{
arg.RemoveLast();
arg.Remove(0,1).Trim(false); // remove '*'
m_Str << token;
token = m_Tokenizer.GetToken(); //consume the peek
// BBB is now the function ptr's name
HandleFunction(/*function name*/ arg,
/*isOperator*/ false,
/*isPointer*/ true);
}
else if (!m_Options.useBuffer || m_Options.bufferSkipBlocks)
// function declaration
else
// local variable initialized with ctor
--- End code ---
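For reference, here is a small compilable sketch of the declaration forms these branches have to tell apart. getClientLibVersion and Alloc are taken from the examples in this thread; getVersion and LocalCtorExample are hypothetical names added only for illustration.

```cpp
#include <cstddef>
#include <string>

// AAA BBB (*name)(args): function pointer with a two-token return type
unsigned int (*getClientLibVersion)(char** result) = nullptr;

// AAA *(*name)(args): function pointer whose return type is itself a pointer
void *(*Alloc)(void *p, std::size_t size) = nullptr;

// AAA BBB (args) at global scope: an ordinary function declaration
// (hypothetical name, declared only to show the shape)
unsigned int getVersion(char** result);

// AAA BBB (args) inside a block: a local variable initialized with a ctor
// (hypothetical function, only here to host the local variable)
inline std::size_t LocalCtorExample()
{
    std::string dashes(3, '-'); // looks like a call, but declares a variable
    return dashes.size();
}
```

The first and third forms start with the same tokens, which is why the parser has to look inside the parenthesised token to decide.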
Btw, some notes about the function pointer handling:
1) I saw comments like: // *BBB is now the function ptr's name
In fact we strip the '*', so the function name is just BBB.
2) About the spaces trimming: I checked the ReadParantheses() function (used in DoGetToken()), and I see it guarantees that there is no space immediately after the '(' or before the ')'. So the only place we might have to trim spaces is after removing the '*' (eg, in (* BBB) there is a space before BBB).
So I think all other Trim() calls can be safely removed, like this:
Old code:
--- Code: ---arg.Trim(true).RemoveLast();
arg.Remove(0, pos+1);
arg.Trim(true).Trim(false);
--- End code ---
New code:
--- Code: ---arg.RemoveLast();
arg.Remove(0, pos+1).Trim(false);
--- End code ---
I quickly tested it with some spaces and here is the result.. :)
Huki:
--- Quote ---
--- Quote from: ollydbg on September 12, 2014, 04:15:54 am ---BTW, maybe the two conditions:
[...]
Those two conditions can be merged or somewhat refactored, but I'm not quite sure. E.g. extract the macro usage handling, and merge the handling of function decl and function ptr into one condition. This can be a new commit. :D
--- End quote ---
I agree we can separate the macro handling and merge the function handling.
We can think about supporting more cases for macro handling too. We currently handle function-like macros, and only when m_Str is empty,
--- End quote ---
I have finally gotten around to doing it. See the result below.. :)
I'm pasting the entire code for opbracket_chr and macro defines, in DoParse(). Also, now all macros are expanded in DoParse(), but only if we reached "if (!switchHandled)".
--- Code: ---[...]
else if (!switchHandled)
{
// since we can't recognize the pattern by token, then the token
// is normally an identifier style lexeme, now we try to peek the next token
wxString peek = m_Tokenizer.PeekToken();
if (!peek.IsEmpty())
{
// pattern: AAA or AAA (...)
int id = m_TokenTree->TokenExists(token, -1, tkMacroDef);
// if AAA is a macro definition, then expand this macro
if (id != -1)
{
HandleMacroExpansion(id, peek);
}
// any function like pattern
else if ( (peek.GetChar(0) == ParserConsts::opbracket_chr)
&& m_Options.handleFunctions )
{
if ( m_Str.IsEmpty()
&& m_EncounteredNamespaces.empty()
&& m_EncounteredTypeNamespaces.empty()
&& (!m_LastParent || m_LastParent->m_Name != token) ) // if func has same name as current scope (class)
{
// see what is inside the (...)
wxString arg = m_Tokenizer.GetToken(); // eat args ()
// try to see whether the peek pattern is (* BBB)
int pos = peek.find(ParserConsts::ptr);
if (pos != wxNOT_FOUND)
{
peek = m_Tokenizer.PeekToken();
if (peek.GetChar(0) == ParserConsts::opbracket_chr)
{
// pattern: AAA (* BBB) (...)
// where peek is (...) and arg is (* BBB)
arg.RemoveLast();
arg.Remove(0, pos+1).Trim(false);
// NOTE: support func ptr in local block, show return type.
// if (!m_Options.useBuffer || m_Options.bufferSkipBlocks)
// HandleFunction(arg); // function
// AAA now becomes the last element of stacked type string
// which is the return type of function ptr
m_Str << token << ParserConsts::space_chr;
// BBB is now the function ptr's name
HandleFunction(/*function name*/ arg,
/*isOperator*/ false,
/*isPointer*/ true);
m_Str.Clear();
}
}
else // wxString arg = m_Tokenizer.GetToken(); // eat args ()
m_Str = token + arg;
}
// NOTE: support some more cases..., such as m_Str is not empty
// if (!m_Options.useBuffer || m_Options.bufferSkipBlocks)
// HandleFunction(token); // function
// else
// m_Tokenizer.GetToken(); // eat args when parsing block
// function ptr with pointer return type
// eg: void *(*Alloc)(void *p, size_t size);
// where, m_Str=void, token=(*Alloc), peek=(void *p, size_t size)
else if ( (m_LastToken == ParserConsts::ptr_chr) //(m_PointerOrRef)
&& (token.GetChar(0) == ParserConsts::opbracket_chr) )
{
int pos = token.find(ParserConsts::ptr);
if (pos != wxNOT_FOUND)
{
wxString arg = token;
arg.RemoveLast();
arg.Remove(0, pos+1).Trim(false);
HandleFunction(/*function name*/ arg,
/*isOperator*/ false,
/*isPointer*/ true);
m_Str.Clear();
}
}
else
{
// pattern unsigned int (*getClientLibVersion)(char** result);
// currently, m_Str = unsigned, token = int, peek = (*getClientLibVersion)
// this may be a function pointer declaration, we can guess without
// reading the next token, if "peek" has a ptr char and only 1 argument
// in it.
// see what is inside the (...)
// try to see whether the peek pattern is (* BBB)
wxString arg = peek;
arg.Remove(0,1); // remove '('
if (arg.GetChar(0) == ParserConsts::ptr)
{
arg.RemoveLast();
arg.Remove(0,1).Trim(false); // remove '*'
m_Str << token;
token = m_Tokenizer.GetToken(); //consume the peek
// BBB is now the function ptr's name
HandleFunction(/*function name*/ arg,
/*isOperator*/ false,
/*isPointer*/ true);
}
else if (!m_Options.useBuffer || m_Options.bufferSkipBlocks)
{
// pattern AAA BBB (...) in global namespace (not in local block)
// so, this is most likely a function declaration, but in fact this
// can also be a global variable initialized with ctor, but for
// simplicity, we drop the latter case
HandleFunction(token); // function
}
else
{
// local variables initialized with ctor
if (!m_Str.IsEmpty() && m_Options.handleVars)
{
Token* newToken = DoAddToken(tkVariable, token, m_Tokenizer.GetLineNumber());
if (newToken && !m_TemplateArgument.IsEmpty())
ResolveTemplateArgs(newToken);
}
m_Tokenizer.GetToken(); // eat args when parsing block
}
m_Str.Clear();
}
}
else if [...]
--- End code ---
EDIT: and here is the patch:
--- Code: ---From d3f194380a8cdc0eeae06708ece5cf06dd61dd32 Mon Sep 17 00:00:00 2001
From: huki <gk7huki@gmail.com>
Date: Thu, 18 Sep 2014 18:26:08 +0530
Subject: CC: merge function handling and update macro handling
---
src/plugins/codecompletion/parser/parserthread.cpp | 104 ++++++++++++---------
1 file changed, 61 insertions(+), 43 deletions(-)
diff --git a/src/plugins/codecompletion/parser/parserthread.cpp b/src/plugins/codecompletion/parser/parserthread.cpp
index e1f258d..fc3c409 100644
--- a/src/plugins/codecompletion/parser/parserthread.cpp
+++ b/src/plugins/codecompletion/parser/parserthread.cpp
@@ -1009,22 +1009,21 @@ void ParserThread::DoParse()
wxString peek = m_Tokenizer.PeekToken();
if (!peek.IsEmpty())
{
- if ( (peek.GetChar(0) == ParserConsts::opbracket_chr)
- && m_Options.handleFunctions
- && m_Str.IsEmpty()
- && m_EncounteredNamespaces.empty()
- && m_EncounteredTypeNamespaces.empty()
- && (!m_LastParent || m_LastParent->m_Name != token) ) // if func has same name as current scope (class)
+ // pattern: AAA or AAA (...)
+ int id = m_TokenTree->TokenExists(token, -1, tkMacroDef);
+ // if AAA is a macro definition, then expand this macro
+ if (id != -1)
{
- // pattern: AAA (...)
- int id = m_TokenTree->TokenExists(token, -1, tkMacroDef);
- // if AAA is a macro definition, then expand this macro
- if (id != -1)
- {
- HandleMacroExpansion(id, peek);
- m_Str.Clear();
- }
- else
+ HandleMacroExpansion(id, peek);
+ }
+ // any function like pattern
+ else if ( (peek.GetChar(0) == ParserConsts::opbracket_chr)
+ && m_Options.handleFunctions )
+ {
+ if ( m_Str.IsEmpty()
+ && m_EncounteredNamespaces.empty()
+ && m_EncounteredTypeNamespaces.empty()
+ && (!m_LastParent || m_LastParent->m_Name != token) ) // if func has same name as current scope (class)
{
// see what is inside the (...)
wxString arg = m_Tokenizer.GetToken(); // eat args ()
@@ -1037,16 +1036,15 @@ void ParserThread::DoParse()
{
// pattern: AAA (* BBB) (...)
// where peek is (...) and arg is (* BBB)
- arg.Trim(true).RemoveLast();
- arg.Remove(0, pos+1);
- arg.Trim(true).Trim(false);
+ arg.RemoveLast();
+ arg.Remove(0, pos+1).Trim(false);
// NOTE: support func ptr in local block, show return type.
// if (!m_Options.useBuffer || m_Options.bufferSkipBlocks)
// HandleFunction(arg); // function
// AAA now becomes the last element of stacked type string
// which is the return type of function ptr
m_Str << token << ParserConsts::space_chr;
- // * BBB is now the function ptr's name
+ // BBB is now the function ptr's name
HandleFunction(/*function name*/ arg,
/*isOperator*/ false,
/*isPointer*/ true);
@@ -1056,9 +1054,6 @@ void ParserThread::DoParse()
else // wxString arg = m_Tokenizer.GetToken(); // eat args ()
m_Str = token + arg;
}
- }
- else if (peek.GetChar(0) == ParserConsts::opbracket_chr && m_Options.handleFunctions)
- {
// NOTE: support some more cases..., such as m_Str is not empty
// if (!m_Options.useBuffer || m_Options.bufferSkipBlocks)
// HandleFunction(token); // function
@@ -1068,42 +1063,65 @@ void ParserThread::DoParse()
// function ptr with pointer return type
// eg: void *(*Alloc)(void *p, size_t size);
// where, m_Str=void, token=(*Alloc), peek=(void *p, size_t size)
- if ( (m_LastToken == ParserConsts::ref_chr || m_LastToken == ParserConsts::ptr_chr) // (m_PointerOrRef)
- && (token.GetChar(0) == ParserConsts::opbracket_chr))
+ else if ( (m_LastToken == ParserConsts::ptr_chr) //(m_PointerOrRef)
+ && (token.GetChar(0) == ParserConsts::opbracket_chr) )
{
int pos = token.find(ParserConsts::ptr);
if (pos != wxNOT_FOUND)
{
wxString arg = token;
- arg.Trim(true).RemoveLast();
- arg.Remove(0, pos+1);
- arg.Trim(true).Trim(false);
+ arg.RemoveLast();
+ arg.Remove(0, pos+1).Trim(false);
HandleFunction(/*function name*/ arg,
/*isOperator*/ false,
/*isPointer*/ true);
+ m_Str.Clear();
}
}
- else if (!m_Options.useBuffer || m_Options.bufferSkipBlocks)
- {
- // pattern AAA BBB (...) in global namespace (not in local block)
- // so, this is mostly like a function declaration, but in-fact this
- // can also be a global variable initializized with ctor, but for
- // simplicity, we drop the later case
- HandleFunction(token); // function
- }
else
{
- // local variables initialized with ctor
- if (!m_Str.IsEmpty() && m_Options.handleVars)
+ // pattern unsigned int (*getClientLibVersion)(char** result);
+ // currently, m_Str = unsigned, token = int, peek = (*getClientLibVersion)
+ // this may be a function pointer declaration, we can guess without
+ // reading the next token, if "peek" has a ptr char and only 1 argument
+ // in it.
+
+ // see what is inside the (...)
+ // try to see whether the peek pattern is (* BBB)
+ wxString arg = peek;
+ arg.Remove(0,1); // remove '('
+ if (arg.GetChar(0) == ParserConsts::ptr)
{
- Token* newToken = DoAddToken(tkVariable, token, m_Tokenizer.GetLineNumber());
- if (newToken && !m_TemplateArgument.IsEmpty())
- ResolveTemplateArgs(newToken);
+ arg.RemoveLast();
+ arg.Remove(0,1).Trim(false); // remove '*'
+ m_Str << token;
+ token = m_Tokenizer.GetToken(); //consume the peek
+ // BBB is now the function ptr's name
+ HandleFunction(/*function name*/ arg,
+ /*isOperator*/ false,
+ /*isPointer*/ true);
+ }
+ else if (!m_Options.useBuffer || m_Options.bufferSkipBlocks)
+ {
+ // pattern AAA BBB (...) in global namespace (not in local block)
+ // so, this is mostly like a function declaration, but in-fact this
+ // can also be a global variable initializized with ctor, but for
+ // simplicity, we drop the later case
+ HandleFunction(token); // function
}
- m_Tokenizer.GetToken(); // eat args when parsing block
+ else
+ {
+ // local variables initialized with ctor
+ if (!m_Str.IsEmpty() && m_Options.handleVars)
+ {
+ Token* newToken = DoAddToken(tkVariable, token, m_Tokenizer.GetLineNumber());
+ if (newToken && !m_TemplateArgument.IsEmpty())
+ ResolveTemplateArgs(newToken);
+ }
+ m_Tokenizer.GetToken(); // eat args when parsing block
+ }
+ m_Str.Clear();
}
-
- m_Str.Clear();
}
else if ( (peek == ParserConsts::colon)
&& (token != ParserConsts::kw_private)
--
1.9.4.msysgit.0
--- End code ---
Huki:
--- Quote from: ollydbg on September 15, 2014, 07:00:24 am ---Maybe a better method is to try to expand every identifier-like token if possible. I remember you have a patch named "cc_parser_general.patch", but when I looked at that patch on my PC, I saw that it contains too many things (a lot of them are already in trunk).
To enable this feature, we can just do this patch:
--- Code: --- src/plugins/codecompletion/parser/tokenizer.cpp | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/src/plugins/codecompletion/parser/tokenizer.cpp b/src/plugins/codecompletion/parser/tokenizer.cpp
index f47e8dd..31f0546 100644
--- a/src/plugins/codecompletion/parser/tokenizer.cpp
+++ b/src/plugins/codecompletion/parser/tokenizer.cpp
@@ -1246,8 +1246,8 @@ wxString Tokenizer::DoGetToken()
void Tokenizer::ReplaceMacro(wxString& str)
{
// this indicates we are already in macro replacement mode
- if (m_RepeatReplaceCount > 0)
- {
+// if (m_RepeatReplaceCount > 0)
+// {
const int id = m_TokenTree->TokenExists(str, -1, tkMacroDef);
if (id != -1)
{
@@ -1270,8 +1270,9 @@ void Tokenizer::ReplaceMacro(wxString& str)
// if in macro expansion mode, we don't want to let the user replacement rule executed
// again, so just returned
return;
- }
+// }
+#if 0
wxStringHashMap::const_iterator it = s_Replacements.find(str);
if (it == s_Replacements.end())
return;
@@ -1344,6 +1345,7 @@ void Tokenizer::ReplaceMacro(wxString& str)
if (it->second != str && ReplaceBufferText(it->second, false))
str = DoGetToken();
}
+#endif
}
bool Tokenizer::CalcConditionExpression()
--- End code ---
Thus, we totally remove all the user-defined replacement rules.
I just tested the patch; the parsing time is a bit longer, but not by much. ;D
--- End quote ---
Sure, I think we can try it, but better to keep it for another commit. ;)
--- Quote ---Another smart method is to check for macro usage only on identifier-like tokens that consist entirely of capital letters and underscores.
--- End quote ---
Maybe it won't be required unless the parsing speed is too bad, but we will see..
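If that filter is ever needed, it could look roughly like the sketch below. This is only an illustration in plain C++: LooksLikeMacroName is a hypothetical helper, not an existing parser function, and allowing digits in the name is my own assumption on top of the "capitals or underscore" rule.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not part of the C::B parser): returns true if the
// identifier is made only of upper-case letters, digits and underscores,
// and contains at least one upper-case letter.
// Assumption: digits are allowed, so names like INT32_MAX also qualify.
bool LooksLikeMacroName(const std::string& ident)
{
    bool hasUpper = false;
    for (unsigned char ch : ident)
    {
        if (std::isupper(ch))
            hasUpper = true;
        else if (ch != '_' && !std::isdigit(ch))
            return false; // lower-case letter or any other character
    }
    return hasUpper;
}
```

Only tokens passing this check would then be looked up with TokenExists(token, -1, tkMacroDef), so most ordinary identifiers would skip the lookup. The obvious cost is that lower-case macros would no longer be expanded.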
ollydbg:
--- Quote from: Huki on September 18, 2014, 02:15:14 pm ---I think your patch will fix the problem, but maybe it's better to have all function pointers checking in one place (in DoParse()), then we just send the result to HandleFunction(). The problem is that we have a pattern like: AAA BBB (*name) (arg), where m_Str = AAA, token = BBB, peek = (*name), and we can't know if this is a function declaration or function ptr without reading the next token after peek.
But I think we can use another trick: strip the '(' in peek, and see if the next char is '*'. If it is, then it should be a function pointer. See this code:
--- Code: ---// pattern unsigned int (*getClientLibVersion)(char** result);
// currently, m_Str = unsigned, token = int, peek = (*getClientLibVersion)
// this may be a function pointer declaration, we can guess without
// reading the next token, if "peek" has a ptr char and only 1 argument
// in it.
// see what is inside the (...)
// try to see whether the peek pattern is (* BBB)
wxString arg = peek;
arg.Remove(0,1); // remove '('
if (arg.GetChar(0) == ParserConsts::ptr)
{
arg.RemoveLast();
arg.Remove(0,1).Trim(false); // remove '*'
m_Str << token;
token = m_Tokenizer.GetToken(); //consume the peek
// BBB is now the function ptr's name
HandleFunction(/*function name*/ arg,
/*isOperator*/ false,
/*isPointer*/ true);
}
else if (!m_Options.useBuffer || m_Options.bufferSkipBlocks)
// function declaration
else
// local variable initialized with ctor
--- End code ---
--- End quote ---
Correct, we should recognize the function pattern in DoParse(), not in HandleFunction().
--- Quote ---Btw, some notes about the function pointer handling:
1) I saw comments like: // *BBB is now the function ptr's name
In fact we strip the '*', so the function name is just BBB.
--- End quote ---
Correct, thanks.
--- Quote ---2) About the spaces trimming: I checked the ReadParantheses() function (used in DoGetToken()), and I see it guarantees that there is no space immediately after the '(' or before the ')'. So the only place we might have to trim spaces is after removing the '*' (eg, in (* BBB) there is a space before BBB).
So I think all other Trim() calls can be safely removed, like this:
Old code:
--- Code: ---arg.Trim(true).RemoveLast();
arg.Remove(0, pos+1);
arg.Trim(true).Trim(false);
--- End code ---
New code:
--- Code: ---arg.RemoveLast();
arg.Remove(0, pos+1).Trim(false);
--- End code ---
...
--- End quote ---
You are right, thanks.
--- Quote from: Huki on September 18, 2014, 02:28:31 pm ---
--- Quote ---
--- Quote from: ollydbg on September 12, 2014, 04:15:54 am ---BTW, maybe the two conditions:
[...]
Those two conditions can be merged or somewhat refactored, but I'm not quite sure. E.g. extract the macro usage handling, and merge the handling of function decl and function ptr into one condition. This can be a new commit. :D
--- End quote ---
I agree we can separate the macro handling and merge the function handling.
We can think about supporting more cases for macro handling too. We currently handle function-like macros, and only when m_Str is empty,
--- End quote ---
I have finally gotten around to doing it. See the result below.. :)
...
EDIT: and here is the patch:
...
--- End quote ---
I'm testing your patch now. It looks like macro expansion is enabled on every identifier-like token.
Huki:
--- Quote from: ollydbg on September 21, 2014, 09:32:48 am ---
--- Quote from: Huki on September 18, 2014, 02:15:14 pm ---I think your patch will fix the problem, but maybe it's better to have all function pointers checking in one place (in DoParse()), then we just send the result to HandleFunction(). The problem is that we have a pattern like: AAA BBB (*name) (arg), where m_Str = AAA, token = BBB, peek = (*name), and we can't know if this is a function declaration or function ptr without reading the next token after peek.
But I think we can use another trick: strip the '(' in peek, and see if the next char is '*'. If it is, then it should be a function pointer. See this code:
--- Code: ---// pattern unsigned int (*getClientLibVersion)(char** result);
// currently, m_Str = unsigned, token = int, peek = (*getClientLibVersion)
// this may be a function pointer declaration, we can guess without
// reading the next token, if "peek" has a ptr char and only 1 argument
// in it.
// see what is inside the (...)
// try to see whether the peek pattern is (* BBB)
wxString arg = peek;
arg.Remove(0,1); // remove '('
if (arg.GetChar(0) == ParserConsts::ptr)
{
arg.RemoveLast();
arg.Remove(0,1).Trim(false); // remove '*'
m_Str << token;
token = m_Tokenizer.GetToken(); //consume the peek
// BBB is now the function ptr's name
HandleFunction(/*function name*/ arg,
/*isOperator*/ false,
/*isPointer*/ true);
}
else if (!m_Options.useBuffer || m_Options.bufferSkipBlocks)
// function declaration
else
// local variable initialized with ctor
--- End code ---
--- End quote ---
Correct, we should recognize the function pattern in DoParse(), not in HandleFunction().
--- End quote ---
On second look I noticed we can do a little optimization. This code:
--- Code: ---wxString arg = peek;
arg.Remove(0,1); // remove '('
if (arg.GetChar(0) == ParserConsts::ptr)
{
arg.Remove(0,1).Trim(false); // remove '*'
[...]
}
--- End code ---
can be changed to:
--- Code: ---if (peek.GetChar(1) == ParserConsts::ptr)
{
wxString arg = peek;
arg.Remove(0,2).Trim(false); // remove "(*"
[...]
}
--- End code ---
So the entire code for this case:
--- Code: ---// see what is inside the (...)
// try to see whether the peek pattern is (* BBB)
if (peek.GetChar(1) == ParserConsts::ptr)
{
wxString arg = peek;
arg.RemoveLast(); // remove ")"
arg.Remove(0,2).Trim(false); // remove "(* "
m_Str << token;
token = m_Tokenizer.GetToken(); //consume the peek
// BBB is now the function ptr's name
HandleFunction(/*function name*/ arg,
/*isOperator*/ false,
/*isPointer*/ true);
}
else if [...]
--- End code ---
That way we don't have to create a temporary variable 'arg' every time we see a function declaration-like pattern, but only when we know for sure it can be a function pointer.
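To try the idea outside the parser, here is a minimal standalone sketch using std::string instead of wxString. FuncPtrNameFromPeek is a hypothetical helper written for this post, not an existing parser function. It assumes, as noted earlier for the tokenizer, that there is no space right after '(' or right before ')'.

```cpp
#include <string>

// Hypothetical helper (not part of the C::B parser): given a parenthesised
// token such as "(*name)" or "(* name)", return the function pointer name.
// Returns an empty string if the token does not start with "(*".
std::string FuncPtrNameFromPeek(const std::string& peek)
{
    // cheap checks first: shortest valid form is "(*x)", and the second
    // character must be '*'; no copy is made if this fails
    if (peek.size() < 4 || peek.front() != '(' || peek.back() != ')' || peek[1] != '*')
        return std::string();

    // copy only now: drop the leading "(*" and the trailing ")"
    std::string name = peek.substr(2, peek.size() - 3);

    // trim leading whitespace, e.g. the space in "(* BBB)"
    const std::string::size_type first = name.find_first_not_of(" \t");
    return first == std::string::npos ? std::string() : name.substr(first);
}
```

For an ordinary argument list like "(char** result)" the second character is not '*', so the function returns early without building the temporary string.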
--- Quote ---I'm testing your patch now. It looks like macro expansion is enabled on every identifier-like token.
--- End quote ---
Yes, when the (!switchHandled) case is reached in DoParse(), it now expands all macros, including variable-like ones.