I found that the parser stopped parsing in the comment blocks, and only two variables (int a, int b) were recognized.
#include <iostream>
using namespace std;

int a;
int b;
///remove "//" or "/*" blocks
int c;
int d;
int main()
{
cout << "Hello world!" << endl;
return 0;
}
thanks :D
I found that the parser stopped parsing in the comment blocks, and only two variables (int a, int b) were recognized.
That is correct. C::B recognises the /* as the beginning of a multi-line comment which never ends.
Why do we search for nested braces or other comments while we are skipping to end-of-line inside a comment?
As far as I know, there are only two things that can end a c++-style comment: a newline or EOF.
In other words, what about just "eating" all chars until EOL or EOF?
A patch could look like this:
Index: src/plugins/codecompletion/parser/tokenizer.cpp
===================================================================
--- src/plugins/codecompletion/parser/tokenizer.cpp (Revision 5482)
+++ src/plugins/codecompletion/parser/tokenizer.cpp (Arbeitskopie)
@@ -279,18 +279,21 @@
{
while (NotEOF() && CurrentChar() != '\n')
{
- if (CurrentChar() == '/' && NextChar() == '*')
+ if(!skippingComment)
{
- SkipComment(false); // don't skip whitespace after the comment
- if (skippingComment && CurrentChar() == '\n')
+ if (CurrentChar() == '/' && NextChar() == '*')
{
- continue; // early exit from the loop
+ SkipComment(false); // don't skip whitespace after the comment
+ if (skippingComment && CurrentChar() == '\n')
+ {
+ continue; // early exit from the loop
+ }
}
+ if (nestBraces && CurrentChar() == _T('{'))
+ ++m_NestLevel;
+ else if (nestBraces && CurrentChar() == _T('}'))
+ --m_NestLevel;
}
- if (nestBraces && CurrentChar() == _T('{'))
- ++m_NestLevel;
- else if (nestBraces && CurrentChar() == _T('}'))
- --m_NestLevel;
MoveToNextChar();
}
wxChar last = PreviousChar();
The patch looks a little bit "unclear", but this is how TortoiseSVN handles changed indentation, at least on my XP.
In fact I only added one if-clause with two braces.
If I have overlooked or am totally missing something, please correct me.
Shouldn't it be
while (NotEOF() && CurrentChar() != '\n' && CurrentChar() != '\r')
for Mac users?
Dje
In other words, what about just "eating" all chars until EOL or EOF?
Nope - won't work. Consider this:
void MyFun(bool myParam /* = true */, int MyOtherParam /* = 0 */)
{
int a /* could be b */ = 1; /* probably 0 */
int b; /* Descr:
* Nice!
*/ return;
string s = "hello \
world";
}
...unless I am missing something...
(Will try the patch though...)
Something else that came into my mind just by now:
Why don't we "pre-process" the buffer before CC analyses it, removing comments completely? Commented-out text is just useless for CC (unless we want to consider Doxygen comments or the like), and operating on the whole buffer could probably work with a "simple" RegEx. In the end we would make a lot of comment-checking code obsolete.
The comment-checking code is actually not that difficult.
I checked the source code. Each time we call Tokenizer::DoGetToken(), it first tries to strip comments:
wxString Tokenizer::DoGetToken()
{
if (IsEOF())
return wxEmptyString;
if (!SkipWhiteSpace())
return wxEmptyString;
if (m_SkipUnwantedTokens && !SkipUnwanted()) // ****************Here
return wxEmptyString;
// if m_SkipUnwantedTokens is false, we need to handle comments here too
if (!m_SkipUnwantedTokens)
SkipComment(); //*****************Here
........
If m_SkipUnwantedTokens is true (the normal situation), SkipComment() will be called inside SkipUnwanted().
If m_SkipUnwantedTokens is false (a special situation, such as eating the arguments of a template), we shouldn't call SkipUnwanted(), so SkipComment() is called manually.
With all the steps above, I think comments can be stripped quite well. :D Am I right? If wrong, please correct me. Thank you!