Author Topic: The 25 september 2010 build (6634) CODECOMPLETION BRANCH version is out.  (Read 115521 times)

Offline Loaden

  • Lives here!
  • ****
  • Posts: 1014
Re: The 25 september 2010 build (6634) CODECOMPLETION BRANCH version is out.
« Reply #60 on: October 06, 2010, 10:28:33 am »
Works now:
Code
Project 'test' parsing stage done (30453 total parsed files, 1347032 tokens in 42 minute(s), 24.656 seconds).

Hi jens, do you remember how long the batch parse took to parse the whole Linux source when you last did this (if I remember correctly, maybe a year ago)?

No, and it would not tell us anything about increasing or decreasing parse time, because this time I parsed the 2.6.35 kernel sources; last time it was 2.6.29, with only 21k files and a little more than a million tokens. But I will test with the current trunk and give you feedback.
There may be a large number of duplicate tokens, so the token count will not be directly proportional to the parse time.

Offline polygon7

  • Multiple posting newcomer
  • *
  • Posts: 104
    • Home site
« Reply #61 on: October 06, 2010, 02:47:53 pm »
Hi,
I'm testing the C::B CC branch on another big project, OpenOffice, and after ~20 minutes the CC parser hangs at:

Code
InitTokenizer() : m_Filename='/usr/include/bits/time.h', m_FileSize=2866.
Parse() : Parsing '/usr/include/bits/time.h'
DoParse() : Loop:m_Str='', token='#'
DoParse() : Loop:m_Str='', token='#'
DoAddToken() : Created token='_STRUCT_TIMEVAL', file_idx=12732, line=70, ticket=
GetActualTokenType() : Searching within m_Str='1'
GetActualTokenType() : Compensated m_Str='1'
GetActualTokenType() : Found '1'
DoAddToken() : Prepending ''
DoParse() : Loop:m_Str='', token='#'
HandleIncludes() : Found include file 'bits/types.h'
DoParse() : Loop:m_Str='', token='struct'
HandleClass() : Found class 'timeval'
DoAddToken() : Created token='timeval', file_idx=12732, line=75, ticket=
GetActualTokenType() : Searching within m_Str=''
GetActualTokenType() : Compensated m_Str=''
GetActualTokenType() : Returning ''
DoAddToken() : Prepending ''
DoAddToken() : Added/updated token 'timeval' (353789), type '', actual ''. Parent is  (-1)
DoParse() : Loop:m_Str='', token='__time_t'
DoParse() : Loop:m_Str='__time_t ', token='tv_sec'
DoAddToken() : Created token='tv_sec', file_idx=12732, line=77, ticket=
GetActualTokenType() : Searching within m_Str='__time_t'
GetActualTokenType() : Compensated m_Str='__time_t'
GetActualTokenType() : Found '__time_t'
DoAddToken() : Prepending ''
DoAddToken() : Added/updated token 'tv_sec' (353790), type '__time_t', actual '__time_t'. Parent is timeval (353789)
DoParse() : Loop:m_Str='__time_t', token=';'
DoParse() : Loop:m_Str='', token='__suseconds_t'
DoParse() : Loop:m_Str='__suseconds_t ', token='tv_usec'
DoAddToken() : Created token='tv_usec', file_idx=12732, line=78, ticket=
GetActualTokenType() : Searching within m_Str='__suseconds_t'
GetActualTokenType() : Compensated m_Str='__suseconds_t'
GetActualTokenType() : Found '__suseconds_t'
DoAddToken() : Prepending ''
DoAddToken() : Added/updated token 'tv_usec' (353791), type '__suseconds_t', actual '__suseconds_t'. Parent is timeval (353789)
DoParse() : Loop:m_Str='__suseconds_t', token=';'
DoParse() : Loop:m_Str='', token='}'
InitTokenizer() : m_Filename='/usr/include/bits/dirent.h', m_FileSize=1609.
Parse() : Parsing '/usr/include/bits/dirent.h'
DoParse() : Loop:m_Str='', token='struct'
HandleClass() : Found class 'dirent'
C++ Parser is still parsing files...
C++ Parser is still parsing files...
C++ Parser is still parsing files...

Code
Tasks: 143 total,   1 running, 142 sleeping,   0 stopped,   0 zombie
Cpu(s): 60.8%us,  3.0%sy,  0.0%ni, 36.0%id,  0.2%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   2062772k total,  1969520k used,    93252k free,   145608k buffers
Swap:  2104476k total,     9292k used,  2095184k free,   664664k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                                                                            
15947 1234   20   0  824m 573m  34m S  102 28.5  19:55.35 codeblocks


// EDIT: C::B CC branch rev 6671
best regards,
p7

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5906
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
« Reply #62 on: October 06, 2010, 02:53:19 pm »
@polygon7
From the log, I can't find any hint of where an infinite loop might be; it seems the token line just gets bigger and bigger.

But there must be something wrong in ParserThread; we will check it.
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Offline Jenna

  • Administrator
  • Lives here!
  • *****
  • Posts: 7255
« Reply #63 on: October 06, 2010, 03:03:29 pm »
Without my changes (a static variable for debug and some if-clauses in the define), I can no longer use C::B to parse the 2.6.35 kernel.
If I run it from the command line, it eats up all my memory (up to ~4 GB) and after some time it crashes with an X Window error (resource temporarily unavailable); if I run it through the debugger, it crashes with a segfault in wxPostEvent when it tries to send a TaskDone event.

Offline Jenna

  • Administrator
  • Lives here!
  • *****
  • Posts: 7255
« Reply #64 on: October 06, 2010, 03:05:30 pm »
Here are the results of measuring and the last lines of debug-output before it crashes:

Code
trunk:
[...]
33115 files loaded
Done loading project in 84084ms
[...]
Parsing stage done (33919 total parsed files, 1193078 tokens in 1 minute(s), 59.116 seconds).


"Pure" cc:
[...]
33115 files loaded
Done loading project in 82294ms
[...]

DoAddToken() : Created token='__xl', file_idx=761, line=27, ticket=
GetActualTokenType() : Searching within m_Str='"r0"'
GetActualTokenType() : Compensated m_Str='"r0"'
GetActualTokenType() : Found ''
DoAddToken() : Prepending ''
DoParse() : Loop:m_Str='', token='#'
wxString Tokenizer::ReadToEOL(bool, bool) : line=28, CurrentChar='
', PreviousChar='"', NextChar='#', nestBrace(1)
ReadToEOL(): (END) We are now at line 28, CurrentChar='
', PreviousChar='"', NextChar='#'
ReadToEOL():
DoAddToken() : Created token='__xh', file_idx=761, line=28, ticket=
GetActualTokenType() : Searching within m_Str='"r1"'
GetActualTokenType() : Compensated m_Str='"r1"'
GetActualTokenType() : Found ''
DoAddToken() : Prepending ''
HandleConditionPreprocessor() : #endif at line = 29
bool Tokenizer::SkipToEOL(bool) : line=29, CurrentChar='
', PreviousChar='f', NextChar='
', nestBrace(0)
SkipToEOL(): (END) We are now at line 29, CurrentChar='
', PreviousChar='f', NextChar='
'
DoParse() : Loop:m_Str='', token='#'
ReadParentheses(): (n, base), line=31
wxString Tokenizer::ReadToEOL(bool, bool) : line=31, CurrentChar='      ', PreviousChar=')', NextChar=' ', nestBrace(1)
ReadToEOL(): (END) We are now at line 47, CurrentChar='
', PreviousChar=')', NextChar='
'
ReadToEOL():                                    ({                                                                      register unsigned int __base asm("r4") = base;          register unsigned long long __n asm("r0") = n;          register unsigned long long __res asm("r2");                    register unsigned int __rem asm(__xh);                   asm(    __asmeq("%0", __xh)                                             __asmeq("%1", "r2")                                             __asmeq("%2", "r0")                                             __asmeq("%3", "r4")                                             "bl     __do_div64"                                              : "=r" (__rem), "=r" (__res)                                    : "r" (__n), "r" (__base)                                       : "ip", "lr", "cc");                                    n = __res;                                                      __rem;  })
DoAddToken() : Created token='__do_div_asm', file_idx=761, line=31, ticket=
GetActualTokenType() : Searching within m_Str='                                 ({                                                                      register unsigned int __base asm("r4") = base;          register unsigned long long __n asm("r0") = n;          register unsigned long long __res asm("r2");    register unsigned int __rem asm(__xh);                   asm(    __asmeq("%0", __xh)                                             __asmeq("%1", "r2")                                             __asmeq("%2", "r0")                                             __asmeq("%3", "r4")                                     "bl      __do_div64"                                             : "=r" (__rem), "=r" (__res)                                    : "r" (__n), "r" (__base)                                       : "ip", "lr", "cc");                                    n = __res;                                                      __rem;                                                   })'
GetActualTokenType() : Compensated m_Str='                                      ({                                                                      register unsigned int __base asm("r4") = base;          register unsigned long long __n asm("r0") = n;          register unsigned long long __res asm("r2");    register unsigned int __rem asm(__xh);                   asm(    __asmeq("%0", __xh)                                             __asmeq("%1", "r2")                                             __asmeq("%2", "r0")                                             __asmeq("%3", "r4")                                     "bl      __do_div64"                                             :"=r" (__rem), "=r" (__res)                                     :"r" (__n), "r" (__base)                                        :"ip", "lr", "cc");                                     n = __res;                                                      __rem;                                                   })'
GetActualTokenType() : Found ''
DoAddToken() : Prepending ''
HandleConditionPreprocessor() : #if at line = 49
bool Tokenizer::SkipToEOL(bool) : line=49, CurrentChar=' ', PreviousChar='f', NextChar='_', nestBrace(0)
SkipToEOL(): (END) We are now at line 49, CurrentChar='
', PreviousChar='4', NextChar='
'
CalcConditionExpression() : exp.GetStatus() : 1, exp.GetResult() : 0
HandleConditionPreprocessor() : #elif at line = 61
bool Tokenizer::SkipToEOL(bool) : line=61, CurrentChar=' ', PreviousChar='f', NextChar='_', nestBrace(0)
SkipToEOL(): (END) We are now at line 61, CurrentChar='
', PreviousChar='4', NextChar='
'
CalcConditionExpression() : exp.GetStatus() : 1, exp.GetResult() : 1
DoParse() : Loop:m_Str='', token='#'
HandleIncludes() : Found include file 'asm/bug.h'
DoParse() : Loop:m_Str='', token='#'
ReadParentheses(): (n, base), line=73
wxString Tokenizer::ReadToEOL(bool, bool) : line=73, CurrentChar='      ', PreviousChar=')', NextChar=' ', nestBrace(1)
ReadToEOL(): (END) We are now at line 211, CurrentChar='
', PreviousChar=')', NextChar='
'
 

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5906
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
« Reply #65 on: October 07, 2010, 01:50:43 am »
Thanks, jens, for the test.
From the "pure CC" log before the crash, I can't find anything wrong; the log looks correct (ParserThread works fine right up to the crash).
So the problem must have happened earlier, before the crash.

@loaden, maybe we could add an option to disable the macro expansion, and then see whether it still hangs or crashes in this case.

Offline Jenna

  • Administrator
  • Lives here!
  • *****
  • Posts: 7255
« Reply #66 on: October 07, 2010, 07:20:09 am »
Thanks, jens, for the test.
From the "pure CC" log before the crash, I can't find anything wrong; the log looks correct (ParserThread works fine right up to the crash).
So the problem must have happened earlier, before the crash.

@loaden, maybe we could add an option to disable the macro expansion, and then see whether it still hangs or crashes in this case.

My guess:
the crash happens because the application runs out of memory.

If I run it from the command line, it eats up all my memory (up to ~4 GB) and after some time it crashes with an X Window error (resource temporarily unavailable); if I run it through the debugger, it crashes with a segfault in wxPostEvent when it tries to send a TaskDone event.

The crash occurs when I switch to the running C::B; if it is visible and I run it from inside C::B (through gdb), it crashes in wxPostEvent.
Maybe the size of the used buffers should also be traced.

Offline Loaden

  • Lives here!
  • ****
  • Posts: 1014
« Reply #67 on: October 07, 2010, 07:53:21 am »
Maybe the size of the used buffers should also be traced.
Yes, I'll check carefully.

Offline Loaden

  • Lives here!
  • ****
  • Posts: 1014
« Reply #68 on: October 07, 2010, 03:02:02 pm »
Hi Jens, could you try r6676?
I am testing on XP, and it seems to be solved.
Quote
Project 'linux' parsing stage done (27214 total parsed files, 1342720 tokens in 17 minute(s), 37.094 seconds).

Offline Loaden

  • Lives here!
  • ****
  • Posts: 1014
« Reply #69 on: October 07, 2010, 03:29:35 pm »
If the option "Editor > CC > C/C++ Parser > Parse complex macros" is unchecked, parsing takes less time.
Quote
Project 'linux' parsing stage done (27214 total parsed files, 1258851 tokens in 8 minute(s), 54.969 seconds).

Offline Jenna

  • Administrator
  • Lives here!
  • *****
  • Posts: 7255
« Reply #70 on: October 07, 2010, 11:17:22 pm »
linux-kernel 2.6.29
with complex macros:
Code
Project 'test' parsing stage done (21827 total parsed files, 1011919 tokens in 22 minute(s), 2.405 seconds).
without complex macros:
Code
Project 'test' parsing stage done (21827 total parsed files, 951347 tokens in 3 minute(s), 58.764 seconds).

linux-kernel 2.6.35
with complex macros:
Code
Project 'test' parsing stage done (30453 total parsed files, 1343328 tokens in 41 minute(s), 5.657 seconds).
without complex macros and without parsing preprocessor:
Code
Project 'test' parsing stage done (30453 total parsed files, 1259360 tokens in 4 minute(s), 50.464 seconds).

The second run (parsing without complex macros) was done immediately after closing the project, so most of the files were still in the system's disk buffer.
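
To compare the runs quoted above on a common scale, the figures can be pulled out of each "parsing stage done" line and turned into tokens per second. The following is only a rough sketch: `ParseStats`, `ParseStatsFromLog` and `TokensPerSecond` are hypothetical helper names, not part of Code::Blocks, and the line format is assumed to be exactly the one shown in the logs in this thread.

```cpp
#include <cstdio>
#include <string>

// Figures extracted from one "parsing stage done" log line.
struct ParseStats
{
    long   files;
    long   tokens;
    double seconds;   // minutes and seconds folded into a single value
};

// Hypothetical helper (not a Code::Blocks function): reads the numbers out of
// a line shaped like the ones quoted in this thread. Returns false if the
// line does not match that shape.
bool ParseStatsFromLog(const std::string& line, ParseStats& out)
{
    const std::string::size_type open = line.find('(');
    if (open == std::string::npos)
        return false;

    long files = 0, tokens = 0, minutes = 0;
    double secs = 0.0;
    const int matched = std::sscanf(line.c_str() + open,
        "(%ld total parsed files, %ld tokens in %ld minute(s), %lf seconds)",
        &files, &tokens, &minutes, &secs);
    if (matched != 4)
        return false;

    out.files   = files;
    out.tokens  = tokens;
    out.seconds = minutes * 60.0 + secs;
    return true;
}

// Throughput in tokens per second; 0 if no time was recorded.
double TokensPerSecond(const ParseStats& s)
{
    return s.seconds > 0.0 ? s.tokens / s.seconds : 0.0;
}
```

Fed the 2.6.35 line with complex macros (1343328 tokens in 41 min 5.657 s), this gives roughly 545 tokens per second; the line without complex macros (1259360 tokens in 4 min 50.464 s) gives roughly 4300. Since that second run also had warm disk buffers, the ratio should be read as an upper bound on what the macro option alone buys.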

Parsing the linux-kernel 2.6.35 only works with debug output enabled and redirected to a file (C::B uses up to 2.5 GB of memory and does not release it at project unload).
Running it from within C::B or directly from the console either crashes C::B or freezes the whole system.

By the way,  the following patch is needed to make C::B compile without a warning if CC_XXX_DEBUG_OUTPUT is set to 2.
I did not change all places (should all be checked I think), only the places which lead to a compiler warning:
Code
Index: src/plugins/codecompletion/parser/token.cpp
===================================================================
--- src/plugins/codecompletion/parser/token.cpp (revision 6677)
+++ src/plugins/codecompletion/parser/token.cpp (working copy)
@@ -1027,10 +1027,10 @@
         }
     }
 
-#if CC_TOKEN_DEBUG_OUTPUT
-    TRACE(_T("RecalcInheritanceChain() : First iteration took : %ld ms"), sw.Time());
-    sw.Start();
-#endif
+//#if CC_TOKEN_DEBUG_OUTPUT
+//    TRACE(_T("RecalcInheritanceChain() : First iteration took : %ld ms"), sw.Time());
+//    sw.Start();
+//#endif
 
     // recalc
     TokenIdxSet result;
@@ -1057,16 +1057,20 @@
         {
             Token* anc_token = at(*it);
             if (anc_token)
+            {
                 TRACE(_T("RecalcInheritanceChain() :  + %s"), anc_token->m_Name.wx_str());
+            }
             else
+            {
                 TRACE(_T("RecalcInheritanceChain() :  + NULL?!"));
+            }
         }
     }
 #endif
 
-#if CC_TOKEN_DEBUG_OUTPUT
-    TRACE(_T("RecalcInheritanceChain() : Second iteration took : %ld ms"), sw.Time());
-#endif
+//#if CC_TOKEN_DEBUG_OUTPUT
+//    TRACE(_T("RecalcInheritanceChain() : Second iteration took : %ld ms"), sw.Time());
+//#endif
 
     TRACE(_T("RecalcInheritanceChain() : Full inheritance calculated."));
 }
@@ -1215,9 +1219,13 @@
             {
                 Token* anc_token = at(*it);
                 if (anc_token)
+                {
                     TRACE(_T("RecalcData() :  + %s"), anc_token->m_Name.wx_str());
+                }
                 else
+                {
                     TRACE(_T("RecalcData() :  + NULL?!"));
+                }
             }
         }
 #endif
Index: src/plugins/codecompletion/parser/parserthread.cpp
===================================================================
--- src/plugins/codecompletion/parser/parserthread.cpp (revision 6677)
+++ src/plugins/codecompletion/parser/parserthread.cpp (working copy)
@@ -1197,8 +1197,10 @@
 
         newToken = new(std::nothrow) Token(newname, m_FileIdx, line, ++m_pTokensTree->m_TokenTicketCount);
         if (newToken)
+        {
             TRACE(_T("DoAddToken() : Created token='%s', file_idx=%d, line=%d, ticket="), newname.wx_str(),
                   m_FileIdx, line, m_pTokensTree->m_TokenTicketCount);
+        }
         else
         {
             --m_pTokensTree->m_TokenTicketCount;

The commented out part (with the wxStopWatch) should probably be corrected instead of just commented out (at least if it still makes sense to measure the times).
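
The braces added in the patch above guard against a general macro pitfall. As a minimal sketch, assuming the level-2 `TRACE` expands to a bare `if` around the logging call (the real definition is not shown in this thread), an unbraced `if (x) TRACE(...); else TRACE(...);` lets the `else` bind to the macro's inner `if`, which is what GCC warns about. Wrapping the macro body in `do { ... } while (false)` is a common alternative to bracing every call site. All names below (`g_EnableTrace`, `g_Log`, `TRACE_SAFE`, `DescribeToken`) are hypothetical stand-ins.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the runtime debug switch and the real logger.
static bool g_EnableTrace = true;
static std::vector<std::string> g_Log;

// Fragile form (shown only as a comment): it expands to a bare `if`, so an
// `else` written after the call would attach to this inner `if`.
//   #define TRACE_BARE(msg) if (g_EnableTrace) g_Log.push_back(msg)

// Robust form: do { ... } while (false) turns the expansion into exactly one
// statement, so it is safe inside an unbraced if/else.
#define TRACE_SAFE(msg) \
    do { if (g_EnableTrace) g_Log.push_back(msg); } while (false)

// Mirrors the shape of the patched code: one message for a valid token name,
// another for a null pointer. Returns the message that was logged, or an
// empty string when tracing is switched off.
std::string DescribeToken(const char* name)
{
    g_Log.clear();
    if (name)
        TRACE_SAFE(std::string("+ ") + name);
    else
        TRACE_SAFE("+ NULL?!");
    return g_Log.empty() ? std::string() : g_Log.back();
}
```

With the `do`/`while` wrapper, the `else` pairs with the outer `if (name)` as the author intended, whether or not the call sites carry braces.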

Offline Loaden

  • Lives here!
  • ****
  • Posts: 1014
« Reply #71 on: October 08, 2010, 07:12:06 am »
By the way,  the following patch is needed to make C::B compile without a warning if CC_XXX_DEBUG_OUTPUT is set to 2.
I did not change all places (should all be checked I think), only the places which lead to a compiler warning:
Fixed in r6678.

Offline Loaden

  • Lives here!
  • ****
  • Posts: 1014
« Reply #72 on: October 08, 2010, 07:44:56 am »
The second run (parsing without complex macros) was done immediately after closing the project, so most of the files were still in the system's disk buffer.
No, because the ParserThread instances for these files are not created.

Parsing the linux-kernel 2.6.35 only works with debug output enabled and redirected to a file (C::B uses up to 2.5 GB of memory and does not release it at project unload).
Running it from within C::B or directly from the console either crashes C::B or freezes the whole system.
I cannot reproduce this on Windows; could you help find which file is causing the problem?
Thanks!

Offline Borr

  • Multiple posting newcomer
  • *
  • Posts: 29
« Reply #73 on: October 08, 2010, 09:23:06 am »
CB_CC_BRANCH_r6675 can't parse ole2.h with MinGW

Code
#include <ole2.h>
...
VARIANT param;
VariantInit(&param);
param./*Ctrl-Space show only dblVal*/

Offline Loaden

  • Lives here!
  • ****
  • Posts: 1014
« Reply #74 on: October 08, 2010, 11:27:58 am »
CB_CC_BRANCH_r6675 can't parse ole2.h with MinGW

Code
#include <ole2.h>
...
VARIANT param;
VariantInit(&param);
param./*Ctrl-Space show only dblVal*/

Sorry, this is too complex; in my personal opinion, there are no plans to support it yet.
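
For context on why a type like VARIANT is hard for a completion parser: as far as I know, the Windows declaration nests anonymous unions and structs, so the value members (`lVal`, `dblVal`, and so on) are reachable directly on the variable even though they are declared several levels deep. The sketch below is a heavily simplified stand-in, not the real declaration from the MinGW headers; `MiniVariant`, `MiniType`, `MakeLong`, `MakeDouble` and `AsDouble` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical type tags; the two values mirror VT_I4 and VT_R8.
enum MiniType : std::uint16_t { MT_I4 = 3, MT_R8 = 5 };

// Simplified stand-in for a VARIANT-like tagged union.
struct MiniVariant
{
    std::uint16_t vt;        // tag saying which union member is active
    union                    // anonymous union: its members are injected
    {                        // into MiniVariant's own scope
        std::int32_t lVal;
        double       dblVal;
    };
};

MiniVariant MakeLong(std::int32_t value)
{
    MiniVariant v;
    v.vt   = MT_I4;
    v.lVal = value;          // accessed as v.lVal, with no union name in between
    return v;
}

MiniVariant MakeDouble(double value)
{
    MiniVariant v;
    v.vt     = MT_R8;
    v.dblVal = value;
    return v;
}

// Reads only the member that the tag marks as active.
double AsDouble(const MiniVariant& v)
{
    switch (v.vt)
    {
        case MT_I4: return static_cast<double>(v.lVal);
        case MT_R8: return v.dblVal;
        default:    return 0.0;
    }
}
```

To offer every member after `param.`, a parser has to merge the members of each nested anonymous aggregate into the enclosing type's scope; the real headers add a further layer by wrapping some of these declarations in macros.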