Author Topic: The 25 september 2010 build (6634) CODECOMPLETION BRANCH version is out.  (Read 129929 times)

Offline Loaden

  • Lives here!
  • ****
  • Posts: 1014
Re: The 25 september 2010 build (6634) CODECOMPLETION BRANCH version is out.
« Reply #30 on: October 03, 2010, 11:56:42 am »
@ Loaden:
I tested the patch; here is my feedback.

The first endless loop seems to be fixed; now we get another one a few lines later.

Code
ReplaceBufferForReparse() : <FROM>printf<TO>printk
ReplaceBufferForReparse() : <FROM>printk<TO>printf
tk->m_Type is printf and token is printk and vice versa.

Can this be handled in ReplaceBufferForReparse, so it only has to be done in one place?
This seems very strange! Jens, could you give me more information?

Offline Jenna

  • Administrator
  • Lives here!
  • *****
  • Posts: 7252
Re: The 25 september 2010 build (6634) CODECOMPLETION BRANCH version is out.
« Reply #31 on: October 03, 2010, 02:40:27 pm »
There are several headers with #define printf printk and some others with #define printk printf.

As before I am not able to strip the sources down to a small project (the actual one uses 47 MB if it is 7zipped).

After commenting out all #define printf's I get:
Code
ReplaceBufferForReparse() : <FROM>printk<TO>printf
ReplaceBufferForReparse() : <FROM>(             printf<TO>(args...)
ReplaceBufferForReparse() : <FROM>args...)(args)<TO>            printk(args)
ReplaceBufferForReparse() : <FROM>printk<TO>printf
ReplaceBufferForReparse() : <FROM>(             printf<TO>(args...)
ReplaceBufferForReparse() : <FROM>args...)(args)<TO>            printk(args)
ReplaceBufferForReparse() : <FROM>printk<TO>printf
ReplaceBufferForReparse() : <FROM>(             printf<TO>(args...)
Endless as before; m_TokenIndex does not change when I set a breakpoint before ReplaceBufferForReparse(tk->m_Type, false);

Offline Loaden

  • Lives here!
  • ****
  • Posts: 1014
Re: The 25 september 2010 build (6634) CODECOMPLETION BRANCH version is out.
« Reply #32 on: October 03, 2010, 04:14:27 pm »
@ Loaden:
I tested the patch; here is my feedback.

The first endless loop seems to be fixed; now we get another one a few lines later.

Code
ReplaceBufferForReparse() : <FROM>printf<TO>printk
ReplaceBufferForReparse() : <FROM>printk<TO>printf
tk->m_Type is printf and token is printk and vice versa.

Can this be handled in ReplaceBufferForReparse, so it only has to be done in one place?
Thanks! This code sends C::B into an endless loop.
I will fix it soon.
Code
#define AAA(x) BBB(x)
#define BBB(y) AAA(y)

#if AAA(1) && BBB(2)
void fly();
#else
void good();
#endif

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 6034
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: The 25 september 2010 build (6634) CODECOMPLETION BRANCH version is out.
« Reply #33 on: October 03, 2010, 04:22:35 pm »
Thanks! This code sends C::B into an endless loop.
I will fix it soon.
Code
#define AAA(x) BBB(x)
#define BBB(y) AAA(y)

#if AAA(1) && BBB(2)
void fly();
#else
void good();
#endif

But the code above does not compile. :D
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Offline Loaden

  • Lives here!
  • ****
  • Posts: 1014
Re: The 25 september 2010 build (6634) CODECOMPLETION BRANCH version is out.
« Reply #34 on: October 03, 2010, 04:30:05 pm »
Thanks! This code sends C::B into an endless loop.
I will fix it soon.
Code
#define AAA(x) BBB(x)
#define BBB(y) AAA(y)

#if AAA(1) && BBB(2)
void fly();
#else
void good();
#endif

But the code above does not compile. :D
But we still have to avoid every possible infinite loop. :lol:

Offline Jenna

  • Administrator
  • Lives here!
  • *****
  • Posts: 7252
Re: The 25 september 2010 build (6634) CODECOMPLETION BRANCH version is out.
« Reply #35 on: October 03, 2010, 04:34:37 pm »
Thanks! This code sends C::B into an endless loop.
I will fix it soon.
Code
#define AAA(x) BBB(x)
#define BBB(y) AAA(y)

#if AAA(1) && BBB(2)
void fly();
#else
void good();
#endif

But the code above does not compile. :D

The kernel code surely does not compile either (all files were added to a C::B project without running any configuration scripts and without using makefiles), but parsing the sources leads to an endless loop, and that should (of course) not happen, even if the code has errors.

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 6034
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: The 25 september 2010 build (6634) CODECOMPLETION BRANCH version is out.
« Reply #36 on: October 03, 2010, 04:40:32 pm »
The kernel code surely does not compile either (all files were added to a C::B project without running any configuration scripts and without using makefiles), but parsing the sources leads to an endless loop, and that should (of course) not happen, even if the code has errors.

But we still have to avoid every possible infinite loop. :lol:

Yes, 100% agree.
Maybe we could limit the macro expansion level to avoid the infinite recursion.
E.g., once a macro definition has been used in an expansion, it is never used again at a deeper level of that same expansion.
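
To make the idea concrete, here is a minimal sketch of such a guard. It is not the actual CC code: `MacroTable`, `ExpandTokens` and `Expand` are hypothetical names, and it only handles object-like macros on whitespace-separated tokens. A macro that is already being expanded is copied through unchanged, which is the same rule the C preprocessor applies to self-referential macros.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Hypothetical macro table: macro name -> replacement text (object-like macros only).
using MacroTable = std::map<std::string, std::string>;

// Expands whitespace-separated tokens. `active` holds the macros that are being
// expanded right now; a name found in it is copied through instead of expanded.
std::string ExpandTokens(const std::string& text, const MacroTable& macros,
                         std::set<std::string>& active)
{
    // The guard bounds the nesting depth by the number of defined macros.
    assert(active.size() <= macros.size());

    std::istringstream in(text);
    std::string token, out;
    while (in >> token)
    {
        std::string piece = token;
        const auto it = macros.find(token);
        if (it != macros.end() && active.count(token) == 0)
        {
            active.insert(token);   // block this macro at deeper levels
            piece = ExpandTokens(it->second, macros, active);
            active.erase(token);    // later sibling tokens may use it again
        }
        if (piece.empty())
            continue;               // macro defined to nothing
        if (!out.empty())
            out += ' ';
        out += piece;
    }
    return out;
}

// Convenience wrapper that starts with no macro being expanded.
std::string Expand(const std::string& text, const MacroTable& macros)
{
    std::set<std::string> active;
    return ExpandTokens(text, macros, active);
}
```

With a table holding both `printf -> printk` and `printk -> printf`, `Expand("printf ( args )", table)` walks printf, then printk, then stops at the blocked printf and returns, so the ping-pong from Jens' log cannot go on forever.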

Offline Loaden

  • Lives here!
  • ****
  • Posts: 1014
Re: The 25 september 2010 build (6634) CODECOMPLETION BRANCH version is out.
« Reply #37 on: October 03, 2010, 04:56:10 pm »
Maybe we could limit the macro expansion level to avoid the infinite recursion.
Agreed! This is a good idea. :D

Offline Loaden

  • Lives here!
  • ****
  • Posts: 1014
Re: The 25 september 2010 build (6634) CODECOMPLETION BRANCH version is out.
« Reply #38 on: October 03, 2010, 05:30:52 pm »
The kernel code surely does not compile either (all files were added to a C::B project without running any configuration scripts and without using makefiles), but parsing the sources leads to an endless loop, and that should (of course) not happen, even if the code has errors.
Hi Jens, could you please try this patch?

Offline Jenna

  • Administrator
  • Lives here!
  • *****
  • Posts: 7252
Re: The 25 september 2010 build (6634) CODECOMPLETION BRANCH version is out.
« Reply #39 on: October 03, 2010, 11:03:05 pm »
The kernel code surely does not compile either (all files were added to a C::B project without running any configuration scripts and without using makefiles), but parsing the sources leads to an endless loop, and that should (of course) not happen, even if the code has errors.
Hi Jens, could you please try this patch?

The endless loops no longer occur.

Parsing takes a little more than 28 minutes, that's very long, but I get more than 1 million tokens.
One problem is the amount of memory used (more than 2 GB).
And it seems parsing is not correct, I get some strange tokens in symbol browser (see the attached picture).
I am currently uploading a 7z file that contains the tokens tree and the other lists from the CC debug tool.
That will take some time (just ISDN here), because it's about 18 MB (105 MB tokens tree).

Another issue (not related to this) is that the symbols browser stays empty if the project is opened without any editor shown.
After opening any file, the symbols browser gets filled.

« Last Edit: October 03, 2010, 11:05:08 pm by jens »

Offline Jenna

  • Administrator
  • Lives here!
  • *****
  • Posts: 7252
Re: The 25 september 2010 build (6634) CODECOMPLETION BRANCH version is out.
« Reply #40 on: October 03, 2010, 11:51:36 pm »
cc debug-info: http://apt.jenslody.de/cc/cc_debug.7z

As written before, the test-project is generated from the (debian) linux-kernel sources, revision 2.6.29.
No build-options or defines, just all files from sources added to an empty project.

Offline Loaden

  • Lives here!
  • ****
  • Posts: 1014
Re: The 25 september 2010 build (6634) CODECOMPLETION BRANCH version is out.
« Reply #41 on: October 04, 2010, 12:00:51 am »
cc debug-info: http://apt.jenslody.de/cc/cc_debug.7z

As written before, the test-project is generated from the (debian) linux-kernel sources, revision 2.6.29.
No build-options or defines, just all files from sources added to an empty project.
Thanks, Jens! I will look into it carefully.

Offline Loaden

  • Lives here!
  • ****
  • Posts: 1014
Re: The 25 september 2010 build (6634) CODECOMPLETION BRANCH version is out.
« Reply #42 on: October 04, 2010, 02:44:18 am »
Another issue (not related to this) is that the symbols browser stays empty if the project is opened without any editor shown.
After opening any file, the symbols browser gets filled.
Confirmed!

EDIT: I can confirm this only on Linux; Windows does not have this issue.
Here:
Quote
bool Parser::Done()
{
    wxCriticalSectionLocker locker(s_ParserCritical);
    bool done =    m_UpFrontHeaders.IsEmpty()
                && m_SystemUpFrontHeaders.IsEmpty()
                && m_BatchParseFiles.IsEmpty()
                && m_PredefinedMacros.IsEmpty()
                && !m_NeedMarkFileAsLocal // When parsing ends, this value is always true on Linux/GCC 4.5.1!
                && m_PoolTask.empty()
                && m_Pool.Done();
    return done;
}
« Last Edit: October 04, 2010, 05:24:23 am by Loaden »

Offline Loaden

  • Lives here!
  • ****
  • Posts: 1014
Re: The 25 september 2010 build (6634) CODECOMPLETION BRANCH version is out.
« Reply #43 on: October 04, 2010, 03:03:21 am »
Answers to these questions:
Parsing takes a little more than 28 minutes, that's very long, but I get more than 1 million tokens.
1. The batch parsing time is too long:
The Linux source code you supplied contains a lot of code snippets with grammar errors, like
Code
#define AAA(x) BBB(x)
#define BBB(y) AAA(y)

#if AAA(1) && BBB(2)
void fly();
#else
void good();
#endif
Such code increases the loop count (by at least a factor of 100 compared to normal code), so the performance is bad.

And it seems parsing is not correct, I get some strange tokens in symbol browser (see the attached picture).
2. Parsing errors, wrong tokens:
This is again due to the heavily intermixed preprocessor code; our CC only does a pattern match, like

"AAA BBB();" ---> BBB is a function.
"AAA BBB;"  ---> BBB is a variable.

Thus, CC is not smart enough to detect and handle grammar errors. (The only way would be to run a full preprocessor before parsing.)
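
As a rough illustration of what "pattern match" means here, the sketch below classifies a declaration only by the shape of its tokens. It is a simplified stand-in, not the real CC parser; `Tokenize`, `IsWord`, `ClassifyByShape` and `DeclKind` are made-up names for this example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

enum class DeclKind { Function, Variable, Unknown };

// Splits text into identifiers and single punctuation characters.
std::vector<std::string> Tokenize(const std::string& text)
{
    std::vector<std::string> tokens;
    std::string word;
    for (unsigned char c : text)
    {
        if (std::isalnum(c) || c == '_')
        {
            word += static_cast<char>(c);
            continue;
        }
        if (!word.empty())
        {
            tokens.push_back(word);
            word.clear();
        }
        if (!std::isspace(c))
            tokens.push_back(std::string(1, static_cast<char>(c)));
    }
    if (!word.empty())
        tokens.push_back(word);
    return tokens;
}

// True if the token starts like an identifier.
bool IsWord(const std::string& t)
{
    return !t.empty() && (std::isalpha(static_cast<unsigned char>(t[0])) || t[0] == '_');
}

// Classifies purely by shape: "<word> <word> ( ... ) ;" or "<word> <word> ;".
DeclKind ClassifyByShape(const std::string& decl, std::string* name)
{
    assert(name != nullptr);  // caller provides storage for the declared name
    const std::vector<std::string> t = Tokenize(decl);
    const std::size_t n = t.size();
    if (n < 3 || t[n - 1] != ";" || !IsWord(t[0]) || !IsWord(t[1]))
        return DeclKind::Unknown;
    if (n == 3)
    {
        *name = t[1];
        return DeclKind::Variable;
    }
    if (t[2] == "(" && t[n - 2] == ")")
    {
        *name = t[1];
        return DeclKind::Function;
    }
    return DeclKind::Unknown;
}
```

Because only the shape is checked, a line left over from a bad macro replacement such as `printk printf;` is still reported as a variable named `printf`. That may be one way the strange tokens end up in the symbols browser.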

Offline Jenna

  • Administrator
  • Lives here!
  • *****
  • Posts: 7252
Re: The 25 september 2010 build (6634) CODECOMPLETION BRANCH version is out.
« Reply #44 on: October 04, 2010, 07:36:39 am »
This sounds reasonable.
Nevertheless, there are tokens that contain (or even begin with) whitespace; that means these tokens are invalid, so they could (should?) be safely removed.
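
A possible shape for such a filter, as a sketch only: `IsValidTokenName` and `RemoveInvalidTokens` are hypothetical helpers working on plain strings, not on the real token tree. It may also be too strict if some legitimate names are stored with a space in them (operator overloads, perhaps), so that would need checking first.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// A name is treated as valid if it is non-empty and contains no whitespace.
bool IsValidTokenName(const std::string& name)
{
    if (name.empty())
        return false;
    return std::none_of(name.begin(), name.end(),
                        [](unsigned char c) { return std::isspace(c) != 0; });
}

// Removes invalid names in place and returns how many were dropped.
std::size_t RemoveInvalidTokens(std::vector<std::string>& names)
{
    const std::size_t before = names.size();
    names.erase(std::remove_if(names.begin(), names.end(),
                               [](const std::string& n) { return !IsValidTokenName(n); }),
                names.end());
    assert(names.size() <= before);  // filtering can only shrink the list
    return before - names.size();
}
```

The order of the remaining names is preserved, so a check like this could run once after batch parsing without disturbing anything that depends on token order.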