My answers to these questions:
Parsing takes a little more than 28 minutes, that's very long, but I get more than 1 million tokens.
1. The batch parsing time is too long:
The Linux source code you supplied contains a lot of code snippets which have grammar errors, like
#define AAA(x) BBB(x)
#define BBB(y) AAA(y)
#if AAA(1) && BBB(2)
void fly();
#else
void good();
#endif
Code like this makes the parser loop much longer while expanding macros (at least 100 times more than for normal code), so the performance is bad.
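To illustrate why mutually recursive macros such as AAA and BBB are expensive, here is a minimal sketch. It is not CC's actual code: MacroTable, naiveExpandSteps and guardedExpandSteps are hypothetical names, and macros are simplified to "name expands to one other name" with arguments ignored. The naive version only stops at an iteration cap, while the guarded version stops as soon as a macro name repeats.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical model: each macro name maps to the single identifier it expands to.
using MacroTable = std::map<std::string, std::string>;

// Naive expansion: keep replacing until the name is no longer a macro
// or the iteration cap is reached. Returns the number of replacement steps.
int naiveExpandSteps(const MacroTable& macros, std::string name, int maxSteps)
{
    assert(maxSteps >= 0);
    int steps = 0;
    while (steps < maxSteps) {
        auto it = macros.find(name);
        if (it == macros.end())
            break;              // not a macro any more, expansion is finished
        name = it->second;      // replace the name and rescan
        ++steps;
    }
    return steps;
}

// Guarded expansion: remember every macro already expanded and stop
// at the first repeat, so a cycle costs only as many steps as it has members.
int guardedExpandSteps(const MacroTable& macros, std::string name)
{
    std::set<std::string> seen;
    int steps = 0;
    for (;;) {
        auto it = macros.find(name);
        if (it == macros.end() || !seen.insert(name).second)
            break;              // plain identifier, or a macro seen before
        name = it->second;
        ++steps;
    }
    return steps;
}
```

With a table holding AAA -> BBB and BBB -> AAA and a cap of 100, the naive version runs all 100 steps, while a normal macro such as FOO -> bar needs one step. The guarded version stops after two steps on the same cycle.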
And it seems parsing is not correct, I get some strange tokens in symbol browser (see the attached picture).
2. Parsing errors, wrong tokens:
This is again due to the heavily mixed preprocessor code. Our CC only does a format match, like
"AAA BBB();" ---> BBB is a function.
"AAA BBB;" ---> BBB is a variable.
Thus, CC is not smart enough to detect and handle grammar errors. (The only remedy is to run a "full preprocessor" pass before parsing.)
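The "format match" idea can be sketched roughly as follows. This is only an illustration under my own assumptions, not the real CC parser: TokenKind, Match and classifyByShape are hypothetical names, and the two regular expressions cover just the two shapes shown above.

```cpp
#include <regex>
#include <string>

enum class TokenKind { Function, Variable, Unknown };

struct Match {
    TokenKind kind;
    std::string name;
};

// Classify a statement purely by its shape, without knowing what the
// first word really is (a type, a macro, or leftover preprocessor text).
Match classifyByShape(const std::string& stmt)
{
    // "word name(...);"  is taken as a function named "name"
    static const std::regex funcShape(R"(^\s*\w+\s+(\w+)\s*\(.*\)\s*;\s*$)");
    // "word name;"       is taken as a variable named "name"
    static const std::regex varShape(R"(^\s*\w+\s+(\w+)\s*;\s*$)");

    std::smatch m;
    if (std::regex_match(stmt, m, funcShape))
        return {TokenKind::Function, m[1].str()};
    if (std::regex_match(stmt, m, varShape))
        return {TokenKind::Variable, m[1].str()};
    return {TokenKind::Unknown, ""};
}
```

Because only the shape is checked, "AAA BBB();" is reported as a function BBB even when AAA is an unexpanded macro and not a type. That is how strange tokens can end up in the symbol browser.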