Developer forums (C::B DEVELOPMENT STRICTLY!) > CodeCompletion redesign

Macro replacement in CC(tokenizer) suggestion


ollydbg:
Hi. In the current architecture, the function ThisOrReplacement(m_Token) is only called in Tokenizer::GetToken():


--- Code: ---
wxString Tokenizer::GetToken()
{
    m_UndoTokenIndex = m_TokenIndex;
    m_UndoLineNumber = m_LineNumber;
    m_UndoNestLevel  = m_NestLevel;

    if(m_PeekAvailable)
    {
        m_TokenIndex = m_PeekTokenIndex;
        m_LineNumber = m_PeekLineNumber;
        m_NestLevel  = m_PeekNestLevel;
        m_Token   = m_PeekToken;
    }
    else
        m_Token = DoGetToken();

    m_PeekAvailable = false;

    return ThisOrReplacement(m_Token);
}
--- End code ---


To speed up this macro replacement, I think it should be moved into DoGetToken().

Here are the reasons:

1. ThisOrReplacement(m_Token) internally uses a wxString-to-wxString map, so every call runs a search in that map (normally a search in a balanced binary search tree), and this takes a lot of time.

2. We can avoid calling this function in many situations, for example when m_Token is '{', wxEmptyString, or any other string that cannot need macro expansion.
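
A minimal sketch of point 2, written with std::string and std::map in place of wxString and the real replacement table so that it compiles on its own. MayBeMacro and ReplaceIfMacro are hypothetical names, not members of the Tokenizer class. The assumption is that a macro name must start with a letter or an underscore, so punctuation and empty tokens can skip the lookup after a single character test.

```cpp
#include <cctype>
#include <map>
#include <string>

// Stand-in for the tokenizer's replacement table (assumption: the real one
// maps wxString to wxString).
using ReplacementMap = std::map<std::string, std::string>;

// Hypothetical helper: only identifier-like tokens can be macro names.
inline bool MayBeMacro(const std::string& token)
{
    if (token.empty())
        return false;
    const unsigned char first = static_cast<unsigned char>(token[0]);
    return std::isalpha(first) != 0 || first == '_';
}

// Hypothetical helper: returns the replacement if one exists, else the token.
inline std::string ReplaceIfMacro(const ReplacementMap& replacements,
                                  const std::string& token)
{
    if (!MayBeMacro(token))
        return token; // '{', ';', "" and similar tokens are never looked up
    const auto it = replacements.find(token);
    return it != replacements.end() ? it->second : token;
}
```

This uses one cheap test instead of a list of special-case strings, so the cost of the check does not grow with the number of tokens that are excluded.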

Any comments?

Thanks

 

ollydbg:
Also, I suggest that when the Tokenizer returns a token (a wxString), it should be combined with a "type", so that the parser can use this type information for syntax analysis.

If I remember correctly, Ceniza calls this a "smart lexer"  :D
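
A rough sketch of what such a typed token could look like, using std::string instead of wxString so that it stands alone. TokenKind, TypedToken and Classify are hypothetical names, and treating an empty token as end of input is an assumption made only for this example.

```cpp
#include <cctype>
#include <string>

// Hypothetical token categories; a real lexer would likely need more.
enum class TokenKind { Identifier, Number, Punctuation, EndOfInput };

// A token's text together with its category.
struct TypedToken
{
    TokenKind   kind;
    std::string text;
};

// Hypothetical classifier that looks only at the first character.
inline TypedToken Classify(const std::string& text)
{
    if (text.empty())
        return {TokenKind::EndOfInput, text}; // assumption: "" means no more tokens
    const unsigned char first = static_cast<unsigned char>(text[0]);
    if (std::isalpha(first) != 0 || first == '_')
        return {TokenKind::Identifier, text};
    if (std::isdigit(first) != 0)
        return {TokenKind::Number, text};
    return {TokenKind::Punctuation, text};
}
```

With a type attached, the macro lookup would only have to run for tokens whose kind is Identifier.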

thomas:

--- Quote from: ollydbg on June 29, 2009, 04:10:29 am ---internally use a wxString--> wxString map container, so, it will use a search algorithm in this map(normally this will cause a search on a balanced BST), this will take a lot of time.

2, we can avoid many situations to call this function, for example, when m_Token is '{'  or wxEmptyString or many other string that shouldn't need macro expansion.
--- End quote ---
I'd be careful with such an optimisation, since a map rarely needs more than 4-5 comparisons per lookup, so if you add too many special cases, the resulting code will be slower (and at the same time more difficult to maintain).

ollydbg:

--- Quote from: thomas on June 29, 2009, 09:40:37 am ---I'd be careful with such an optimisation, since a map rarely needs to do more than 4-5 lookups in total, so if you add too many special cases, the resulting code will be slower (and at the same time more difficult to maintain).

--- End quote ---
I don't fully understand your comment :(

I mean that if DoGetToken() returns a wxString such as '{', we know that '{' certainly doesn't need macro replacement, so we can avoid calling ThisOrReplacement('{').

Also, there are many wxStrings like '{' :D



thomas:

--- Quote from: ollydbg on June 29, 2009, 09:57:35 am ---
--- Quote from: thomas on June 29, 2009, 09:40:37 am ---I'd be careful with such an optimisation, since a map rarely needs to do more than 4-5 lookups in total, so if you add too many special cases, the resulting code will be slower (and at the same time more difficult to maintain).

--- End quote ---
Not fully understand you comments :(
--- End quote ---
I understood that your idea is to catch special cases which cannot possibly be macros, so they need not be looked up in the map<wxString,wxString>.

In other words, replace code that looks like:
return the_map.find(token);

with something like:
if(token.IsEmpty() || (token == one_constant) || (token == two_constant) || (token == three_constant))
    return token;
else
    return the_map.find(token);


My point is that maps have O(log(n)) lookup, so unless a source has 20 billion preprocessor defines, it is really nothing to worry about. For "normal" amounts, log(n) will be something like 4, maybe 5. Let's assume the worst case of 5. One "operation" is a compare and a branch.

Adding a line like the above will remove 5 operations done by the map lookup in the best case, at a cost of 1-4 additional operations (average 2). So we save 5-1 = 4 operations in the best, and 5-2 = 3 operations in the average case.
In the worst case, it will add 4 operations to the existing 5, almost doubling the work.
This scenario might still be advantageous, but it's not likely that it will be a big win.

Now, you were talking of "many" cases. Let's say "many" means 10. In this case, we will do 1-10 operations (5 average) to eliminate the 5 lookups done by the map.
So, on the average, we replace 5 operations with 5 operations (zero win, but more complicated code), and in the worst case, we add 10 operations, tripling the amount of work done.
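
The estimate above can be checked by counting comparisons rather than guessing. The following is only a sketch under stated assumptions: it uses std::map<std::string, std::string> as a stand-in for the real table, fills it with made-up keys, and all names (CountingLess, ComparisonsForFind, ComparisonsForChain) are hypothetical. It counts key comparisons only; each string comparison has its own cost, which is ignored here.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Comparator that counts how many key comparisons the map performs.
struct CountingLess
{
    std::size_t* count;
    bool operator()(const std::string& a, const std::string& b) const
    {
        ++*count;
        return a < b;
    }
};

// Key comparisons needed by one find() in a map holding `entries` fake macros.
inline std::size_t ComparisonsForFind(std::size_t entries, const std::string& token)
{
    std::size_t count = 0;
    std::map<std::string, std::string, CountingLess> table(CountingLess{&count});
    for (std::size_t i = 0; i < entries; ++i)
        table.emplace("MACRO_" + std::to_string(i), "x");
    count = 0; // ignore the comparisons done while inserting
    table.find(token);
    return count;
}

// Comparisons spent by a chain of special-case tests before the token
// either matches one of them or falls through to the map lookup.
inline std::size_t ComparisonsForChain(const std::vector<std::string>& specials,
                                       const std::string& token)
{
    std::size_t count = 0;
    for (const std::string& special : specials)
    {
        ++count;
        if (token == special)
            break;
    }
    return count;
}
```

For a token that matches none of the special cases, the total work is ComparisonsForChain(...) plus ComparisonsForFind(...), which is the worst case described above.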
