Several improvements to Code Completion plugin

Developer forums (C::B DEVELOPMENT STRICTLY!) > Development

<< < (21/29) > >>

ollydbg:

--- Quote from: Huki on June 23, 2014, 07:05:15 am ---Hmm, the first patch is needed IMO (i.e, "When a virtual target is selected, collect the defines for all child targets."). In the CB project for example, check the parts covered by SCI_LEXER macro in ScintillaBase.cxx (sdk\wxscintilla\src\scintilla\src\ScintillaBase.cxx). With the "All" virtual target, SCI_LEXER won't be defined, so the code covered by it won't be parsed by CC. I think when a virtual target is chosen, it is better to collect the defines set for the virtual target, and also all the defines set for its child targets.

--- End quote ---
OK, I commit the first patch in r9824. Thanks.

--- Quote ---The second patch is more of a performance concern. Most projects have a Debug and Release target (where a DEBUG macro is defined for the Debug target). In this case when the target is changed, the user expects that the DEBUG code is enabled or disabled accordingly, which means the full project needs to be reparsed with the new macro state. Personally I don't mind the time taken for full project reparse, so I'd suggest committing the second patch too. :)

--- End quote ---
I even thought that we can switch the macro definition set when parsing different targets' source files, not sure this is doable. I do care about the performance issue. When developing C::B, I ways switch the build target from time to time. (I use thread search plugin, I need the search only limited to the current active target). So, I personally don't like the idea that switching the target cause a reparse. But I would like to see all the comments about this.

Huki:

--- Quote from: ollydbg on June 23, 2014, 04:19:02 pm ---
--- Quote from: Huki on June 23, 2014, 07:05:15 am ---The second patch is more of a performance concern. Most projects have a Debug and Release target (where a DEBUG macro is defined for the Debug target). In this case when the target is changed, the user expects that the DEBUG code is enabled or disabled accordingly, which means the full project needs to be reparsed with the new macro state. Personally I don't mind the time taken for full project reparse, so I'd suggest committing the second patch too. :)

--- End quote ---
I even thought that we can switch the macro definition set when parsing different targets' source files, not sure this is doable. I do care about the performance issue. When developing C::B, I ways switch the build target from time to time. (I use thread search plugin, I need the search only limited to the current active target). So, I personally don't like the idea that switching the target cause a reparse. But I would like to see all the comments about this.

--- End quote ---
I understand about that.. I'm going to keep this feature aside for now, we'll see if we are able to find a better solution in the future.

Btw, I think you remember our fix for reliable UngetToken() behavior (see here for the original discussion). To recap, in PeekToken(): we are checking the value of m_IsReplaceParsing before and after DoGetToken() to see if a ReplaceBufferText() was done, and if it was done, we should reset the savedTokenIndex, etc.

Now, this way of checking m_IsReplaceParsing can fail in some cases:
1) We are already in replace mode (eg, m_IsReplaceParsing == 1) before DoGetToken(), and inside DoGetToken() m_IsReplaceParsing is reset to zero and a new macro replacement is started (once again m_IsReplaceParsing == 1). No change will be detected and we will fail to reset the savedToken position.
2) A preprocessor condition is hit in SkipUnwanted() and we run CalcConditionExpression(). Now, CalcConditionExpression() will not only expand the macro, but also parse the entire expanded macro and compute the result. By the time we exit SkipUnwanted(), m_IsReplaceParsing is still non-zero, but we are already at the end of the expanded macro. Now, the DoGetToken() will reset m_IsReplaceParsing to zero, and once again no change will be detected. i.e.,

--- Code: --- savedReplaceParsing = m_IsReplaceParsing; // m_IsReplaceParsing == 0

if (SkipUnwanted()) // ReplaceBufferText() was run and expanded macro parsed, m_IsReplaceParsing == 1
DoGetToken(); // out of replace buffer, m_IsReplaceParsing reset to 0

// check if ReplaceBufferText() was run
if (savedReplaceParsing < m_IsReplaceParsing) // condition failed
[...]

--- End code ---

Yesterday I was unlucky enough to hit both these cases 1) and 2) in the same code, so we definitely need some improvement.. ;)

I think the proper way of doing it is to make the saved___ variables class members of Tokenizer, so they can be accessed from ReplaceBufferText() and reset when required (just like we are resetting the m_Undo___ variables). We also need to reset Undo and Saved positions to the proper place after CalcConditionExpression() (because as I've said before it's not only expanding the macro but also parsing it fully, so we need to reset right after the preprocessor line has ended).

I'll post the patch with these changes and some test cases below.

Huki:
I'm attaching the patch below: (git patch against very recently synced master)
- Moved local variables saved___ to Tokenizer class, m_Saved___.
- Reset m_Undo and m_Saved vars to proper position in ReplaceBufferText().
- Reset m_Undo and m_Saved vars to proper position in CalcConditionExpression().
- Added missing line in GetPreprocessorType().
- Reverted your changes in UnGetToken() and left a note.

Now for some test cases:
1) When ReplaceBufferText() is done in DoGetToken(), (the usual case):

--- Code: ---Our code:
#define MYMACRO X=0
int X;
MYMACRO;
X = 1;

PeekToken() Begin:
   int X;MYMACRO;X=1;
Token: _^
Undo : ^
Peek :

Update m_SavedTokenIndex and run DoGetToken():
   int X;MYMACRO;X=1;
Token: _ ^
Save : ^
Undo : ^
Peek :

ReplaceBufferText() and reset m_TokenIndex:
   int X;MYMAX=0;X=1;
Token: _ ^
Save : ^
Undo : ^
Peek :

(Added by patch) Reset m_Undo and m_Save position:
   int X;MYMAX=0;X=1;
Token: _ ^
Save : ^
Undo : ^
Peek :

Get the next token:
(Note that the string will be saved in m_PeekToken,
m_Token will be untouched)
   int X;MYMAX=0;X=1;
Token: _ ^
Save : ^
Undo : ^
Peek : _

PeekToken() End: update m_PeekTokenIndex and restore m_TokenIndex from m_SaveTokenIndex:
   int X;MYMAX=0;X=1;
Token: _ ^
Undo : ^
Peek : _^

Now, user calls UnGetToken(): (m_PeekAvailable becomes true)
(Note that even though m_TokenIndex == m_UndoTokenIndex, we should allow UnGetToken()
to run anyway. Of course, we should only call it once after each GetToken())
   int X;MYMAX=0;X=1;
Token: ^
Undo : ^
Peek : _ ^

GetToken() again: (take from peek and reset m_PeekAvailable to false)
   int X;MYMAX=0;X=1;
Token: _ ^
Undo : ^
Peek :

--- End code ---

2) When CalcConditionExpression() is called in SkipUnwanted() immediately before DoGetToken():
Note that I have modified the example code to remove the pointer (it was originally "typedef RGBQUAD *LPRGBQUAD;"). But I don't think the current CC typedef support works well with pointers (and I'm yet to submit my patch for it).

2a) A simple case when there is no macro replacement involved:

--- Code: --- // just some code :)
  typedef struct tagRGBQUAD {
   char rgbBlue;
   char rgbGreen;
   char rgbRed;
   char rgbReserved;
  } RGBQUAD;

#if 1
  typedef RGBQUAD LPRGBQUAD;
#endif

PeekToken() Begin:
   RGBQUAD;#if 1 typedef RGBQUAD LPRGBQUAD;
Token _^
Undo ^
Peek

PeekToken(): SkipUnwanted():
Note: It will run CalcConditionExpression(), evaluate the result to
true and skip to the first character after the condition expression.
   RGBQUAD;#if 1 typedef RGBQUAD LPRGBQUAD;
Token _ ^
Save ^
Undo ^
Peek

PeekToken(): SkipUnwanted(): CalcConditionExpression() End:
We reset Save and Undo:
   RGBQUAD;#if 1 typedef RGBQUAD LPRGBQUAD;
Token _ ^
Save ^
Undo ^
Peek

PeekToken(): DoGetToken():
   RGBQUAD;#if 1 typedef RGBQUAD LPRGBQUAD;
Token _ ^
Save ^
Undo ^
Peek _______

PeekToken() End: Reset Token from Save:
   RGBQUAD;#if 1 typedef RGBQUAD LPRGBQUAD;
Token _ ^
Undo ^
Peek _______^

Now user runs UnGetToken(): m_PeekAvailable becomes true:
Note again as in our first example, TokenIndex == UndoIndex,
but we should run UnGetToken() anyway.
   RGBQUAD;#if 1 typedef RGBQUAD LPRGBQUAD;
Token ^
Undo ^
Peek _ ^

GetToken() again: Use peek and reset m_PeekAvailable:
   RGBQUAD;#if 1 typedef RGBQUAD LPRGBQUAD;
Token _ ^
Undo ^
Peek

--- End code ---

Of course, in this simple case there was no macro replacement, so although resetting the undo and save tokens was good, it was not really needed for proper parsing. In case of macro replacement it becomes necessary.

Huki:
2b) A (lot) more complex case of previous example, taken from the mingw-builds winapi code which gave me trouble yesterday:
Our code:

--- Code: ---#define WINAPI_PARTITION_DESKTOP 0x1
#define WINAPI_PARTITION_APP 0x2

#define WINAPI_FAMILY_APP WINAPI_PARTITION_APP
#define WINAPI_FAMILY_DESKTOP_APP (WINAPI_PARTITION_DESKTOP \
| WINAPI_PARTITION_APP)

/* WINAPI_FAMILY can be either desktop + App, or App. */
#ifndef WINAPI_FAMILY
#define WINAPI_FAMILY WINAPI_FAMILY_DESKTOP_APP
#endif

#define WINAPI_FAMILY_PARTITION(v) ((WINAPI_FAMILY & v) == v)
#define WINAPI_FAMILY_ONE_PARTITION(vset, v) ((WINAPI_FAMILY & vset) == v)

typedef struct tagRGBQUAD {
char rgbBlue;
char rgbGreen;
char rgbRed;
char rgbReserved;
} RGBQUAD;

#if WINAPI_FAMILY_PARTITION(WINAPI_PARTITION_DESKTOP)
typedef RGBQUAD LPRGBQUAD; // removed *
#endif

--- End code ---

Parsing:

--- Code: ---PeekToken() Begin:
RGBQUAD;#if WINAPI_FAMILY_PARTITION(WINAPI_PARTITION_DESKTOP)typedef RGBQUAD LPRGBQUAD;
Token _^
Undo ^
Peek

PeekToken(): SkipUnwanted(): GetPreprocessorType():
RGBQUAD;#if WINAPI_FAMILY_PARTITION(WINAPI_PARTITION_DESKTOP)typedef RGBQUAD LPRGBQUAD;
Token _ ^
Save ^
Undo ^
Peek

PeekToken(): SkipUnwanted(): CalcConditionExpression():
Note: This will expand the macro using ReplaceBufferText() which will
already reset Undo and Save positions. But is it enough?
RGBQUAD;#if WINAPI_FAMILY_PARTITION(WINAPI_PARTITION_DESKTOP)typedef RGBQUAD LPRGBQUAD;
Token _ ^
Save ^
Undo ^
Peek
((WINAPI_FAMILY & WINAPI_PARTITION_DESKTOP) == WINAPI_PARTITION_DESKTOP)typedef RGBQUAD LPRGBQUAD;
Token ^ ;
Save ^
Undo ^
Peek
WINAPI_FAMILY_DESKTOP_APP & WINAPI_PARTITION_DESKTOP) == WINAPI_PARTITION_DESKTOP)typedef RGBQUAD LPRGBQUAD;
Token ^ ;
Save ^
Undo ^
Peek
(WINAPI_PARTITION_DESKTOP|WINAPI_PARTITION_APP) & WINAPI_PARTITION_DESKTOP) == WINAPI_PARTITION_DESKTOP)typedef RGBQUAD LPRGBQUAD;
Token ^ ;
Save ^
Undo ^
Peek

PeekToken(): SkipUnwanted(): CalcConditionExpression():
Now the entire expanded macro will be parsed, so we are left at the end of it:
(WINAPI_PARTITION_DESKTOP|WINAPI_PARTITION_APP) & WINAPI_PARTITION_DESKTOP) == WINAPI_PARTITION_DESKTOP)typedef RGBQUAD LPRGBQUAD;
Token ; ^
Save ^
Undo ^
Peek

PeekToken(): SkipUnwanted(): CalcConditionExpression(): End:
The condition macro has lost its context (we lost the #if),
so we must reset undo and save to the end of the macro statement.
(WINAPI_PARTITION_DESKTOP|WINAPI_PARTITION_APP) & WINAPI_PARTITION_DESKTOP) == WINAPI_PARTITION_DESKTOP)typedef RGBQUAD LPRGBQUAD;
Token ; ^
Save ^
Undo ^
Peek

PeekToken(): DoGetToken():
(WINAPI_PARTITION_DESKTOP|WINAPI_PARTITION_APP) & WINAPI_PARTITION_DESKTOP) == WINAPI_PARTITION_DESKTOP)typedef RGBQUAD LPRGBQUAD;
Token ; ^
Save ^
Undo ^
Peek _______

PeekToken(): End: Reset Token from Save:
Note that the m_Token string is untouched, it still
contain semi-colon which we started with:
(WINAPI_PARTITION_DESKTOP|WINAPI_PARTITION_APP) & WINAPI_PARTITION_DESKTOP) == WINAPI_PARTITION_DESKTOP)typedef RGBQUAD LPRGBQUAD;
Token ; ^
Undo ^
Peek _______^

Now user runs UnGetToken(): m_PeekAvailable becomes true:
(WINAPI_PARTITION_DESKTOP|WINAPI_PARTITION_APP) & WINAPI_PARTITION_DESKTOP) == WINAPI_PARTITION_DESKTOP)typedef RGBQUAD LPRGBQUAD;
Token ^
Undo ^
Peek ; ^

GetToken() again: Take from peek:
(WINAPI_PARTITION_DESKTOP|WINAPI_PARTITION_APP) & WINAPI_PARTITION_DESKTOP) == WINAPI_PARTITION_DESKTOP)typedef RGBQUAD LPRGBQUAD;
Token ; ^
Undo ^
Peek

--- End code ---

ollydbg:
Hi, Huki, thanks for the detailed explanation. I understand the issue. Indeed, if the m_Buffer is modified when macro replacement happens(it can either happend in DoGetToken() call or SkipUnwanted() call), both the "old token index"(m_SavedTokenIndex) and "undo token index"(m_UndoTokenIndex) will becomes invalid. A simple example is that if we have a macro replacement rule:

--- Code: ---AAA->BBBBBBBBBBBBBBBBBBBBBBBBBBBBB
--- End code ---

After the replacement, the BBBBBBBBBBBBBBBBBBBBBBBBBBBBB can overwrite the m_Buffer[m_SavedTokenIndex], and m_Buffer[m_UndoTokenIndex], so both m_SavedTokenIndex and m_UndoTokenIndex should be reset. Since m_Token is still valid, we can still safely call void Tokenizer::UngetToken() once. In-fact, UngetToken() won't move any Indexs if they are reset.

I will commit your patch soon, thanks.

EDIT committed in r9834

Navigation

[0] Message Index

[#] Next page

[*] Previous page

Go to full version