The problem is located in this function in Tokenizer.cpp:
bool Tokenizer::SkipToCharBreak()
{
    if (PreviousChar() != '\\')
        return true;
    else
    {
        // check for "\\"
        if (   ((m_TokenIndex - 2) >= 0)
            && ((m_TokenIndex - 2) <= m_BufferLen)
            && (m_Buffer.GetChar(m_TokenIndex - 2) == '\\') )
            return true;
    }
    return false;
}
The main task of this function is to check whether the current character is part of a C escape sequence.
For example, in the statement below:
There are three quotes in the statement above. The first and third quotes are the start and the end of a C string literal. The second quote follows a backslash (\), so it is part of an escape sequence.
The parser starts searching from the first quote and tries to find the end of the C string. When it reaches the second quote, this function checks whether that quote is the end of the C string.
But this function mistakenly regards the second quote as the end of the C string.
Also, this function checks only a couple of characters before the second quote, and that is not enough. In this statement, the first two backslashes (\\) form one escape sequence, and the third and fourth (\\) form another.
So, how about parsing this statement?
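I think we should search leftward from the second quote and count the backslashes until we reach the first non-backslash character, then check whether the count is odd or even. An odd count means the quote is escaped; an even count means the backslashes pair up among themselves and the quote really ends the string.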
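Here is a rough sketch of the counting idea, just to make it concrete. It is a standalone helper on a std::string, not the real Tokenizer member; the name IsEscaped and its parameters are made up for illustration, and I am assuming pos is the index of the quote being tested.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Tokenizer.cpp).
// Returns true if the character at `pos` is escaped, i.e. it is preceded
// by an odd number of consecutive backslashes.
bool IsEscaped(const std::string& buffer, std::size_t pos)
{
    if (pos > buffer.size())
        return false; // out of range: nothing to look at

    std::size_t backslashes = 0;
    // Walk left from the character just before `pos` while we keep
    // seeing backslashes; stop at the start of the buffer.
    while (pos > backslashes && buffer[pos - 1 - backslashes] == '\\')
        ++backslashes;

    // Backslashes pair up as "\\" escapes; one left over escapes `pos`.
    return (backslashes % 2) == 1;
}
```

In the Tokenizer the same loop could presumably walk m_Buffer backward from m_TokenIndex, and SkipToCharBreak() would return true only when the count is even. I have not tried that against the real class, so treat it as an idea rather than a patch.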
Any comments?