The 30 December 2018 build (11543) is out.
ollydbg:
--- Quote from: oBFusCATed on January 03, 2019, 03:42:30 pm ---Here is patch which fixes one of the issues with this function....
...
I won't make any changes, because I don't understand the code here. @ollydbg would you be able to fix the issue now that we know what is causing it?
--- End quote ---
The KMP search code was added a long time ago by Loaden. It is used to locate a macro parameter inside the macro body when we do macro expansion.
For example: If we have:
--- Code: ---#define M(a,b) 2*a+b
--- End code ---
Then we have a usage like:
--- Code: ---M(x,y)
--- End code ---
We have to search for "a" in the string "2*a+b" and replace it with the actual argument "x", so it becomes "2*x+b".
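To make the substitution step concrete, here is a minimal sketch in plain C++. It is not the Code::Blocks implementation: it uses std::string::find in place of the KMP search, and the names SubstituteParam and IsIdentChar are made up for illustration. It only replaces whole-word matches, so a parameter "a" is left alone inside a longer identifier such as "ab".

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not from Code::Blocks): true for characters that can
// appear inside a C/C++ identifier.
static bool IsIdentChar(char ch)
{
    return ch == '_' || std::isalnum(static_cast<unsigned char>(ch)) != 0;
}

// Hypothetical helper: replace every whole-word occurrence of `param`
// in the macro body `body` with the actual argument `arg`.
std::string SubstituteParam(const std::string& body,
                            const std::string& param,
                            const std::string& arg)
{
    if (param.empty())
        return body;

    std::string result;
    std::size_t pos = 0;
    while (pos < body.size())
    {
        const std::size_t hit = body.find(param, pos);
        if (hit == std::string::npos)
            break;

        const std::size_t end = hit + param.size();
        const bool boundaryBefore = (hit == 0) || !IsIdentChar(body[hit - 1]);
        const bool boundaryAfter  = (end == body.size()) || !IsIdentChar(body[end]);

        if (boundaryBefore && boundaryAfter)
        {
            result.append(body, pos, hit - pos); // text before the match
            result += arg;                       // the actual argument
            pos = end;
        }
        else
        {
            // The hit is part of a longer identifier: copy up to and
            // including its first character, then search again after it.
            result.append(body, pos, hit - pos + 1);
            pos = hit + 1;
        }
    }
    result.append(body, pos, std::string::npos); // remaining tail
    return result;
}
```

With the macro above, SubstituteParam("2*a+b", "a", "x") gives "2*x+b". A real expansion has to handle all parameters at once, which this single-parameter sketch does not attempt.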
I hope to spend some time on this issue this weekend. Thanks.
ollydbg:
--- Quote from: BlueHazzard on January 03, 2019, 02:22:40 pm ---funny:
--- Code: ---> info locals
next = <error reading variable next (Cannot access memory at address 0x0)>
index = 0
i = 0
j = 0
--- End code ---
--- End quote ---
I just built wx3.1.1 with MinGW-W64's 64-bit gcc 8.1 compiler (mingw-builds/8.1.0/threads-posix/seh/x86_64-8.1.0-release-posix-seh-rt_v6-rev0.7z). The resulting log-release.txt is more than 20M, because the command below produces so many warnings.
--- Code: ---mingw32-make -f makefile.gcc USE_XRC=1 SHARED=1 MONOLITHIC=1 BUILD=release UNICODE=1 USE_OPENGL=1 VENDOR=cb CXXFLAGS="-Wno-unused-local-typedefs -Wno-deprecated-declarations -fno-keep-inline-dllexport" >log-release.txt 2>&1
--- End code ---
Also, I had to apply this patch: Fix invalid cast in wxMSW AutoHANDLE::InvalidHandle() · wxWidgets/wxWidgets@424f64f; otherwise the build fails.
Then I built C::B, and I see the same crash.
This means the "int next[patternLen];" array is corrupt. :(
ollydbg:
Run the CCTest project on the file "cc_macro_expansion_stringize.cpp"
(Note: you need to copy this source file and rename the copy to ccc_macro_expansion_stringize.cpp, so that CCTest only runs the test for this single file.)
Set two breakpoints here:
--- Code: ---int Tokenizer::KMP_Find(const wxChar* text, const wxChar* pattern, const int patternLen)
{
    if (!text || !pattern || pattern[0] == _T('\0') || text[0] == _T('\0'))
        return -1;
    if (patternLen > 1024)
    {
        if (patternLen < 5012)
            TRACE(_T("KMP_Find() : %s - %s"), text, pattern);
        else
        {
            TRACE(_T("KMP_Find: The plan buffer is too big, %d"), patternLen);
            return -2;
        }
    }
    int next[patternLen];
    KMP_GetNextVal(pattern, next); //bp1
    int index = 0, i = 0, j = 0; //bp2
--- End code ---
These are the values before we call the function KMP_GetNextVal:
--- Code: ---> info locals
[debug]> info locals
[debug]next = {2283936, 0, 1875784749, 0}
[debug]index = 0
[debug]i = 14
[debug]j = 0
[debug]>>>>>>cb_gdb:
next = {2283936, 0, 1875784749, 0}
index = 0
i = 14
j = 0
--- End code ---
And after the function call
--- Code: ---> info locals
[debug]> info locals
[debug]next = <error reading variable next (Cannot access memory at address 0x1)>
[debug]index = 0
[debug]i = 14
[debug]j = 0
[debug]>>>>>>cb_gdb:
next = <error reading variable next (Cannot access memory at address 0x1)>
index = 0
i = 14
j = 0
--- End code ---
This means KMP_GetNextVal has a bug: the next array becomes unreadable right after the call.
Note: the "next" array is the lps array described in
https://www.geeksforgeeks.org/kmp-algorithm-for-pattern-searching/
or
KMP Algorithm | Searching for Patterns | GeeksforGeeks - YouTube
EDIT1:
For the pattern "text", when I step into the function:
--- Code: ---void Tokenizer::KMP_GetNextVal(const wxChar* pattern, int next[])
{
    int j = 0, k = -1;
    next[0] = -1;
    while (pattern[j] != _T('\0'))
    {
        if (k == -1 || pattern[j] == pattern[k])
        {
            ++j;
            ++k;
            if (pattern[j] != pattern[k])
                next[j] = k; // error
            else
                next[j] = next[k];
        }
        else
            k = next[k];
    }
}
--- End code ---
I do see that at the line marked "// error", j=4, which means next[j] is beyond the end of the next array (since the array only has four elements, next[0] to next[3]). :(
But I still need some time to see how the KMP algorithm works.
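The out-of-range write can be reproduced outside the debugger. The sketch below keeps the same control flow as the KMP_GetNextVal shown above, but writes into a std::vector with one extra slot and reports the highest index that was written. The function name HighestIndexWritten and the use of std::wstring instead of wxChar are assumptions made only for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Illustration only: same loop as KMP_GetNextVal, instrumented to report
// the highest index of `next` that gets written.
std::size_t HighestIndexWritten(const std::wstring& pattern)
{
    // One extra slot so that this sketch itself stays in bounds.
    std::vector<int> next(pattern.size() + 1, 0);
    const wchar_t* p = pattern.c_str(); // null-terminated, like the wxChar* input
    std::size_t highest = 0;

    int j = 0, k = -1;
    next[0] = -1;
    while (p[j] != L'\0')
    {
        if (k == -1 || p[j] == p[k])
        {
            ++j;
            ++k;
            // j may now be equal to the pattern length: p[j] is the
            // terminator, and the write below targets next[patternLen].
            if (p[j] != p[k])
                next[j] = k;
            else
                next[j] = next[k];
            if (static_cast<std::size_t>(j) > highest)
                highest = static_cast<std::size_t>(j);
        }
        else
            k = next[k];
    }
    return highest;
}
```

For the pattern "text" this returns 4, which matches the j=4 seen in the debugger: the loop only stops once j has reached the terminator, and by then next[j] has already been written.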
EDIT2:
To simplify the issue, you only need to debug this function:
--- Code: ---int Tokenizer::GetFirstTokenPosition(const wxChar* buffer, const size_t bufferLen,
                                     const wxChar* key, const size_t keyLen)
{
    int pos = -1;
    wxChar* p = const_cast<wxChar*>(buffer);
    const wxChar* endBuffer = buffer + bufferLen;
    for (;;)
    {
        const int ret = KMP_Find(p, key, keyLen);
        if (ret == -1)
            break;
        // check previous char
        p += ret;
        if (p > buffer)
        {
            const wxChar ch = *(p - 1);
            if (ch == _T('_') || wxIsalnum(ch))
            {
                p += keyLen;
                continue;
            }
        }
        // check next char
        p += keyLen;
        if (p < endBuffer)
        {
            const wxChar ch = *p;
            if (ch == _T('_') || wxIsalnum(ch))
                continue;
        }
        // got it
        pos = p - buffer - keyLen;
        break;
    }
    return pos;
}
--- End code ---
Here the arguments are:
--- Code: ---[debug]> info args
[debug]this = 0x41d4200
[debug]buffer = 0x41d7bf0 L"text ## line"
[debug]bufferLen = 12
[debug]key = 0x41d7868 L"text"
[debug]keyLen = 4
[debug]>>>>>>cb_gdb:
this = 0x41d4200
buffer = 0x41d7bf0 L"text ## line"
bufferLen = 12
key = 0x41d7868 L"text"
keyLen = 4
--- End code ---
Miguel Gimenez:
You can check this code adapted from your link:
--- Code: ---void Tokenizer::KMP_GetNextVal(const wxChar* pattern, int next[])
{
    int len = 0;
    next[0] = 0; // CB code uses -1
    int i = 1;
    while (pattern[i] != _T('\0'))
    {
        if (pattern[i] == pattern[len])
        {
            len++;
            next[i] = len;
            i++;
        }
        else
        {
            if (len)
            {
                len = next[len - 1];
            }
            else
            {
                next[i] = 0;
                i++;
            }
        }
    }
}
--- End code ---
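Because this table starts with next[0] = 0 (the lps convention) instead of -1, the search loop has to be written for the same convention. The following is a self-contained sketch of a matching search, using std::wstring and std::vector rather than wxChar and a stack array. BuildLps and KmpFind are illustrative names, and this is not the existing KMP_Find from Code::Blocks.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// lps[i] is the length of the longest proper prefix of pattern[0..i]
// that is also a suffix of pattern[0..i].
std::vector<int> BuildLps(const std::wstring& pattern)
{
    std::vector<int> lps(pattern.size(), 0);
    int len = 0;
    std::size_t i = 1;
    while (i < pattern.size())
    {
        if (pattern[i] == pattern[len])
            lps[i++] = ++len;
        else if (len != 0)
            len = lps[len - 1]; // fall back to a shorter prefix, keep i
        else
            lps[i++] = 0;
    }
    return lps;
}

// Returns the index of the first occurrence of pattern in text, or -1.
int KmpFind(const std::wstring& text, const std::wstring& pattern)
{
    if (text.empty() || pattern.empty())
        return -1;

    const std::vector<int> lps = BuildLps(pattern);
    std::size_t i = 0; // position in text
    std::size_t j = 0; // position in pattern
    while (i < text.size())
    {
        if (text[i] == pattern[j])
        {
            ++i;
            ++j;
            if (j == pattern.size())
                return static_cast<int>(i - j); // full match
        }
        else if (j != 0)
            j = static_cast<std::size_t>(lps[j - 1]); // reuse matched prefix
        else
            ++i;
    }
    return -1;
}
```

With the arguments from the debug session, KmpFind(L"text ## line", L"text") returns 0. The table never needs an entry at index pattern.size(), so it stays within pattern.size() elements.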
Also, I would try using a std::vector or new[] instead of a dynamic array for next.
oBFusCATed:
--- Quote from: Miguel Gimenez on January 05, 2019, 11:46:45 am ---Also, I would try using a std::vector or new[] instead of a dynamic array for next.
--- End quote ---
And you'll make the parser slow... There is already a known limit on the number of elements in this array. It can be used. But this won't solve the problem.
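One possible reading of "a known limit" is the patternLen < 5012 check in KMP_Find: a fixed-size stack array of that size avoids both the heap allocation and the variable-length array. As noted, the size of the buffer alone does not remove the stray write, so the sketch below pairs it with a table builder that takes a capacity and never writes at or past it. The names kMaxPatternLen, FillNextBounded and BuildTableOnStack are hypothetical, and wchar_t stands in for wxChar.

```cpp
#include <cstddef>

// Assumption: mirrors the "patternLen < 5012" check in the KMP_Find above.
const std::size_t kMaxPatternLen = 5012;

// Hypothetical bounded variant of the table builder. It uses the same
// recurrence as the original code, but skips the write at index
// patternLen and refuses to write at or past `capacity`.
// Returns false if the pattern does not fit into `capacity` entries.
bool FillNextBounded(const wchar_t* pattern, int next[], std::size_t capacity)
{
    if (!pattern || !next || capacity == 0)
        return false;

    std::size_t j = 0;
    int k = -1;
    next[0] = -1;
    while (pattern[j] != L'\0')
    {
        if (k == -1 || pattern[j] == pattern[k])
        {
            ++j;
            ++k;
            if (pattern[j] == L'\0')
                break;        // the original code writes next[j] here
            if (j >= capacity)
                return false; // pattern is longer than the table
            next[j] = (pattern[j] != pattern[k]) ? k : next[k];
        }
        else
            k = next[k];
    }
    return true;
}

// Usage sketch: the table lives on the stack with a compile-time size.
bool BuildTableOnStack(const wchar_t* pattern)
{
    int next[kMaxPatternLen];
    return FillNextBounded(pattern, next, kMaxPatternLen);
}
```

Whether the rest of KMP_Find ever reads the entry at index patternLen is not visible in the excerpt above, so this would need checking against the full function before being used.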