The 30 December 2018 build (11543) is out.

ollydbg:

--- Quote from: oBFusCATed on January 03, 2019, 03:42:30 pm ---Here is patch which fixes one of the issues with this function....
...
I won't make any changes, because I don't understand the code here. @ollydbg would you be able to fix the issue now that we know what is causing it?

--- End quote ---

The KMP search code was added a long time ago by Loaden. It tries to locate an argument when we do macro expansion.
For example, if we have:

--- Code: ---#define M(a,b)  2*a+b

--- End code ---
Then we have a usage like:

--- Code: ---M(x,y)

--- End code ---
We have to search for "a" in the string "2*a+b" and replace it with the actual argument "x", so it becomes "2*x+b".
I hope to find some time to look at this issue this weekend. Thanks.
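
To make the substitution step above concrete, here is a minimal sketch. It is not the Code::Blocks implementation: it uses std::wstring::find instead of KMP, and the names IsIdentChar and SubstituteArg are made up for this example. It only replaces whole-word matches, so a parameter "a" does not match inside an identifier such as "ab".

```cpp
#include <cassert>
#include <cstddef>
#include <cwctype>
#include <string>

// True if ch can be part of a C/C++ identifier.
static bool IsIdentChar(wchar_t ch)
{
    return ch == L'_' || std::iswalnum(static_cast<wint_t>(ch)) != 0;
}

// Hypothetical helper: replace every whole-word occurrence of the formal
// parameter `formal` in the macro body `body` with the actual argument.
std::wstring SubstituteArg(const std::wstring& body,
                           const std::wstring& formal,
                           const std::wstring& actual)
{
    if (formal.empty())
        return body;

    std::wstring out;
    std::size_t pos = 0;
    while (pos < body.size())
    {
        const std::size_t hit = body.find(formal, pos);
        if (hit == std::wstring::npos)
            break;

        const std::size_t end = hit + formal.size();
        assert(end <= body.size());

        // The match counts only if it is not glued to other identifier chars.
        const bool leftOk  = (hit == 0) || !IsIdentChar(body[hit - 1]);
        const bool rightOk = (end == body.size()) || !IsIdentChar(body[end]);

        if (leftOk && rightOk)
        {
            out.append(body, pos, hit - pos); // text before the match
            out += actual;                    // the actual argument
            pos = end;
        }
        else
        {
            // Not a whole word: copy up to and including the first matched
            // character and keep searching after it.
            out.append(body, pos, hit + 1 - pos);
            pos = hit + 1;
        }
    }
    out.append(body, pos, std::wstring::npos); // remaining tail
    return out;
}
```

With the macro above, SubstituteArg(L"2*a+b", L"a", L"x") gives L"2*x+b". Note that calling it once per parameter, one after another, is only safe when an actual argument does not contain the name of a later formal parameter; a real expander has to handle that case.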

ollydbg:

--- Quote from: BlueHazzard on January 03, 2019, 02:22:40 pm ---funny:


--- Code: ---> info locals
next = <error reading variable next (Cannot access memory at address 0x0)>
index = 0
i = 0
j = 0
--- End code ---

--- End quote ---

I just built wx3.1.1 with MinGW-W64's 64-bit gcc 8.1 compiler (mingw-builds/8.1.0/threads-posix/seh/x86_64-8.1.0-release-posix-seh-rt_v6-rev0.7z). The resulting log-release.txt is more than 20M, because the command below produces too many warnings.

--- Code: ---mingw32-make -f makefile.gcc USE_XRC=1 SHARED=1 MONOLITHIC=1 BUILD=release UNICODE=1 USE_OPENGL=1 VENDOR=cb CXXFLAGS="-Wno-unused-local-typedefs -Wno-deprecated-declarations -fno-keep-inline-dllexport" >log-release.txt 2>&1

--- End code ---
Also, I had to apply this patch: Fix invalid cast in wxMSW AutoHANDLE::InvalidHandle() · wxWidgets/wxWidgets@424f64f; otherwise the build fails.

Then I built C::B, and I see the same crash.
This means the "int next[patternLen];" array is corrupted.  :(

ollydbg:
Run the CCTest project on the file "cc_macro_expansion_stringize.cpp"
(Note: you need to copy this source file and rename the copy to ccc_macro_expansion_stringize.cpp, so that CCTest only runs the test for this single file.)

Set two breakpoints here:


--- Code: ---int Tokenizer::KMP_Find(const wxChar* text, const wxChar* pattern, const int patternLen)
{
    if (!text || !pattern || pattern[0] == _T('\0') || text[0] == _T('\0'))
        return -1;

    if (patternLen > 1024)
    {
        if (patternLen < 5012)
            TRACE(_T("KMP_Find() : %s - %s"), text, pattern);
        else
        {
            TRACE(_T("KMP_Find: The plan buffer is too big, %d"), patternLen);
            return -2;
        }
    }

    int next[patternLen];
    KMP_GetNextVal(pattern, next);  //bp1

    int index = 0, i = 0, j = 0;    //bp2
--- End code ---
   
These are the values before we call the function KMP_GetNextVal:

--- Code: ---> info locals

[debug]> info locals
[debug]next = {2283936, 0, 1875784749, 0}
[debug]index = 0
[debug]i = 14
[debug]j = 0
[debug]>>>>>>cb_gdb:

next = {2283936, 0, 1875784749, 0}
index = 0
i = 14
j = 0
--- End code ---

And after the function call


--- Code: ---> info locals

[debug]> info locals
[debug]next = <error reading variable next (Cannot access memory at address 0x1)>
[debug]index = 0
[debug]i = 14
[debug]j = 0
[debug]>>>>>>cb_gdb:

next = <error reading variable next (Cannot access memory at address 0x1)>
index = 0
i = 14
j = 0
--- End code ---

This means the function has a bug.
Note that the "next" array is the lps array described in
https://www.geeksforgeeks.org/kmp-algorithm-for-pattern-searching/
or
KMP Algorithm | Searching for Patterns | GeeksforGeeks - YouTube

EDIT1:
For the pattern "text", when stepping into the function:

--- Code: ---void Tokenizer::KMP_GetNextVal(const wxChar* pattern, int next[])
{
    int j = 0, k = -1;
    next[0] = -1;
    while (pattern[j] != _T('\0'))
    {
        if (k == -1 || pattern[j] == pattern[k])
        {
            ++j;
            ++k;
            if (pattern[j] != pattern[k])
                next[j] = k;  // error
            else
                next[j] = next[k];
        }
        else
            k = next[k];
    }
}
--- End code ---
I do see that at the line marked "// error" j is 4, which means next[j] is written beyond the end of the next array (it only has four elements). :(
But I still need some time to understand how the KMP algorithm works.
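
The out-of-range write can be reproduced outside the debugger. The sketch below runs the same loop as the posted KMP_GetNextVal, but writes into a std::vector with one spare element so the extra store is harmless and can be observed. It uses wchar_t in place of wxChar, and the name MaxIndexWritten is made up for this experiment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <vector>

// Same loop as the posted KMP_GetNextVal. Returns the largest index of
// `next` that the loop wrote to.
std::size_t MaxIndexWritten(const wchar_t* pattern, std::vector<int>& next)
{
    assert(pattern != nullptr);
    const std::size_t len = std::wcslen(pattern);
    next.assign(len + 1, 0);   // one extra element, only for this experiment
    std::size_t maxIndex = 0;

    int j = 0, k = -1;
    next[0] = -1;
    while (pattern[j] != L'\0')
    {
        if (k == -1 || pattern[j] == pattern[k])
        {
            ++j;
            ++k;
            // j can be equal to len here, one past the last valid index
            // of an array declared as "int next[patternLen]".
            next[j] = (pattern[j] != pattern[k]) ? k : next[k];
            maxIndex = std::max(maxIndex, static_cast<std::size_t>(j));
        }
        else
            k = next[k];
    }
    return maxIndex;
}
```

For the pattern L"text" this returns 4 and stores the value 1 in next[4]. Since j is only incremented together with a store, and the loop only stops once pattern[j] is the terminator, any non-empty pattern ends with a write at index patternLen.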

EDIT2:
To simplify the issue, you only need to debug this function:

--- Code: ---int Tokenizer::GetFirstTokenPosition(const wxChar* buffer, const size_t bufferLen,
                                     const wxChar* key, const size_t keyLen)
{
    int pos = -1;
    wxChar* p = const_cast<wxChar*>(buffer);
    const wxChar* endBuffer = buffer + bufferLen;
    for (;;)
    {
        const int ret = KMP_Find(p, key, keyLen);
        if (ret == -1)
            break;

        // check previous char
        p += ret;
        if (p > buffer)
        {
            const wxChar ch = *(p - 1);
            if (ch == _T('_') || wxIsalnum(ch))
            {
                p += keyLen;
                continue;
            }
        }

        // check next char
        p += keyLen;
        if (p < endBuffer)
        {
            const wxChar ch = *p;
            if (ch == _T('_') || wxIsalnum(ch))
                continue;
        }

        // got it
        pos = p - buffer - keyLen;
        break;
    }

    return pos;
}

--- End code ---
where the arguments are:

--- Code: ---[debug]> info args
[debug]this = 0x41d4200
[debug]buffer = 0x41d7bf0 L"text ## line"
[debug]bufferLen = 12
[debug]key = 0x41d7868 L"text"
[debug]keyLen = 4
[debug]>>>>>>cb_gdb:

this = 0x41d4200
buffer = 0x41d7bf0 L"text ## line"
bufferLen = 12
key = 0x41d7868 L"text"
keyLen = 4

--- End code ---

Miguel Gimenez:
You can check this code adapted from your link:


--- Code: ---void Tokenizer::KMP_GetNextVal(const wxChar* pattern, int next[])
{
    int len = 0;

    next[0] = 0;  // CB code uses -1

    int i = 1;
    while (pattern[i] != _T('\0'))
    {
        if (pattern[i] == pattern[len])
        {
            len++;
            next[i] = len;
            i++;
        }
        else
        {
            if (len)
            {
                len = next[len - 1];
            }
            else
            {
                next[i] = 0;
                i++;
            }
        }
    }
}

--- End code ---

Also, I would try using a std::vector or new[] instead of a dynamic array for next.
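
One thing to keep in mind: a table that starts with 0 follows a different convention than the -1 based table of the CB code, so the search loop has to be written for the same convention. The KMP_Find loop is not shown in full in this thread, so the following is a self-contained sketch and not a drop-in replacement. It uses std::wstring and std::vector, and the names BuildLps and KmpFind are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// lps[i] = length of the longest proper prefix of pattern[0..i]
// that is also a suffix of it. Writes exactly pattern.size() entries.
std::vector<int> BuildLps(const std::wstring& pattern)
{
    std::vector<int> lps(pattern.size(), 0);
    int len = 0;
    std::size_t i = 1;
    while (i < pattern.size())
    {
        if (pattern[i] == pattern[len])
            lps[i++] = ++len;
        else if (len != 0)
            len = lps[len - 1];   // fall back, do not advance i
        else
            lps[i++] = 0;
    }
    return lps;
}

// Returns the index of the first occurrence of pattern in text, or -1.
int KmpFind(const std::wstring& text, const std::wstring& pattern)
{
    if (text.empty() || pattern.empty())
        return -1;

    const std::vector<int> lps = BuildLps(pattern);
    std::size_t i = 0; // position in text
    std::size_t j = 0; // position in pattern
    while (i < text.size())
    {
        assert(j < pattern.size());
        if (text[i] == pattern[j])
        {
            ++i;
            ++j;
            if (j == pattern.size())
                return static_cast<int>(i - j);
        }
        else if (j != 0)
            j = lps[j - 1];       // reuse the already matched prefix
        else
            ++i;
    }
    return -1;
}
```

With the arguments from the debug session, KmpFind(L"text ## line", L"text") returns 0. The vector is sized from the pattern, so no index outside the table is ever written.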

oBFusCATed:

--- Quote from: Miguel Gimenez on January 05, 2019, 11:46:45 am ---Also, I would try using a std::vector or new[] instead of a dynamic array for next.

--- End quote ---
And you'll make the parser slow... There is already a known limit on the number of elements in this array, so a fixed-size array could be used instead. But this won't solve the problem.
