Author Topic: using memory pool for the SearchTree Node  (Read 17146 times)

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5915
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
using memory pool for the SearchTree Node
« on: July 04, 2009, 08:15:31 am »
Hi, here is the patch, I just make a simple test, it can reduce 25% parsing time.

Code
Index: searchtree.h
===================================================================
--- searchtree.h (revision 5679)
+++ searchtree.h (working copy)
@@ -5,7 +5,8 @@
 #ifndef SEARCHTREE_H
 #define SEARCHTREE_H
 
-#include <wx/string.h>
+#include <wx/string.h>
+#include "blockallocated.h"
 
 #include <vector>
 #include <map>
@@ -74,7 +75,7 @@
         SearchTreePoint (nSearchTreeNode nn, size_t dd) { n = nn; depth = dd; }
 };
 
-class SearchTreeNode
+class SearchTreeNode:public BlockAllocated<SearchTreeNode, 10000>
 {
     friend class BasicSearchTree;
     friend class BasicSearchTreeIterator;

Can any one tested for me?

By the way, can any developer would like to use the way I stated in this post?

http://forums.codeblocks.org/index.php/topic,10598.msg72616.html#msg72616

Thanks! :D
« Last Edit: July 04, 2009, 04:36:44 pm by ollydbg »
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
Re: using memory pool for the SearchTree Node
« Reply #1 on: July 04, 2009, 11:07:51 am »
This is exactly what I wrote block allocator for. :-)
However, the code completion parser was later made multithreaded in an attempt to speed it up, and the block allocator is (obviously) not thread-safe. Well, in fact it is thread safe, but not concurrency safe. Two threads allocating at the exact same time may get the same block of valid memory, and two blocks freeing at the exact same time may leak a block of memory.

So, your patch will likely work, but not 100% reliably. To make the block allocator thread-safe, one would have to use an atomic compare/exchange for the single-linked list of free nodes, but then again, this would make it slower.
« Last Edit: July 04, 2009, 11:09:31 am by thomas »
"We should forget about small efficiencies, say about 97% of the time: Premature quotation is the root of public humiliation."

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5915
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: using memory pool for the SearchTree Node
« Reply #2 on: July 04, 2009, 11:19:39 am »
Thanks for the reply.
I noticed that the current Token is already allocated from memory pool.

Code
class Token  : public BlockAllocated<Token, 10000>

seems it always works fine :D
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5915
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: using memory pool for the SearchTree Node
« Reply #3 on: July 04, 2009, 11:40:06 am »
Ok, I find some code that can confirm it is quite "SAFE", When adding a Token, it use a s_MutexProtection. In the parsing thread.
see the code below:

Code
Token* ParserThread::DoAddToken(TokenKind kind, const wxString& name, int line, int implLineStart, int implLineEnd, const wxString& args, bool isOperator, bool isImpl)
{
    if(TestDestroy())
        return 0;

    if (name.IsEmpty())
        return 0; // oops!

//    if (m_Options.useBuffer && !m_Options.isTemp && TokenExists(name, m_pLastParent, kind))
//        return 0;
    s_MutexProtection.Enter();
    Token* newToken = 0;
    wxString newname(name);
    m_Str.Trim();
//if (name == _T("AlsoDoSomething"))
//    asm("int $3;");
    if (kind == tkDestructor)
    {
        // special class destructors case
        newname.Prepend(ParserConsts::tilde);
        m_Str.Clear();
    }

    Token* localParent = 0;

    // preserve m_EncounteredTypeNamespaces; needed further down this function
    std::queue<wxString> q = m_EncounteredTypeNamespaces;
    if ((kind == tkDestructor || kind == tkConstructor) && !q.empty())
    {
        // look in m_EncounteredTypeNamespaces
        localParent = FindTokenFromQueue(q, 0, true, m_pLastParent);
        if (localParent)
            newToken = TokenExists(newname, localParent);
    }

    // check for implementation member function
    if (!newToken && !m_EncounteredNamespaces.empty())
    {
        localParent = FindTokenFromQueue(m_EncounteredNamespaces, 0, true, m_pLastParent);
        if (localParent)
            newToken = TokenExists(newname, localParent);
    }

    // none of the above; check for token under parent (but not if we 're parsing a temp buffer)
    if (!newToken && !m_Options.isTemp)
        newToken = TokenExists(name, m_pLastParent, kind);

    if (newToken && newToken->m_TokenKind == kind && newToken->m_Args == args)
    {
        m_pTokens->m_modified = true;
    }
    else
    {
        Token* finalParent = localParent ? localParent : m_pLastParent;
        newToken = new Token(newname,m_File,line);
        newToken->m_ParentIndex = finalParent ? finalParent->GetSelf() : -1;
        newToken->m_TokenKind = kind;
        newToken->m_Scope = m_LastScope;
        newToken->m_Args = args;
        int newidx = m_pTokens->insert(newToken);
        if (finalParent)
            finalParent->AddChild(newidx);
    }

    if (!(kind & (tkConstructor | tkDestructor)))
    {
        wxString readType = m_Str;
        wxString actualType = GetActualTokenType();
        if (actualType.Find(_T(' ')) == wxNOT_FOUND)
        {
            // token type must contain all namespaces
            actualType.Prepend(GetQueueAsNamespaceString(m_EncounteredTypeNamespaces));
        }
        newToken->m_Type = readType;
        newToken->m_ActualType = actualType;
    }
    newToken->m_IsLocal = m_IsLocal;
    newToken->m_IsTemp = m_Options.isTemp;
    newToken->m_IsOperator = isOperator;

    if (!isImpl)
    {

        newToken->m_File = m_File;
        newToken->m_Line = line;
//        Manager::Get()->GetLogManager()->DebugLog(F(_T("Added/updated token '%s' (%d), type '%s', actual '%s'. Parent is %s (%d)"), name.c_str(), newToken->GetSelf(), newToken->m_Type.c_str(), newToken->m_ActualType.c_str(), newToken->GetParentName().c_str(), newToken->m_ParentIndex));
    }
    else
    {
        newToken->m_ImplFile = m_File;
        newToken->m_ImplLine = line;
        newToken->m_ImplLineStart = implLineStart;
        newToken->m_ImplLineEnd = implLineEnd;
        m_pTokens->m_FilesMap[newToken->m_ImplFile].insert(newToken->GetSelf());
    }

    while (!m_EncounteredNamespaces.empty())
        m_EncounteredNamespaces.pop();
    while (!m_EncounteredTypeNamespaces.empty())
        m_EncounteredTypeNamespaces.pop();

    s_MutexProtection.Leave();
//            wxMilliSleep(0);
    return newToken;
}
« Last Edit: July 04, 2009, 11:42:41 am by ollydbg »
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Offline Jenna

  • Administrator
  • Lives here!
  • *****
  • Posts: 7255
Re: using memory pool for the SearchTree Node
« Reply #4 on: July 04, 2009, 11:58:12 am »
Parsing a large test-project (linux-kernel 2.6.29):

C::B from wxAui test-branch without your patch:
Quote
Parsing stage done (22382 total parsed files, 914536 tokens in 2 minute(s), 21.896 seconds).

and with your patch applied:
Quote
Parsing stage done (22382 total parsed files, 914536 tokens in 2 minute(s), 22.432 seconds).

Exact the same code except for your patch, clean compile in both cases, same system-load.
I don't see a time improvement here (debian 64-bit Core2Duo 2x2GB, 2x4MB cache).
But as thomas stated it might be dangerous in some (most likely very rare) cases.

Did not (yet) check on any other system.

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5915
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: using memory pool for the SearchTree Node
« Reply #5 on: July 04, 2009, 12:08:17 pm »
@Jens:

I will test it again. ( I think every test would be run after a system reboot :()
The last test, I just compare the "official nightly build" with my own build version( which as many code changes in CC).

I have exam the code carefully, when a parserThread need to add a Token, it would be protect by a Mux. The Core code was in:

Code
int TokensTree::AddToken(Token* newToken,int forceidx)
{
    ...
    size_t idx2 = m_Tree.AddItem(name,tmp_tokens,false);// this will add a now item to "TokenSearchTree"
    ...
}

I will give the report then.

Edit
By the way, I only have Windows XP at hand.
Forget to say: thanks for your test!

« Last Edit: July 04, 2009, 12:10:29 pm by ollydbg »
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
Re: using memory pool for the SearchTree Node
« Reply #6 on: July 04, 2009, 01:29:47 pm »
Ok, I find some code that can confirm it is quite "SAFE", When adding a Token, it use a s_MutexProtection. In the parsing thread.
That's an example of what happens if you fix things that you've not understood, and it's one of the reasons I'm not touching anything related to CC again :)

The major reason, apart from cache locality of data and reduced fragmentation, to implement the block allocator was, long long ago, to avoid the heap lock (and, the bucket search), which was fine because there was only one thread using these objects.
Then, threads were added, which led to occasional weird behaviour that nobody could understand (and some very rare crashes that were not related to the block allocator).
Throwing threads at the beast made explicit locking necessary, so the net gain was exactly zero. On top of that the (misnomed) mutex has at least 4 global definitions in different compilation units, which somewhat defeats its purpose as a lock :)
"We should forget about small efficiencies, say about 97% of the time: Premature quotation is the root of public humiliation."

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
Re: using memory pool for the SearchTree Node
« Reply #7 on: July 04, 2009, 01:42:41 pm »
In case you wonder what I'm talking about, try this:

foo.h
Code
#include <cstdio>

static int xxx;

void foo();
void p1();
foo.cpp
Code
#include "foo.h"

void foo() { xxx = 5; }
void p1()  { printf("%d\n", xxx); }
main.cpp
Code
#include "foo.h"

void p2() { printf("%d\n", xxx); }

int main()
{
    xxx = 1;
    foo();

    p1();
    p2();

    return 0;
}


Now, in your head, replace "int" with "wxCriticalSection" and "foo.h" with "token.h". Surprise.
"We should forget about small efficiencies, say about 97% of the time: Premature quotation is the root of public humiliation."

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5915
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: using memory pool for the SearchTree Node
« Reply #8 on: July 04, 2009, 03:21:24 pm »
Quote
Now, in your head, replace "int" with "wxCriticalSection" and "foo.h" with "token.h". Surprise.
@thomas
I can understand your meaning. It's quite strange this variable was defined in header file, and use as a static decoration.

I should read more...... That's your signature "Never fix a bug you don’t understand"  :D

Thanks.

Edit

Quote
On top of that the (misnomed) mutex has at least 4 global definitions in different compilation units, which somewhat defeats its purpose as a lock

the token.h was included in these compilation units:

token.cpp
parser.cpp
parserthread.cpp
nativeparser.cpp

I guess the reason of using this kind of mux is that they only want to  lock the operation in it's own compilation units :D Not operation between different units.
« Last Edit: July 04, 2009, 03:50:38 pm by ollydbg »
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
Re: using memory pool for the SearchTree Node
« Reply #9 on: July 04, 2009, 04:34:31 pm »
I guess the reason of using this kind of mux is that they only want to  lock the operation in it's own compilation units :D Not operation between different units.
To be honest, I don't think this was intentional at all, it just incidentially happens to work without crashing  :)
"We should forget about small efficiencies, say about 97% of the time: Premature quotation is the root of public humiliation."

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5915
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: using memory pool for the SearchTree Node
« Reply #10 on: July 04, 2009, 04:35:01 pm »
@Jens:
I did a more serious test. Both patched version and unpatched build has no difference. Windows XP, Intel DualCore, 2G*2 memory. parsing the whole OpenCV library source.

Patched build test:
Code
Parsing stage done (1211 total parsed files, 60710 tokens in 0 minute(s), 25.172 seconds).

Unpathched build test result:
Code
Parsing stage done (1211 total parsed files, 60710 tokens in 0 minute(s), 25.250 seconds).


So, I just wondering why enable "memory pool" for searchNode has no effect. Maybe, it use too many STL like containers in searchNode class, like:

Code
class SearchTreeNode:public BlockAllocated<SearchTreeNode, 10000>
{
    ......
    protected:
        unsigned int m_depth;
        nSearchTreeNode m_parent;
        nSearchTreeLabel m_label;
        unsigned int m_labelstart, m_labellen;
        SearchTreeLinkMap m_Children;
        SearchTreeItemsMap m_Items;
};

And here is the definition:
/** SearchTreeLinkMap is the list of the edges towards other nodes. The character is the
key, and the node is the value */
typedef map<wxChar,nSearchTreeNode,less<wxChar> > SearchTreeLinkMap;

/** SearchTreeItemsMap contains all the items belonging to an edge */
typedef map<size_t,size_t,less<size_t> > SearchTreeItemsMap;


Maybe, we should supply a optimized allocator for map ?!
« Last Edit: July 04, 2009, 04:40:20 pm by ollydbg »
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5915
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: using memory pool for the SearchTree Node
« Reply #11 on: July 04, 2009, 04:42:21 pm »
I guess the reason of using this kind of mux is that they only want to  lock the operation in it's own compilation units :D Not operation between different units.
To be honest, I don't think this was intentional at all, it just incidentially happens to work without crashing  :)
These codes were horrible :shock:.
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5915
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: using memory pool for the SearchTree Node
« Reply #12 on: July 05, 2009, 02:17:04 pm »
I use a pool allocator supplied by libc++, but still can't gain much on performance, so, I suspect the neck bottle was not in SearchTree node allocation. :(

Some of my test code  is like below:

Code

#include <ext/pool_allocator.h>
......

typedef map<wxChar,nSearchTreeNode,less<wxChar>,__gnu_cxx::__pool_alloc<std::pair<const wxChar,nSearchTreeNode> >  > SearchTreeLinkMap;

typedef map<size_t,size_t,less<size_t>, __gnu_cxx::__pool_alloc<std::pair<const size_t,size_t> > > SearchTreeItemsMap;

If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Offline oBFusCATed

  • Developer
  • Lives here!
  • *****
  • Posts: 13413
    • Travis build status
Re: using memory pool for the SearchTree Node
« Reply #13 on: July 05, 2009, 05:06:53 pm »
ollydbg: Do you know the first rule of optimizing code? It is: profile the code, so you know the exact place where the slowness comes from!

Though profiling C::B with gprof is impossible or quite hard for a normal human being (extraterrestrials are welcome to do so :) ).
You could manually profile it with wxStopWatch or some other form for doing it manually.
(most of the time I ignore long posts)
[strangers don't send me private messages, I'll ignore them; post a topic in the forum, but first read the rules!]

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5915
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: using memory pool for the SearchTree Node
« Reply #14 on: July 05, 2009, 05:16:32 pm »
ollydbg: Do you know the first rule of optimizing code? It is: profile the code, so you know the exact place where the slowness comes from!

Though profiling C::B with gprof is impossible or quite hard for a normal human being (extraterrestrials are welcome to do so :) ).
You could manually profile it with wxStopWatch or some other form for doing it manually.


Thanks for the hint.

I have read the source of CC, there are some code using wxStopWatch, they even supply a Macro to enable profiling, but It's quite hard to understand fully.. There's no document and even the comment in those code were poor.  :(

Today, I have update a little on the wiki page as a dev manual :D.

By the way ,several days ago, we are discussing parsing string from gdb command "whatis".

 it seems CC works badly in "template", so ,it's hard to support :D

« Last Edit: July 06, 2009, 02:51:34 am by ollydbg »
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.