Author Topic: Revamping Code Completion with little changes  (Read 32036 times)

Offline rickg22

  • Lives here!
  • ****
  • Posts: 2283
Revamping Code Completion with little changes
« on: August 22, 2005, 03:49:35 am »
I got an idea that might just work :)

The reason we're having so many problems with code completion under linux is because of the parserthread class, right?

That's because the ParserThread carries all the information to parse a file.

Well, my idea is to SEPARATE this "parsing information" from the thread, encapsulating it into a single object: ParserJob. The parserthread will be just a container.

This way, if we want to parse a small string, we don't need to create a new thread to parse. We'll just create a ParserJob, and ask it to parse.

Furthermore, with this model, we can create the Parser jobs in a queue ( ParseQueue ) and we'll feed them to whatever ParserThread is currently running. To handle an include, it'll just create a new parser job and add it to the queue (no meddling with wxwidgets events!  8) ). And when the ParserThread finishes parsing, it just searches for a job which will parse the next file.

This way, we don't have to create 100 threads and delete them! :D (I've ran it under linux gdb, the threads aren't deleted instanty, so you can imagine)

And I bet parsing will be a lot faster!

What do you think guys? Yiannis?
« Last Edit: August 22, 2005, 07:13:28 am by rickg22 »

Offline mandrav

  • Project Leader
  • Administrator
  • Lives here!
  • *****
  • Posts: 4315
    • Code::Blocks IDE
Re: Revamping Code Completion with little changes
« Reply #1 on: August 22, 2005, 08:21:04 am »
Sounds nice :)

Yiannis.
Be patient!
This bug will be fixed soon...

Offline me22

  • Official tester
  • Multiple posting newcomer
  • ***
  • Posts: 53
    • TA Universe
Re: Revamping Code Completion with little changes
« Reply #2 on: August 22, 2005, 08:32:53 am »
How about we have a user-settable number of ParserThreads?  This is actually a concurrant-able task, so SMP machines should be able to take advantage of that...

grv575

  • Guest
Re: Revamping Code Completion with little changes
« Reply #3 on: August 22, 2005, 08:48:46 am »
If you're going to go this route (which I think is a good idea), then why not use thread pools?  The way this works is you hand a job/task to a threadpool object which handles allocating a new thread if needed or using an existing thread if enough are already allocated.  So you just tell the pool to execute tasks.  It scales well because the pool uses a configurable limit on the # of threads in the pool.  So once this limit is reached, it queues the task and tasks in the queue run on existing threads in that case.  Anyone know if thread pools are part of standard c++ libs or what would need to be included/written?  But this is the standard/clean way to do this type of thing these days.  That's about all I know about the subject, so... anyone know more about using them (have only used thread pools in .net).

Offline mandrav

  • Project Leader
  • Administrator
  • Lives here!
  • *****
  • Posts: 4315
    • Code::Blocks IDE
Re: Revamping Code Completion with little changes
« Reply #4 on: August 22, 2005, 09:12:43 am »
If you're going to go this route (which I think is a good idea), then why not use thread pools?  The way this works is you hand a job/task to a threadpool object which handles allocating a new thread if needed or using an existing thread if enough are already allocated.  So you just tell the pool to execute tasks.  It scales well because the pool uses a configurable limit on the # of threads in the pool.  So once this limit is reached, it queues the task and tasks in the queue run on existing threads in that case.  Anyone know if thread pools are part of standard c++ libs or what would need to be included/written?  But this is the standard/clean way to do this type of thing these days.  That's about all I know about the subject, so... anyone know more about using them (have only used thread pools in .net).


This is what it currently does, sort of.
There is a stack (or list - can't recall now) where each new thread is put into.
After that, when each thread ends, it pops as many threads from the stack as needed to reach the number of allowed running threads.

Yiannis.
Be patient!
This bug will be fixed soon...

grv575

  • Guest
Re: Revamping Code Completion with little changes
« Reply #5 on: August 22, 2005, 09:27:24 am »
:)

Heh, but that's like exhausting resources first, and then trying to fix the damage later.

Offline mandrav

  • Project Leader
  • Administrator
  • Lives here!
  • *****
  • Posts: 4315
    • Code::Blocks IDE
Re: Revamping Code Completion with little changes
« Reply #6 on: August 22, 2005, 09:32:35 am »
Right, hence this thread ;)

Yiannis.
Be patient!
This bug will be fixed soon...

zieQ

  • Guest
Re: Revamping Code Completion with little changes
« Reply #7 on: August 22, 2005, 01:20:27 pm »
Oh yes, I want a true wxwidget thread pool implementation in C::B core, so as to use it for compiler engine too :roll: (it will speed up compilation with gcc for multiprocs machines as asked in another thread)

Here is a java implementation that we could inspire from:
http://www.informit.com/articles/article.asp?p=30483&seqNum=6
« Last Edit: August 22, 2005, 01:24:13 pm by zieQ »

zieQ

  • Guest
Re: Revamping Code Completion with little changes
« Reply #8 on: August 22, 2005, 02:10:20 pm »
And here is a helpful discussion on performance:
www.scl.ameslab.gov/Publications/ Brett/CCCTFinal-color.pdf

Offline mandrav

  • Project Leader
  • Administrator
  • Lives here!
  • *****
  • Posts: 4315
    • Code::Blocks IDE
Re: Revamping Code Completion with little changes
« Reply #9 on: August 22, 2005, 02:11:50 pm »
OK, ok :)

I will add a thread-pool implementation in the core SDK.

Yiannis.
Be patient!
This bug will be fixed soon...

Offline rickg22

  • Lives here!
  • ****
  • Posts: 2283
Re: Revamping Code Completion with little changes
« Reply #10 on: August 22, 2005, 05:18:24 pm »
Just... be careful... :P

Anyway you might like subclassing my managedthread class for the threadpool. This way we can make all the threads abort immediately (almost) by setting a single flag.

zieQ

  • Guest
Re: Revamping Code Completion with little changes
« Reply #11 on: August 22, 2005, 05:28:59 pm »
Humm forget what I have said: no thread poll for compilation since wxExecute deals with processes and not threads. However, a thread pool could be useful. Be careful, it should be a singleton (one instance).

Offline mandrav

  • Project Leader
  • Administrator
  • Lives here!
  • *****
  • Posts: 4315
    • Code::Blocks IDE
Re: Revamping Code Completion with little changes
« Reply #12 on: August 22, 2005, 06:00:44 pm »
However, a thread pool could be useful. Be careful, it should be a singleton (one instance).

Why a singleton? You could have many different pools (if needed). Why restrict yourself to just one global pool?
One pool for code-completion, one for compiler, etc.

And the compiler could use thread pooling. Each wxExecute could be executed in a thread pool worker class...

Yiannis.
Be patient!
This bug will be fixed soon...

Offline rickg22

  • Lives here!
  • ****
  • Posts: 2283
Re: Revamping Code Completion with little changes
« Reply #13 on: August 22, 2005, 06:33:36 pm »
And don't forget Search-in-files and TODO list! :D

OK, but if the code completion is going to use the thread pool, how long will I have to wait before the pool's ready so I can make the changes to CC? Or should I begin the changes and later we can switch to using the pool?
« Last Edit: August 22, 2005, 06:35:22 pm by rickg22 »

Offline mandrav

  • Project Leader
  • Administrator
  • Lives here!
  • *****
  • Posts: 4315
    • Code::Blocks IDE
Re: Revamping Code Completion with little changes
« Reply #14 on: August 22, 2005, 06:44:35 pm »
And don't forget Search-in-files and TODO list! :D

OK, but if the code completion is going to use the thread pool, how long will I have to wait before the pool's ready so I can make the changes to CC? Or should I begin the changes and later we can switch to using the pool?

I said I 'm doing it. Stop being impatient ;)
I guess it 'll take a day or two.

Yiannis.
Be patient!
This bug will be fixed soon...

zieQ

  • Guest
Re: Revamping Code Completion with little changes
« Reply #15 on: August 22, 2005, 06:53:40 pm »
Quote from: mandrav
Why a singleton? You could have many different pools (if needed). Why restrict yourself to just one global pool?
One pool for code-completion, one for compiler, etc.

And the compiler could use thread pooling. Each wxExecute could be executed in a thread pool worker class...
Yiannis.

Well, singleton due to the pool dimensioning. There is an optimal value for the number of threads for the best performance that would not be guaranteed if there are many pools, especially if the dimension is allowed to vary during runtime by analyzing the workload. Did you read the second link I posted? I may be wrong however, we may propose a simplified version with no adjustement of the number of threads but I don't think that many pools is a good idea anyway. To me, thread pools are a collection of threads (optimal number dependending on the machine and the type of workload, which vary during runtime in our case) and a collection of (independent) tasks to be accomplished by the given number of threads. If tasks are independent, there's no need to separate pools for each kind of task to perform. We then can mix parsing with compilation, etc.

Second remark, I don't see any way for the compiler to use the thread pool. AFAIK, there is no method that link a wxThread to a wxProcess and conversely, and wxExecute deals with wxProcess while the thread pools deals with wxThread. Can you explain how you would do it? Maybe I am blind :p

Quote from: rickg22
OK, but if the code completion is going to use the thread pool, how long will I have to wait before the pool's ready so I can make the changes to CC? Or should I begin the changes and later we can switch to using the pool?

I think we can define the public interface fast for the pool so you can start redesigning it before the pool is implemented... Just my thought...


Offline rickg22

  • Lives here!
  • ****
  • Posts: 2283
Re: Revamping Code Completion with little changes
« Reply #16 on: August 22, 2005, 07:14:53 pm »
AHHHHHHH so something like...

Code
class ParserTask: public ThreadTask
{
};

?

Offline mandrav

  • Project Leader
  • Administrator
  • Lives here!
  • *****
  • Posts: 4315
    • Code::Blocks IDE
Re: Revamping Code Completion with little changes
« Reply #17 on: August 22, 2005, 08:21:38 pm »
OK, here's a *primitive* I 'm working on. It's nowhere near complete API-wise. But hey, let's add some flesh and bones in the discussion ;)
(don't mind the STL in there - we 'll probably use wx replacements for those)

Code
#ifndef CBTHREADPOOL_H
#define CBTHREADPOOL_H

#include <queue>
#include <vector>
#include <wx/event.h>
#include <wx/thread.h>

extern int idThreadPoolMsg;

// flags
#define DELETE_WORKER_WHEN_DONE 0x01

/// Abstract base class for workers assigned to the thread pool object
/// Derive a class from this and perform your thing in Execute()
/// It's like wxThread, except it's not a thread anymore ;)
class cbThreadPoolWorker
{
public:
virtual int Execute() = 0;
};

// forward decl
class PrivateThread;

/// Definition of  a Thread Pool Object (used by thread pool only)
struct cbThreadPoolObject
{
    cbThreadPoolObject() : pThread(0), pWorker(0), flags(0) {}
    cbThreadPoolObject(const cbThreadPoolObject& rhs) : pThread(rhs.pThread), pWorker(rhs.pWorker), flags(rhs.flags) {}
    PrivateThread* pThread;
    cbThreadPoolWorker* pWorker;
    int flags;
};

/// The main thread pool object.
class cbThreadPool
{
    public:
        cbThreadPool(wxEvtHandler* owner, int id = -1, int concurrentThreads = -1);
        virtual ~cbThreadPool();

        virtual void SetConcurrentThreads(int concurrentThreads);
        virtual bool AddThread(cbThreadPoolWorker* worker, bool deleteWhenDone = false);
        virtual void RemoveThread(cbThreadPoolWorker* worker);
    protected:
        virtual void AllocThreads();
        virtual void FreeThreads();
        virtual void ScheduleThreads();
        void Log(const wxString& msg);

        wxEvtHandler* m_pOwner;
        int m_ID;
        std::queue<cbThreadPoolObject> m_Pool;
        std::vector<PrivateThread*> m_Workers; // worker threads
        int m_ConcurrentThreads;
        int m_MaxThreads;
};

/// Base thread class
class PrivateThread : public wxThread
{
public:
        enum State
        {
            Idle,
            Busy
        };
        enum Status
        {
            LaunchJob,
            Quit
        };
PrivateThread(cbThreadPoolWorker* worker = 0)
            : m_Condition(m_Mutex),
            m_pWorker(worker),
            m_Status(Quit),
            m_State(Idle)
        {
            m_Mutex.Lock();
        }
~PrivateThread(){}

void SetWorker(cbThreadPoolWorker* worker);
void SetStatus(Status status);
State GetState(){ return m_State; }
virtual ExitCode Entry();
protected:
        friend class cbThreadPool; // allow thread pool to access the mutex and the condition object
wxMutex m_Mutex;
wxCondition m_Condition;
cbThreadPoolWorker* m_pWorker;
Status m_Status;
State m_State;
private:
};

#endif // CBTHREADPOOL_H

The way it would work is like:

Code
class ParserThread : public cbThreadPoolWorker
{
    ...
    int Execute(); // override pure virtual
    ...
};

// set the max nr of concurrent threads to run
// -1 means "number of CPUs" (yes it works with HyperThreading ;) )
g_ThreadPool.SetConcurrentThreads(-1);

// pass a new "thread" to the pool; it 'll run when the pool has available resources
g_ThreadPool.AddThread(new ParserThread);

Yiannis.
Be patient!
This bug will be fixed soon...

Offline Urxae

  • Regular
  • ***
  • Posts: 376
Re: Revamping Code Completion with little changes
« Reply #18 on: August 22, 2005, 09:31:54 pm »
Code
// flags
#define DELETE_WORKER_WHEN_DONE 0x01

/// Abstract base class for workers assigned to the thread pool object
/// Derive a class from this and perform your thing in Execute()
/// It's like wxThread, except it's not a thread anymore ;)
class cbThreadPoolWorker
{
public:
virtual int Execute() = 0;
};
If you're going to call delete on pointers to workers (as that #define implies) you should give them a virtual destructor, so that the derived class gets to run its own constructor as well ;).

Offline mandrav

  • Project Leader
  • Administrator
  • Lives here!
  • *****
  • Posts: 4315
    • Code::Blocks IDE
Re: Revamping Code Completion with little changes
« Reply #19 on: August 22, 2005, 09:38:15 pm »
If you're going to call delete on pointers to workers (as that #define implies) you should give them a virtual destructor, so that the derived class gets to run its own constructor as well ;).

Right. Although it's early, I can't believe I 've missed it (you 'll see it in cbThreadPool).
Thanks :)

Yiannis.
Be patient!
This bug will be fixed soon...

Offline Urxae

  • Regular
  • ***
  • Posts: 376
Re: Revamping Code Completion with little changes
« Reply #20 on: August 22, 2005, 09:52:48 pm »
Code
#define DELETE_WORKER_WHEN_DONE 0x01
[...]
        virtual bool AddThread(cbThreadPoolWorker* worker, bool deleteWhenDone = false);
I just now saw that second line. What is the purpose of the #define if there is a parameter that seems to specify the exact same thing?

And you might want to rename AddThread and RemoveThread to something like AddJob and RemoveJob. From what I understand of thread pools you're not adding threads but jobs (or perhaps workers, but jobs sounds better ;)) which are then perfomred/executed by the pooled threads.

zieQ

  • Guest
Re: Revamping Code Completion with little changes
« Reply #21 on: August 22, 2005, 10:00:28 pm »
I would rename cbThreadPoolWorker to cbThreadedTask (or cbTask) since thread pool worker may be misunderstood as the thread that performs the task. For me, a thread live for long, whereas a task is short, but it is cosmetics. Moreover, I would hide the implementation details (PrivateThread, cbThreadPoolObject) in the cpp file and just keep forward declarations if necessary. And I don't like the #define CONSTANT...

zieQ

  • Guest
Re: Revamping Code Completion with little changes
« Reply #22 on: August 22, 2005, 10:01:08 pm »
Humm, a job is a list of tasks :p

Offline mandrav

  • Project Leader
  • Administrator
  • Lives here!
  • *****
  • Posts: 4315
    • Code::Blocks IDE
Re: Revamping Code Completion with little changes
« Reply #23 on: August 22, 2005, 10:24:11 pm »
You have to realize this is just a draft which I 'm working on. It's nowhere near completion and many things will change and/or move around.
I didn't post it so that Rick or anyone else could start working on revamping cc, but just for discussion...

Code
#define DELETE_WORKER_WHEN_DONE 0x01
[...]
 virtual bool AddThread(cbThreadPoolWorker* worker, bool deleteWhenDone = false);
I just now saw that second line. What is the purpose of the #define if there is a parameter that seems to specify the exact same thing?

See? That's what I 'm talking about. Just to avoid the confusion, the AddThread()'s "delete" parameter tells cbThreadPool that this "thread" should be deleted on completion. The DELETE_WORKER_WHEN_DONE is set in cbThreadPoolObject.flags for this "thread" so that when it is done, it knows to delete it. In other words, these are two different steps of the process...
Instead of having " bool cbThreadPoolObject::deleteWhenDone;", we have a "int cbThreadPoolObject::flags;" as a bitmask...

Yiannis.
Be patient!
This bug will be fixed soon...

Offline rickg22

  • Lives here!
  • ****
  • Posts: 2283
Re: Revamping Code Completion with little changes
« Reply #24 on: August 22, 2005, 10:24:46 pm »
I'd suggest using ManagedThread instead of wxThread... after all i created those for a reason :)
Unless there are objections of course...

grv575

  • Guest
Re: Revamping Code Completion with little changes
« Reply #25 on: August 22, 2005, 10:59:48 pm »
I don't think it should be a singleton.  The pool of threads executes a queue of tasks, so independent modules (compile, code completion) would not want to wait to even start to execute if the other module has all the available threads occupied...

So multiple thread pools looks needed.  Is it possible to do something like:
#define THREADPOOL_COUNT 2

cpus = g_ThreadPool.GetCpuCount();
g_ThreadPool.SetConcurrentThreads(cpus / THREADPOOL_COUNT);

Offline Urxae

  • Regular
  • ***
  • Posts: 376
Re: Revamping Code Completion with little changes
« Reply #26 on: August 22, 2005, 11:04:14 pm »
So multiple thread pools looks needed.  Is it possible to do something like:
#define THREADPOOL_COUNT 2

cpus = g_ThreadPool.GetCpuCount();
g_ThreadPool.SetConcurrentThreads(cpus / THREADPOOL_COUNT);
You do realize that would round to 0 if there's only one CPU, right? :?

zieQ

  • Guest
Re: Revamping Code Completion with little changes
« Reply #27 on: August 22, 2005, 11:38:46 pm »
I don't think it should be a singleton.  The pool of threads executes a queue of tasks, so independent modules (compile, code completion) would not want to wait to even start to execute if the other module has all the available threads occupied...

Well, could I suggest you to read to links I provided? If there are too many threads, even if you dispose of another pool, your tasks will wait for the processors to be available since they are already busy with the other pool btw, and you will also have a performance penalty. One thread pool make it easier to get optimal performance whatever the task could be (compile, code completion). Remember, the thread pool has a queue of tasks: if you don't flood the pool with tasks, independent modules could execute concurrently with maximum performance. We may assign a priority to tasks but that's not that easy to make it work well. Your solution is worse: you dispose of half the computing power for each pool, with half worker threads going idle most of the time due to inactivity of the pool! Nonsense.

Mandrav, I don't see why we may delete tasks since they are supposed to be executed for a short time? It's not safe I think, since once the task is completed, the pointer could have been deleted!

grv575

  • Guest
Re: Revamping Code Completion with little changes
« Reply #28 on: August 23, 2005, 06:16:47 am »
Quote
Your solution is worse: you dispose of half the computing power for each pool, with half worker threads going idle most of the time due to inactivity of the pool! Nonsense.

Yeah, I wasn't thinking.  I guess only 1 pool makes sense.  It's not like there's any other tasks besides compilation that should require long completion times anyway.

Quote
Mandrav, I don't see why we may delete tasks since they are supposed to be executed for a short time? It's not safe I think, since once the task is completed, the pointer could have been deleted!

You're going to want some way to signal the thread to terminate safely to cleanup resources (like say when the user closes the app).

Offline rickg22

  • Lives here!
  • ****
  • Posts: 2283
Re: Revamping Code Completion with little changes
« Reply #29 on: August 23, 2005, 06:36:27 am »
You're going to want some way to signal the thread to terminate safely to cleanup resources (like say when the user closes the app).

ManagedThread does that :D. It keeps a static flag, and each class can point to an additional flag that tells them to terminate.

zieQ

  • Guest
Re: Revamping Code Completion with little changes
« Reply #30 on: August 23, 2005, 08:30:59 am »
Yes, but what is the problem if you let the tasks ending naturally? (I still suppose that tasks are living for a short time heh). Anyway I think there is a potential bug with the way delete of tasks is implemented!

Mandrav, you didn't answer my question about wxExecute and threads: is there a way to execute a command in a thread instead of a process? I can't find any... In fact, I found out that C::B is very resource consuming when used as a background task (compared to MSVC), I'm wondering if it's not related to the way compiler commands are executed (in processes instead of threads)...

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
Re: Revamping Code Completion with little changes
« Reply #31 on: August 23, 2005, 09:27:21 am »
wxExecute must not be used from anything but the main thread. First, it will rise an assertion if you try to, and second, wxExecute calls wxYield() at the most impossible of times, possibly causing recursive Yields.
wxExecute always runs on processes, but this is really not the reason why it is so inefficient. The main reason why it is so bad is that you must regularly Yield or you will never get anything from the pipe. The pipe will fill up and the child process will block. Consequently, your application will block, too, waiting for the child to die.
There are two modes of operation for wxExecute, one is while(!done){wxYield();} - this is what wxExecute does internally when running synchronously, it is *horribly* inefficient. The other way (the recommended way for asynchronous execution, and also the way it is done in code::blocks) is to use  a wxTimer, which wastes less CPU, but means that no matter what you do, the Timer's minimum latency (around 50ms) plus the time needed to pass messages through the wx message queue (50-250ms?) always add to the execution time of your process. If you spawn off 200 processes, this can be quite a few seconds. There is nothing you can do, though, it is like this by design.
Apart from wxExecute, there is no way to run a command in a thread, either. It obviously has to be process (except if you load the executable image like a dll and do some hokuspokus maybe, but let's not even think about that...). However since the question was about Linux originally, process and thread is not really a different thing.

I would also advocate to be really gratious with threads (i.e. have several thread pools, and keep them sleeping rather than ending them). Thread scheduling is O(0) under Linux, and although the Windows scheduler sucks, it is still better to keep a few threads sleeping than spawn new ones regularly, as this is a lot more expensive.
I would also not make the thread pool a singleton (it is tempting, admitted!) because a thread pool is a queue, so if it is a singleton, you may get undesirable effects. For example if you decide to run "search in files" in its own thread, then it will have to wait until the last syntax parsing job is finished - this does not make sense since these are not related tasks.

A singleton with a priority queue might be a consideration. You could queue "interactive" tasks to the queue with a higher priority. However, this is a lot more complicated (keep things simple!), and the management of the priority queue is probably more expensive than just spawning two or three more threads.

When advocating many threads, one important thing to keep in mind is memory. If you spawn off 50 threads with 1 MB of stack each, then you need 50 MB of memory only for these, so obviously the total number should not be a lot more than maybe 6-8.


"We should forget about small efficiencies, say about 97% of the time: Premature quotation is the root of public humiliation."

Offline mandrav

  • Project Leader
  • Administrator
  • Lives here!
  • *****
  • Posts: 4315
    • Code::Blocks IDE
Re: Revamping Code Completion with little changes
« Reply #32 on: August 23, 2005, 09:34:33 am »
Mandrav, you didn't answer my question about wxExecute and threads: is there a way to execute a command in a thread instead of a process? I can't find any... In fact, I found out that C::B is very resource consuming when used as a background task (compared to MSVC), I'm wondering if it's not related to the way compiler commands are executed (in processes instead of threads)...

Unfortunately there is no way. This is the root of the problem (from wx docs):

Quote
NB: Currently wxExecute() can only be used from the main thread, calling this function from another thread will result in an assert failure in debug build and won't work.

Yiannis.
Be patient!
This bug will be fixed soon...

grv575

  • Guest
Re: Revamping Code Completion with little changes
« Reply #33 on: August 23, 2005, 09:44:29 am »
So why not make a couple of thread pools, each one for a specific cb component (compiler, code completion, other plugins).  Have each pool allocate (by default) the same # of threads as cpus/cores.  Now the only thing you don't want is to have more *resource intensive* threads than the number of cpus.  If there are 2 cpus and only two intensive threads (with code completion and ui events not running for very long periods of time) then everything is optimal.  So really, there shouldn't be any problem with > 1 pool, considering what CB does in terms of processing.

zieQ

  • Guest
Re: Revamping Code Completion with little changes
« Reply #34 on: August 23, 2005, 11:31:09 am »
The term optimality I used previously refer to the article that nobody seems to have read, not a guess of what would be the better way to implement it and a guess on the number of worker threads to use. I'll stop making some comments if nobody read what I say :?

As I said,
- no need for many pools, it will degrade performance anyway: if one of the pool is already consuming all the processing power, the other pool will have some difficulties to run, and otherwise there will be many thread idle which consume (a little but...) processing power
- the number of thread should depend on the workload and workload type, not a fixed value which provide good performance for somebody but bad for others

Hope somebody is going to read the article from the link I posted above :x

So I agree about the wxExecute, and we will not be able to use the pool to perform compilation tasks
« Last Edit: August 23, 2005, 11:41:28 am by zieQ »

Offline mandrav

  • Project Leader
  • Administrator
  • Lives here!
  • *****
  • Posts: 4315
    • Code::Blocks IDE
Re: Revamping Code Completion with little changes
« Reply #35 on: August 23, 2005, 11:58:02 am »
Update:

Code
#ifndef CBTHREADPOOL_H
#define CBTHREADPOOL_H

#include <queue>
#include <vector>
#include <wx/event.h>
#include <wx/thread.h>

/// Abstract base class for workers assigned to the thread pool object
/// Derive a class from this and perform your thing in Execute()
/// It's like wxThread...
class cbThreadPoolTask
{
public:
        virtual ~cbThreadPoolTask(){}
virtual int Execute() = 0;
};

// forward decl
class PrivateThread;

/// The main thread pool object.
/// If you set concurrent threads to -1, it will use the number of CPUs present ;)
class cbThreadPool : public wxEvtHandler
{
    public:
        cbThreadPool(wxEvtHandler* owner, int id = -1, int concurrentThreads = -1);
        virtual ~cbThreadPool();

        virtual void SetConcurrentThreads(int concurrentThreads);
        virtual bool AddTask(cbThreadPoolTask* worker);
    protected:
        virtual void AllocThreads();
        virtual void FreeThreads();
        virtual void ScheduleTasks();
        virtual void ScheduleNextTask(PrivateThread* thr);
        void OnThreadTaskDone(wxCommandEvent& event);

        wxEvtHandler* m_pOwner;
        int m_ID;
        std::queue<cbThreadPoolTask*> m_Pool;
        std::vector<PrivateThread*> m_Workers; // worker threads
        int m_ConcurrentThreads;
        int m_MaxThreads;

        DECLARE_EVENT_TABLE();
};

#endif // CBTHREADPOOL_H

It's already working in a test app :)

Yiannis.
Be patient!
This bug will be fixed soon...

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
Re: Revamping Code Completion with little changes
« Reply #36 on: August 23, 2005, 02:35:57 pm »
The reason nobody reads the article is maybe that the link is broken (there is a space in front of Brett).

But apart from that article (which is a nice analysis of heavy-duty threading by the way), there may be good reasons why one would really want to have two or more pools. I am not saying one must, but one should at least consider.
Two pools can for example run at different priorities, so one does the "realtime" jobs, and the other fills the CPU when it is idle with whatever has to be done. It does not really matter if parsing files for code completioin takes a second or two longer, the user will never notice. Neither does it matter if something else takes half a second longer.
What the user will notice, however, is if the application does not feel interactive. If you click onto something and it takes half a second before it actually starts running because it is added to the tail of the job queue, then this is something very noticeable. With two pools, this is not a problem at all.

Regarding many threads, it is true that context switches may degrade performance really badly. Running 20 heavy number-crunching threads on a single CPU will certainly get you nowhere.
However, it is is simply not true that sleeping (or blocking) threads take away significant amounts of CPU, not even on Windows. We are talking about 4 or 5 threads alltogether. My PC has 382 threads running right now, and except while typing, all CPU time (3%) goes into task manager. If there are 5 more threads, there will be no difference at all. However, creating new threads is really really expensive compared to that (on Windows, at least).

Even with context switches in mind, having one or two "extra" threads may be an advantage because then one thread can fully utilize the CPU while the other is blocking (waiting until a DMA operation finishes, for example). Since many tasks in an IDE involve disk access, this is likely to happen, too.

Alltogether, it is not an easy decision, but I think that since threads are really cheap, one can be gratious with them in this case. Rather have 2 pools than one, and rather have rather 5 threads each than one or two. Since we are not going to factorize a 300-digit number or something, it should really not matter regarding context switches - most of the time our threads will sleep anyway.
"We should forget about small efficiencies, say about 97% of the time: Premature quotation is the root of public humiliation."

grv575

  • Guest
Re: Revamping Code Completion with little changes
« Reply #37 on: August 23, 2005, 03:37:58 pm »
The term optimality I used previously refer to the article that nobody seems to have read, not a guess of what would be the better way to implement it and a guess on the number of worker threads to use. I'll stop making some comments if nobody read what I say :?

I did read the article.  It's a good reference but very academic.  A much better intro to multiprocessing and thread re performance considerations is http://www.open-mag.com/features/Vol_116/NUMA/NUMA.htm
It gives rules of thumb for this type of stuff (allocate as many intensive, conncurrent threads as you have cpus [or cpu cores]).  Which is the accepted practice.  Analyzing the pdf you linked: the parameter they synthesized and benchmarked at values of 10, 50, 100 give the amount of idle time the thread had..allowing other threads to be scheduled before the time slice is up.  100 is unrealistic, these threads are allocated and slept right away, which explains the linear scaling.  At 50%, these are semi-intensive.  And at 10% (90% of the timeslice is used), we only get improvements by using 2 threads in the pool vs. 1.  This is because the test system had 1 cpu (don't remember if hyperthreading was enabled).  Apparently scheduling was still more efficient by using 2 threads vs. 1 but I suspect it was a hyperthreading cpu (could always check the article ;) ).

Now the issue is that most of the threads in the pool will have something like 3-5% timeslice utilization, except for the compile and link threads which are cpu-intensive.  Therefore, setting each pool to use the # of threads = to the # cpus/cores (accounting for hyperthreading of course) would work very well.  Only the compilation pool would be extremely busy.  And this would also keep the total thread count low as well.  After all how many total cpus do you have (dual core would mean ~6 threads accross 3 pools for example)?

Quote
As I said,
- no need for many pools, it will degrade performance anyway: if one of the pool is already consuming all the processing power, the other pool will have some difficulties to run, and otherwise there will be many thread idle which consume (a little but...) processing power

W/o many pools you run into the task queuing, and UI responsiveness issues that thomas summaried adequately.

Quote
- the number of thread should depend on the workload and workload type, not a fixed value which provide good performance for somebody but bad for others

Does not seem to be the case.  A configuration option where the user can choose how many threads they would like to use is a good idea.  And the default should definately depend on the # of cpus on the system.  Because this is optimal.


Offline rickg22

  • Lives here!
  • ****
  • Posts: 2283
Re: Revamping Code Completion with little changes
« Reply #38 on: August 23, 2005, 05:43:07 pm »
As long as threads are running in the background, there's no consequence that the user could feel (i.e. "my app froze while searching!" ).

The reason parsing kinda freezes codeblocks for a second when you save a file, at least when no wxYield is inserted in the middle of the app, is because it depends on events (which are handled in the MAIN thread) to do the further processing. But if we implement the pooling, there won't be any problem with that.

Anyway, Ifigured out how to avoid the queuing overhead.

You have TWO queues, and a "scheduler thread", dedicated 100% to queuing:

Passive task queue (shared by main thread and scheduling thread) [ pending tasks]
Active task queue (shared by scheduling and worker threads) [about-to-run tasks]

The passive and active task queues use DIFFERENT MUTEXES so they can be handled asynchronously.

The "scheduler thread" would remove the task from the passive queue, and keep a copy in memory for itself (if the queues use pointers rather than objects, much better ;-) ). Then it releases the "passive mutex" so it can start scheduling. This may take more time as the timeslice is very crowded, but since the passive mutex has been released, the GUI won't have to deal with it. All it knows is that the job's being put in the queue now.

Offline rickg22

  • Lives here!
  • ****
  • Posts: 2283
Yiannis, well done! :D
« Reply #39 on: September 03, 2005, 02:19:33 am »
You did an *excellent* work with the Code completion!

I have a question tho...

Why keep the name "ParserThread" if it's not a Thread anymore? It's a bit confusing... but anyway, my kudos to you, in 48 hours you've done wonders with Codeblocks!  8)

Congratulations!
« Last Edit: September 03, 2005, 03:34:38 am by rickg22 »

Offline mandrav

  • Project Leader
  • Administrator
  • Lives here!
  • *****
  • Posts: 4315
    • Code::Blocks IDE
Re: Yiannis, well done! :D
« Reply #40 on: September 03, 2005, 12:03:57 pm »
Why keep the name "ParserThread" if it's not a Thread anymore? It's a bit confusing... but anyway, my kudos to you, in 48 hours you've done wonders with Codeblocks!  8)

Well, technically it is still ran as a thread (although it doesn't descend from wxThread anymore).
But anyway, I would have to rename the files too so it would be too much of a hassle (to preserve CVS history)...

What I like a lot about it is that it is *so* much more responsive now :D (not to mention *much* less resource hungry)
Be patient!
This bug will be fixed soon...

Offline rickg22

  • Lives here!
  • ****
  • Posts: 2283
Re: Revamping Code Completion with little changes
« Reply #41 on: September 05, 2005, 01:42:58 am »
One last question... why the "Please wait while blahblahblah" in the code completion? It's kinda annoying seeing it everytime.

Offline mandrav

  • Project Leader
  • Administrator
  • Lives here!
  • *****
  • Posts: 4315
    • Code::Blocks IDE
Re: Revamping Code Completion with little changes
« Reply #42 on: September 05, 2005, 08:44:58 am »
One last question... why the "Please wait while blahblahblah" in the code completion? It's kinda annoying seeing it everytime.

I will add a configuration option for it.
It was added because while loading/saving cache it might take a while and first-time users better know what's going on and not think something's wrong...
Be patient!
This bug will be fixed soon...

Offline rickg22

  • Lives here!
  • ****
  • Posts: 2283
Re: Revamping Code Completion with little changes
« Reply #43 on: September 05, 2005, 09:04:20 pm »
Hmmmm.... any possibility of doing that caché loading from a threads? (I mean, the parsing is done in threads, why should it be any different?)

Offline mandrav

  • Project Leader
  • Administrator
  • Lives here!
  • *****
  • Posts: 4315
    • Code::Blocks IDE
Re: Revamping Code Completion with little changes
« Reply #44 on: September 05, 2005, 09:39:08 pm »
I can't do everything at the same time :P
Eventually, yes, it will be moved in a thread but now we 're on our way for rc2. This can wait.
Be patient!
This bug will be fixed soon...

Offline rickg22

  • Lives here!
  • ****
  • Posts: 2283
Re: Revamping Code Completion with little changes
« Reply #45 on: September 06, 2005, 01:47:56 am »
Oh, ok then :)

(Gee, i'm getting excited... so many changes...I wonder how cool 1.1 will be :D