Author Topic: Revamping Code Completion with little changes  (Read 32008 times)

zieQ

  • Guest
Re: Revamping Code Completion with little changes
« Reply #30 on: August 23, 2005, 08:30:59 am »
Yes, but what is the problem if you let the tasks ending naturally? (I still suppose that tasks are living for a short time heh). Anyway I think there is a potential bug with the way delete of tasks is implemented!

Mandrav, you didn't answer my question about wxExecute and threads: is there a way to execute a command in a thread instead of a process? I can't find any... In fact, I found out that C::B is very resource consuming when used as a background task (compared to MSVC), I'm wondering if it's not related to the way compiler commands are executed (in processes instead of threads)...

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
Re: Revamping Code Completion with little changes
« Reply #31 on: August 23, 2005, 09:27:21 am »
wxExecute must not be used from anything but the main thread. First, it will rise an assertion if you try to, and second, wxExecute calls wxYield() at the most impossible of times, possibly causing recursive Yields.
wxExecute always runs on processes, but this is really not the reason why it is so inefficient. The main reason why it is so bad is that you must regularly Yield or you will never get anything from the pipe. The pipe will fill up and the child process will block. Consequently, your application will block, too, waiting for the child to die.
There are two modes of operation for wxExecute, one is while(!done){wxYield();} - this is what wxExecute does internally when running synchronously, it is *horribly* inefficient. The other way (the recommended way for asynchronous execution, and also the way it is done in code::blocks) is to use  a wxTimer, which wastes less CPU, but means that no matter what you do, the Timer's minimum latency (around 50ms) plus the time needed to pass messages through the wx message queue (50-250ms?) always add to the execution time of your process. If you spawn off 200 processes, this can be quite a few seconds. There is nothing you can do, though, it is like this by design.
Apart from wxExecute, there is no way to run a command in a thread, either. It obviously has to be process (except if you load the executable image like a dll and do some hokuspokus maybe, but let's not even think about that...). However since the question was about Linux originally, process and thread is not really a different thing.

I would also advocate to be really gratious with threads (i.e. have several thread pools, and keep them sleeping rather than ending them). Thread scheduling is O(0) under Linux, and although the Windows scheduler sucks, it is still better to keep a few threads sleeping than spawn new ones regularly, as this is a lot more expensive.
I would also not make the thread pool a singleton (it is tempting, admitted!) because a thread pool is a queue, so if it is a singleton, you may get undesirable effects. For example if you decide to run "search in files" in its own thread, then it will have to wait until the last syntax parsing job is finished - this does not make sense since these are not related tasks.

A singleton with a priority queue might be a consideration. You could queue "interactive" tasks to the queue with a higher priority. However, this is a lot more complicated (keep things simple!), and the management of the priority queue is probably more expensive than just spawning two or three more threads.

When advocating many threads, one important thing to keep in mind is memory. If you spawn off 50 threads with 1 MB of stack each, then you need 50 MB of memory only for these, so obviously the total number should not be a lot more than maybe 6-8.


"We should forget about small efficiencies, say about 97% of the time: Premature quotation is the root of public humiliation."

Offline mandrav

  • Project Leader
  • Administrator
  • Lives here!
  • *****
  • Posts: 4315
    • Code::Blocks IDE
Re: Revamping Code Completion with little changes
« Reply #32 on: August 23, 2005, 09:34:33 am »
Mandrav, you didn't answer my question about wxExecute and threads: is there a way to execute a command in a thread instead of a process? I can't find any... In fact, I found out that C::B is very resource consuming when used as a background task (compared to MSVC), I'm wondering if it's not related to the way compiler commands are executed (in processes instead of threads)...

Unfortunately there is no way. This is the root of the problem (from wx docs):

Quote
NB: Currently wxExecute() can only be used from the main thread, calling this function from another thread will result in an assert failure in debug build and won't work.

Yiannis.
Be patient!
This bug will be fixed soon...

grv575

  • Guest
Re: Revamping Code Completion with little changes
« Reply #33 on: August 23, 2005, 09:44:29 am »
So why not make a couple of thread pools, each one for a specific cb component (compiler, code completion, other plugins).  Have each pool allocate (by default) the same # of threads as cpus/cores.  Now the only thing you don't want is to have more *resource intensive* threads than the number of cpus.  If there are 2 cpus and only two intensive threads (with code completion and ui events not running for very long periods of time) then everything is optimal.  So really, there shouldn't be any problem with > 1 pool, considering what CB does in terms of processing.

zieQ

  • Guest
Re: Revamping Code Completion with little changes
« Reply #34 on: August 23, 2005, 11:31:09 am »
The term optimality I used previously refer to the article that nobody seems to have read, not a guess of what would be the better way to implement it and a guess on the number of worker threads to use. I'll stop making some comments if nobody read what I say :?

As I said,
- no need for many pools, it will degrade performance anyway: if one of the pool is already consuming all the processing power, the other pool will have some difficulties to run, and otherwise there will be many thread idle which consume (a little but...) processing power
- the number of thread should depend on the workload and workload type, not a fixed value which provide good performance for somebody but bad for others

Hope somebody is going to read the article from the link I posted above :x

So I agree about the wxExecute, and we will not be able to use the pool to perform compilation tasks
« Last Edit: August 23, 2005, 11:41:28 am by zieQ »

Offline mandrav

  • Project Leader
  • Administrator
  • Lives here!
  • *****
  • Posts: 4315
    • Code::Blocks IDE
Re: Revamping Code Completion with little changes
« Reply #35 on: August 23, 2005, 11:58:02 am »
Update:

Code
#ifndef CBTHREADPOOL_H
#define CBTHREADPOOL_H

#include <queue>
#include <vector>
#include <wx/event.h>
#include <wx/thread.h>

/// Abstract base class for workers assigned to the thread pool object
/// Derive a class from this and perform your thing in Execute()
/// It's like wxThread...
class cbThreadPoolTask
{
public:
        virtual ~cbThreadPoolTask(){}
virtual int Execute() = 0;
};

// forward decl
class PrivateThread;

/// The main thread pool object.
/// If you set concurrent threads to -1, it will use the number of CPUs present ;)
class cbThreadPool : public wxEvtHandler
{
    public:
        cbThreadPool(wxEvtHandler* owner, int id = -1, int concurrentThreads = -1);
        virtual ~cbThreadPool();

        virtual void SetConcurrentThreads(int concurrentThreads);
        virtual bool AddTask(cbThreadPoolTask* worker);
    protected:
        virtual void AllocThreads();
        virtual void FreeThreads();
        virtual void ScheduleTasks();
        virtual void ScheduleNextTask(PrivateThread* thr);
        void OnThreadTaskDone(wxCommandEvent& event);

        wxEvtHandler* m_pOwner;
        int m_ID;
        std::queue<cbThreadPoolTask*> m_Pool;
        std::vector<PrivateThread*> m_Workers; // worker threads
        int m_ConcurrentThreads;
        int m_MaxThreads;

        DECLARE_EVENT_TABLE();
};

#endif // CBTHREADPOOL_H

It's already working in a test app :)

Yiannis.
Be patient!
This bug will be fixed soon...

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
Re: Revamping Code Completion with little changes
« Reply #36 on: August 23, 2005, 02:35:57 pm »
The reason nobody reads the article is maybe that the link is broken (there is a space in front of Brett).

But apart from that article (which is a nice analysis of heavy-duty threading by the way), there may be good reasons why one would really want to have two or more pools. I am not saying one must, but one should at least consider.
Two pools can for example run at different priorities, so one does the "realtime" jobs, and the other fills the CPU when it is idle with whatever has to be done. It does not really matter if parsing files for code completioin takes a second or two longer, the user will never notice. Neither does it matter if something else takes half a second longer.
What the user will notice, however, is if the application does not feel interactive. If you click onto something and it takes half a second before it actually starts running because it is added to the tail of the job queue, then this is something very noticeable. With two pools, this is not a problem at all.

Regarding many threads, it is true that context switches may degrade performance really badly. Running 20 heavy number-crunching threads on a single CPU will certainly get you nowhere.
However, it is is simply not true that sleeping (or blocking) threads take away significant amounts of CPU, not even on Windows. We are talking about 4 or 5 threads alltogether. My PC has 382 threads running right now, and except while typing, all CPU time (3%) goes into task manager. If there are 5 more threads, there will be no difference at all. However, creating new threads is really really expensive compared to that (on Windows, at least).

Even with context switches in mind, having one or two "extra" threads may be an advantage because then one thread can fully utilize the CPU while the other is blocking (waiting until a DMA operation finishes, for example). Since many tasks in an IDE involve disk access, this is likely to happen, too.

Alltogether, it is not an easy decision, but I think that since threads are really cheap, one can be gratious with them in this case. Rather have 2 pools than one, and rather have rather 5 threads each than one or two. Since we are not going to factorize a 300-digit number or something, it should really not matter regarding context switches - most of the time our threads will sleep anyway.
"We should forget about small efficiencies, say about 97% of the time: Premature quotation is the root of public humiliation."

grv575

  • Guest
Re: Revamping Code Completion with little changes
« Reply #37 on: August 23, 2005, 03:37:58 pm »
The term optimality I used previously refer to the article that nobody seems to have read, not a guess of what would be the better way to implement it and a guess on the number of worker threads to use. I'll stop making some comments if nobody read what I say :?

I did read the article.  It's a good reference but very academic.  A much better intro to multiprocessing and thread re performance considerations is http://www.open-mag.com/features/Vol_116/NUMA/NUMA.htm
It gives rules of thumb for this type of stuff (allocate as many intensive, conncurrent threads as you have cpus [or cpu cores]).  Which is the accepted practice.  Analyzing the pdf you linked: the parameter they synthesized and benchmarked at values of 10, 50, 100 give the amount of idle time the thread had..allowing other threads to be scheduled before the time slice is up.  100 is unrealistic, these threads are allocated and slept right away, which explains the linear scaling.  At 50%, these are semi-intensive.  And at 10% (90% of the timeslice is used), we only get improvements by using 2 threads in the pool vs. 1.  This is because the test system had 1 cpu (don't remember if hyperthreading was enabled).  Apparently scheduling was still more efficient by using 2 threads vs. 1 but I suspect it was a hyperthreading cpu (could always check the article ;) ).

Now the issue is that most of the threads in the pool will have something like 3-5% timeslice utilization, except for the compile and link threads which are cpu-intensive.  Therefore, setting each pool to use the # of threads = to the # cpus/cores (accounting for hyperthreading of course) would work very well.  Only the compilation pool would be extremely busy.  And this would also keep the total thread count low as well.  After all how many total cpus do you have (dual core would mean ~6 threads accross 3 pools for example)?

Quote
As I said,
- no need for many pools, it will degrade performance anyway: if one of the pool is already consuming all the processing power, the other pool will have some difficulties to run, and otherwise there will be many thread idle which consume (a little but...) processing power

W/o many pools you run into the task queuing, and UI responsiveness issues that thomas summaried adequately.

Quote
- the number of thread should depend on the workload and workload type, not a fixed value which provide good performance for somebody but bad for others

Does not seem to be the case.  A configuration option where the user can choose how many threads they would like to use is a good idea.  And the default should definately depend on the # of cpus on the system.  Because this is optimal.


Offline rickg22

  • Lives here!
  • ****
  • Posts: 2283
Re: Revamping Code Completion with little changes
« Reply #38 on: August 23, 2005, 05:43:07 pm »
As long as threads are running in the background, there's no consequence that the user could feel (i.e. "my app froze while searching!" ).

The reason parsing kinda freezes codeblocks for a second when you save a file, at least when no wxYield is inserted in the middle of the app, is because it depends on events (which are handled in the MAIN thread) to do the further processing. But if we implement the pooling, there won't be any problem with that.

Anyway, Ifigured out how to avoid the queuing overhead.

You have TWO queues, and a "scheduler thread", dedicated 100% to queuing:

Passive task queue (shared by main thread and scheduling thread) [ pending tasks]
Active task queue (shared by scheduling and worker threads) [about-to-run tasks]

The passive and active task queues use DIFFERENT MUTEXES so they can be handled asynchronously.

The "scheduler thread" would remove the task from the passive queue, and keep a copy in memory for itself (if the queues use pointers rather than objects, much better ;-) ). Then it releases the "passive mutex" so it can start scheduling. This may take more time as the timeslice is very crowded, but since the passive mutex has been released, the GUI won't have to deal with it. All it knows is that the job's being put in the queue now.

Offline rickg22

  • Lives here!
  • ****
  • Posts: 2283
Yiannis, well done! :D
« Reply #39 on: September 03, 2005, 02:19:33 am »
You did an *excellent* work with the Code completion!

I have a question tho...

Why keep the name "ParserThread" if it's not a Thread anymore? It's a bit confusing... but anyway, my kudos to you, in 48 hours you've done wonders with Codeblocks!  8)

Congratulations!
« Last Edit: September 03, 2005, 03:34:38 am by rickg22 »

Offline mandrav

  • Project Leader
  • Administrator
  • Lives here!
  • *****
  • Posts: 4315
    • Code::Blocks IDE
Re: Yiannis, well done! :D
« Reply #40 on: September 03, 2005, 12:03:57 pm »
Why keep the name "ParserThread" if it's not a Thread anymore? It's a bit confusing... but anyway, my kudos to you, in 48 hours you've done wonders with Codeblocks!  8)

Well, technically it is still ran as a thread (although it doesn't descend from wxThread anymore).
But anyway, I would have to rename the files too so it would be too much of a hassle (to preserve CVS history)...

What I like a lot about it is that it is *so* much more responsive now :D (not to mention *much* less resource hungry)
Be patient!
This bug will be fixed soon...

Offline rickg22

  • Lives here!
  • ****
  • Posts: 2283
Re: Revamping Code Completion with little changes
« Reply #41 on: September 05, 2005, 01:42:58 am »
One last question... why the "Please wait while blahblahblah" in the code completion? It's kinda annoying seeing it everytime.

Offline mandrav

  • Project Leader
  • Administrator
  • Lives here!
  • *****
  • Posts: 4315
    • Code::Blocks IDE
Re: Revamping Code Completion with little changes
« Reply #42 on: September 05, 2005, 08:44:58 am »
One last question... why the "Please wait while blahblahblah" in the code completion? It's kinda annoying seeing it everytime.

I will add a configuration option for it.
It was added because while loading/saving cache it might take a while and first-time users better know what's going on and not think something's wrong...
Be patient!
This bug will be fixed soon...

Offline rickg22

  • Lives here!
  • ****
  • Posts: 2283
Re: Revamping Code Completion with little changes
« Reply #43 on: September 05, 2005, 09:04:20 pm »
Hmmmm.... any possibility of doing that caché loading from a threads? (I mean, the parsing is done in threads, why should it be any different?)

Offline mandrav

  • Project Leader
  • Administrator
  • Lives here!
  • *****
  • Posts: 4315
    • Code::Blocks IDE
Re: Revamping Code Completion with little changes
« Reply #44 on: September 05, 2005, 09:39:08 pm »
I can't do everything at the same time :P
Eventually, yes, it will be moved in a thread but now we 're on our way for rc2. This can wait.
Be patient!
This bug will be fixed soon...