Author Topic: Bug Report: [#18755] C::B hangs for 20 seconds while opening large project...

Offline rickg22

  • Lives here!
  • ****
  • Posts: 2283
Okay, I compiled C::B and started debugging...

Here's some info so far:

projectmanager.cpp:786: EndLoadingProject(result) takes around 40 seconds (in debug mode);
projectmanager.cpp:794: SetProject(result,true) takes from 20 to 30 seconds.

I'll run a detailed walkthrough in the next pass.

EDIT:

Okay, more fine-grained stuff:

projectmanager.cpp:3101: project->BuildTree(m_pTree, m_TreeRoot, m_TreeVisualState, m_pFileGroups);

This takes most of the running time.

Something puzzled me, however. This builds the tree. Why is it then that at a later stage, in ProjectManager::SetProject (lines 482-483), the tree is rebuilt AGAIN? We just built it! :( (Oh, never mind. It's the workspace tree, not the project tree.)

Anyway, most of the delay is in project->BuildTree. I'll go into more detail tomorrow, it's getting late.
« Last Edit: October 17, 2012, 07:39:02 am by rickg22 »

Offline rickg22

Okay, I've been thinking....

If Life gives you lemons, don't make lemonade...  :P

Now, seriously. The following lines seem to make a lot of noise:

Code
        
        wxFileName nodefile = f->file;
        nodefile.MakeRelativeTo(m_CommonTopLevelPath);
        wxString nodetext = nodefile.GetFullPath();
        FileTreeData::FileTreeDataKind folders_kind = FileTreeData::ftdkFolder;

 (I haven't been able to profile them yet, how do you profile that part?)

If I recall correctly, wxFileName does depend on some Operating System stuff to do its calculation. So, here's an idea: Why do we have to recalculate all these relative paths over and over for the files belonging to the same directory?

Why not store the resulting relative filenames in a hash table? Even better, why not do the calculation first for all the directories, and THEN use the obtained values for the rest of the files? We can know if two files belong to the same directory, right? It would be a matter of assigning a numeric id to each directory, and ta-da!

(That is, if most of the time spent is really in the path calculation and not in the actual widget building)
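
A minimal sketch of the hash-table idea, not code from C::B. It assumes paths are plain strings; `RelativeDirCache` and the `makeRelativeDir` callback are hypothetical names, with the callback standing in for the expensive wxFileName call. The string directory is used as the key here instead of a numeric id.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical helper (not part of Code::Blocks): caches the relative form of
// each directory so the expensive conversion runs once per directory instead
// of once per file.
class RelativeDirCache
{
public:
    // makeRelativeDir stands in for the costly call. It receives an absolute
    // directory and returns it relative to the common top-level path
    // (an empty string for the top-level directory itself).
    explicit RelativeDirCache(std::function<std::string(const std::string&)> makeRelativeDir)
        : m_makeRelativeDir(std::move(makeRelativeDir)) {}

    // Returns the relative path for an absolute file path.
    std::string Relative(const std::string& absFile)
    {
        const std::size_t slash = absFile.find_last_of("/\\");
        const std::string dir  = (slash == std::string::npos) ? std::string() : absFile.substr(0, slash);
        const std::string name = (slash == std::string::npos) ? absFile : absFile.substr(slash + 1);

        auto it = m_cache.find(dir);
        if (it == m_cache.end())
        {
            // First file seen in this directory: pay for the conversion once.
            ++m_misses;
            it = m_cache.emplace(dir, m_makeRelativeDir(dir)).first;
        }
        return it->second.empty() ? name : it->second + "/" + name;
    }

    // Number of times the expensive callback was actually invoked.
    std::size_t Misses() const { return m_misses; }

private:
    std::function<std::string(const std::string&)> m_makeRelativeDir;
    std::unordered_map<std::string, std::string> m_cache;
    std::size_t m_misses = 0;
};
```

With this shape, the callback is invoked once per distinct directory, so `Misses()` stays at the number of directories no matter how many files each one holds.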


EDIT: Oh, I forgot to post the dll dependencies. Here they are.
Code
From        To          Syms Read   Shared Object Library
0x77b01000  0x77c65d1c  Yes (*)     C:\Windows\system32\ntdll.dll
0x75561000  0x7565bd58  Yes (*)     C:\Windows\syswow64\kernel32.dll
0x76ca1000  0x76ce6a18  Yes (*)     C:\Windows\syswow64\KernelBase.dll
0x76f81000  0x7702b2c4  Yes (*)     C:\Windows\syswow64\msvcrt.dll
0x75d11000  0x76959898  Yes (*)     C:\Windows\syswow64\shell32.dll
0x75831000  0x75886b60  Yes (*)     C:\Windows\syswow64\shlwapi.dll
0x75bb1000  0x75c2292c  Yes (*)     C:\Windows\syswow64\gdi32.dll
0x758c1000  0x759a4198  Yes (*)     C:\Windows\syswow64\user32.dll
0x75ae1000  0x75b7f04c  Yes (*)     C:\Windows\syswow64\advapi32.dll
0x76b21000  0x76b38ed8  Yes (*)     C:\Windows\SysWOW64\sechost.dll
0x77151000  0x77225e04  Yes (*)     C:\Windows\syswow64\rpcrt4.dll
0x751e1000  0x752221f0  Yes (*)     C:\Windows\syswow64\sspicli.dll
0x751d1000  0x751db474  Yes (*)     C:\Windows\syswow64\cryptbase.dll
0x75ba1000  0x75ba92f8  Yes (*)     C:\Windows\syswow64\lpk.dll
0x75c41000  0x75cdc9fc  Yes (*)     C:\Windows\syswow64\usp10.dll
0x751a1000  0x751a40f0  Yes (*)     C:\Windows\SysWOW64\shfolder.dll
0x6e941000  0x6e96373c  Yes (*)     C:\MinGW\bin\libgcc_s_dw2-1.dll
0x6fc41000  0x6fd3495c  Yes (*)     C:\MinGW\bin\libstdc++-6.dll
0x62701000  0x62ddc6c0  Yes         C:\projects\wxWidgets\wxMSW-2.8.12\lib\gcc_dll\wxmsw28u_gcc_custom.dll
0x72b31000  0x72ccd18c  Yes (*)     C:\Windows\WinSxS\x86_microsoft.windows.common-controls_6595b64144ccf1df_6.0.7601.17514_none_41e6975e2bd6f2b2\comctl32.dll
0x76b91000  0x76c0a39c  Yes (*)     C:\Windows\syswow64\comdlg32.dll
0x76cf1000  0x76e4b0cc  Yes (*)     C:\Windows\syswow64\ole32.dll
0x756b1000  0x7573e644  Yes (*)     C:\Windows\syswow64\oleaut32.dll
0x73a71000  0x73aa1264  Yes (*)     C:\Windows\SysWOW64\winmm.dll
0x736f1000  0x737407a0  Yes (*)     C:\Windows\SysWOW64\winspool.drv
0x73951000  0x73956108  Yes (*)     C:\Windows\SysWOW64\wsock32.dll
0x75671000  0x756a4784  Yes (*)     C:\Windows\syswow64\ws2_32.dll
0x77ad1000  0x77ad5058  Yes (*)     C:\Windows\syswow64\nsi.dll
0x00de1000  0x01336718  Yes         c:\projects\cb\src\devel\codeblocks.dll
0x6e441000  0x6e4f93c0  Yes         c:\projects\cb\src\devel\wxpropgrid.dll
0x75361000  0x753a1ce0  Yes (*)     C:\Windows\SysWOW64\imm32.dll
0x76eb1000  0x76f7bebc  Yes (*)     C:\Windows\syswow64\msctf.dll
0x73811000  0x73872a7c  Yes (*)     C:\Windows\SysWOW64\uxtheme.dll
0x6fe71000  0x6fe8299c  Yes (*)     C:\Windows\SysWOW64\dwmapi.dll
0x736d1000  0x736e57b2  Yes (*)     C:\Windows\SysWOW64\cryptsp.dll
0x73691000  0x736ca244  Yes (*)     C:\Windows\SysWOW64\rsaenh.dll
0x72811000  0x7281d7ac  Yes (*)     C:\Windows\SysWOW64\RpcRtRemote.dll
0x743d1000  0x74463f78  Yes (*)     C:\Windows\SysWOW64\msftedit.dll
0x6bd81000  0x6bd99ba0  Yes         c:\projects\cb\src\devel\share\codeblocks\plugins\abbreviations.dll
0x712c1000  0x71312cb4  Yes         c:\projects\cb\src\devel\share\codeblocks\plugins\astyle.dll
0x64381000  0x64395a68  Yes         c:\projects\cb\src\devel\share\codeblocks\plugins\autosave.dll
0x6af01000  0x6af1df1c  Yes         c:\projects\cb\src\devel\share\codeblocks\plugins\classwizard.dll
0x65e81000  0x65fa41f0  Yes         c:\projects\cb\src\devel\share\codeblocks\plugins\codecompletion.dll
0x64b01000  0x64c6ca48  Yes         c:\projects\cb\src\devel\share\codeblocks\plugins\compiler.dll
0x6d881000  0x6d8fcf18  Yes         c:\projects\cb\src\devel\share\codeblocks\plugins\debugger.dll
0x649c1000  0x649dc2a8  Yes         c:\projects\cb\src\devel\share\codeblocks\plugins\defaultmimehandler.dll
0x69041000  0x690515e4  Yes         c:\projects\cb\src\devel\share\codeblocks\plugins\openfileslist.dll
0x70501000  0x705335f8  Yes         c:\projects\cb\src\devel\share\codeblocks\plugins\projectsimporter.dll
0x63c01000  0x63c4a6e0  Yes         c:\projects\cb\src\devel\share\codeblocks\plugins\scriptedwizard.dll
0x6bac1000  0x6baecacc  Yes         c:\projects\cb\src\devel\share\codeblocks\plugins\todo.dll
0x62301000  0x6230d474  Yes         c:\projects\cb\src\devel\share\codeblocks\plugins\xpmanifest.dll
0x739f1000  0x73a3b464  Yes (*)     C:\Windows\SysWOW64\apphelp.dll
0x72b21000  0x72b2404c  Yes (*)     C:\Windows\SysWOW64\msimg32.dll
(*): Shared library is missing debugging information.
« Last Edit: October 17, 2012, 07:55:12 am by rickg22 »

Offline MortenMacFly

  • Administrator
  • Lives here!
  • *****
  • Posts: 9694
Quote
(I haven't been able to profile them yet, how do you profile that part?)
wxStopWatch with a DebugLog output, maybe?
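
A rough sketch of that kind of measurement, written with std::chrono so it compiles without wxWidgets. Inside C::B one would presumably use wxStopWatch (its Time() member returns elapsed milliseconds) and send the total to the debug log instead of printf. `CumulativeTimer` is a hypothetical name.

```cpp
#include <chrono>
#include <cstdio>

// Hypothetical accumulator: sums the time spent in one suspicious call across
// many loop iterations, so a single total can be logged at the end.
class CumulativeTimer
{
public:
    using Clock = std::chrono::steady_clock;

    // RAII guard: measures from its construction to its destruction and adds
    // the elapsed time to the owning timer.
    class Scope
    {
    public:
        explicit Scope(CumulativeTimer& owner) : m_owner(owner), m_start(Clock::now()) {}
        ~Scope()
        {
            m_owner.m_total += Clock::now() - m_start;
            ++m_owner.m_calls;
        }
        Scope(const Scope&) = delete;
        Scope& operator=(const Scope&) = delete;

    private:
        CumulativeTimer& m_owner;
        Clock::time_point m_start;
    };

    long long TotalMs() const
    {
        return std::chrono::duration_cast<std::chrono::milliseconds>(m_total).count();
    }

    unsigned long Calls() const { return m_calls; }

    // Stand-in for a debug-log call.
    void Report(const char* label) const
    {
        std::printf("%s: %lld ms over %lu calls\n", label, TotalMs(), m_calls);
    }

private:
    Clock::duration m_total{0};
    unsigned long m_calls = 0;
};
```

Wrapping only the suspected statement in a `CumulativeTimer::Scope` inside the file loop, and calling `Report()` after the loop, gives the cumulative share of that statement in the total.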

Offline Jenna

  • Administrator
  • Lives here!
  • *****
  • Posts: 7255
On linux it takes less than a second for a project with more than 7000 files (one of my wxWidgets test-projects).

But I noticed one thing, that can increase the loading time a lot:
At first run (when opening a project or workspace), we always call BuildTree() and RebuildTree() for the project.
RebuildTree() deletes the content of the tree and rebuilds it from scratch.
I think this should be avoided if possible.

I have no time to look into it at the moment, maybe later this week, if I am back home.

Offline dmoore

  • Developer
  • Lives here!
  • *****
  • Posts: 1576
Using wxStopWatch I can confirm that almost all of the opening time is spent on wxFileName::MakeRelativeTo. Rebuilding the tree control isn't all that expensive.

Offline rickg22

Quote
Using wxStopWatch I can confirm that almost all of the opening time is spent on wxFileName::MakeRelativeTo. Rebuilding the tree control isn't all that expensive.

Mind showing us the cumulative stats?

Anyway, I got an idea that might just work. Here's the pseudocode.

Code
1. wxString lastparentpath = "", curparentpath = "", lastrelativepath = "", currelativepath = "", currelativefilename = "", basefilename = "", DS = "/"; // DS is the directory separator, adapt to OS needs.
2. for each of the filenames to be processed as curfilename: {
    2.1. curparentpath = obtain_path_only(curfilename); // get the full path without the filename
    2.2. basefilename = obtain_filename_only(curfilename); // filename only, plus extension
    2.3. if(curparentpath == lastparentpath && lastparentpath != "") {
             // same directory as the previous file: reuse its relative path
             currelativepath = lastrelativepath;
             currelativefilename = currelativepath + DS + basefilename;
         }
    2.4. else {
             // first file of a new directory: do the expensive calculation once
             currelativefilename = MakeRelative(curfilename, commontopprojectdirectory);
             lastparentpath = curparentpath;
             lastrelativepath = obtain_path_only(currelativefilename);
         }
    2.5. Do the rest of the tree node adding here.
}

This way, MakeRelative is only called for the first file of a given directory. The rest of the relative filenames are just calculated using a concatenation.

As for the obtain_path_only and obtain_filename_only functions, they are easily implemented using a reverse searching for the "/" and "\" characters on each filename, and splitting the string in two at that position. In fact, we could use a single function that splits the filename into path and basename using only one search.
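
For illustration, the pseudocode and the single-search split could look roughly like this in plain C++. This is a sketch under assumptions: `SplitPath` and `RelativeNames` are hypothetical names, `makeRelative` stands in for the expensive call, and a flag replaces the empty-string check so files directly in the top-level directory are handled too.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Splits "dir/name.ext" into {"dir", "name.ext"} with a single reverse search
// that accepts both '/' and '\\'. A path without a separator has an empty dir.
std::pair<std::string, std::string> SplitPath(const std::string& path)
{
    const std::size_t pos = path.find_last_of("/\\");
    if (pos == std::string::npos)
        return {std::string(), path};
    return {path.substr(0, pos), path.substr(pos + 1)};
}

// makeRelative stands in for the expensive call; it is invoked only when the
// parent directory differs from the previous file's parent directory.
std::vector<std::string> RelativeNames(
    const std::vector<std::string>& files,
    const std::function<std::string(const std::string&)>& makeRelative)
{
    std::vector<std::string> result;
    result.reserve(files.size());

    std::string lastParent;
    std::string lastRelativeDir;
    bool haveLast = false; // becomes true after the first expensive call

    for (const std::string& file : files)
    {
        const auto parts = SplitPath(file);
        if (haveLast && parts.first == lastParent)
        {
            // Same directory as the previous file: plain concatenation.
            result.push_back(lastRelativeDir.empty()
                                 ? parts.second
                                 : lastRelativeDir + "/" + parts.second);
            continue;
        }
        // New directory: one expensive call, then remember its result.
        const std::string relative = makeRelative(file);
        lastParent = parts.first;
        lastRelativeDir = SplitPath(relative).first;
        haveLast = true;
        result.push_back(relative);
    }
    return result;
}
```

This only pays off when files of the same directory are adjacent in the list; if they are interleaved, a per-directory map would be needed instead of the "last directory" memo.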
« Last Edit: October 17, 2012, 04:53:01 pm by rickg22 »

Offline dmoore

@rick: The MakeRelativeTo calls were taking up >90% of the ProjectManager::RebuildTree (i.e. at least 1800 of about 2000 ms)

EDIT: Here's an example:
Code
Done loading project in 233ms
BuildTree
MakeRelativeTo time 1807 ms
Total time 1905 ms
BuildTree
MakeRelativeTo time 1793 ms
Total time 1890 ms


Not sure the more complicated patch is needed because ProjectFile already has a member relativeToCommonTopLevelPath. This is initialized when the project is first opened, but it does not use MakeRelativeTo. Instead it does:

Code
            pf->relativeToCommonTopLevelPath = fullFilename.Right(fullFilename.Length() - m_CommonTopLevelPath.Length());

Which I'm sure breaks in the corner case where files are split across volumes.
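
To make the corner case concrete, here is a small sketch with std::string in place of wxString. The function names are hypothetical, and the first function only mirrors the length arithmetic of the line above; it is not the C::B code itself.

```cpp
#include <string>

// Mirrors the Right(Length() - Length()) arithmetic: drop as many leading
// characters as the common top-level path has, without checking that the
// file name actually starts with that path.
std::string StripPrefixUnchecked(const std::string& fullFilename,
                                 const std::string& commonTopLevelPath)
{
    if (commonTopLevelPath.size() >= fullFilename.size())
        return std::string();
    return fullFilename.substr(commonTopLevelPath.size());
}

// Hypothetical guarded variant: only strip when the prefix really matches,
// otherwise keep the full path (for example a file on another volume).
std::string StripPrefixChecked(const std::string& fullFilename,
                               const std::string& commonTopLevelPath)
{
    if (fullFilename.compare(0, commonTopLevelPath.size(), commonTopLevelPath) == 0)
        return fullFilename.substr(commonTopLevelPath.size());
    return fullFilename;
}
```

With a top-level path of `C:\proj\`, the unchecked version turns `D:\libs\zlib\z.c` into `zlib\z.c`, which looks like a valid relative path but points nowhere; the checked version leaves the file name untouched.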

But anyway, if we use that member we have a simple patch:

Code
Index: C:/Users/damienm/Documents/damien/Source/codeblockssrc/trunk/src/sdk/cbproject.cpp
===================================================================
--- C:/Users/damienm/Documents/damien/Source/codeblockssrc/trunk/src/sdk/cbproject.cpp (revision 8456)
+++ C:/Users/damienm/Documents/damien/Source/codeblockssrc/trunk/src/sdk/cbproject.cpp (working copy)
@@ -907,9 +907,7 @@
         ftd->SetProjectFile(f);
         ftd->SetFolder(f->file.GetFullPath());
 
-        wxFileName nodefile = f->file;
-        nodefile.MakeRelativeTo(m_CommonTopLevelPath);
-        wxString nodetext = nodefile.GetFullPath();
+        wxString nodetext = f->relativeToCommonTopLevelPath;
         FileTreeData::FileTreeDataKind folders_kind = FileTreeData::ftdkFolder;
 
         // by default, the parent node is the project node (in case of no grouping, no virtual folders)

If we want a patch that is robust to files split across volumes it should be a fix for the way relativeToCommonTopLevelPath is initialized.

There is still a small delay from the project loader (200ms), but that may just be parsing the xml and checking that files exist and, thus, not much can be done about it.
« Last Edit: October 17, 2012, 07:29:47 pm by dmoore »

Offline Jenna

The commonTopLevelPath might have changed since initializing relativeToCommonTopLevelPath.
This can happen every time a new file is added.
One way to handle this would be to reset this variable every time the commonTopLevelPath has changed.
Another would be to use a function instead, which does exactly what happens when it is initialized.
This should not be too expensive (just a wxString.Right() ).
A different volume on windows could be handled here also, most likely.
This should be doable without wxFileName if fullFileName also keeps the volume of a file; if not, we might introduce another member that keeps it.
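
A sketch of the "function instead of a member" variant, assuming plain strings. `FileSketch` and `ProjectSketch` are hypothetical stand-ins for ProjectFile and cbProject, and the volume check assumes Windows-style drive letters.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for ProjectFile; only the full file name is modelled.
struct FileSketch
{
    std::string fullFilename;
};

// Hypothetical stand-in for cbProject.
class ProjectSketch
{
public:
    explicit ProjectSketch(std::string top) : m_commonTopLevelPath(std::move(top)) {}

    // Called whenever adding or removing a file moves the common top-level path.
    void SetCommonTopLevelPath(std::string top) { m_commonTopLevelPath = std::move(top); }

    // Computed on demand from the current top-level path, so it cannot go stale.
    std::string RelativeToCommonTopLevelPath(const FileSketch& f) const
    {
        const std::string& full = f.fullFilename;
        const std::string& top = m_commonTopLevelPath;
        // Cheap volume comparison first, then the actual prefix check.
        if (Volume(full) != Volume(top) || full.compare(0, top.size(), top) != 0)
            return full; // other volume, or not below the top-level path
        return full.substr(top.size());
    }

private:
    // "C:" for "C:\\x", empty for paths without a drive letter.
    static std::string Volume(const std::string& path)
    {
        return (path.size() >= 2 && path[1] == ':') ? path.substr(0, 2) : std::string();
    }

    std::string m_commonTopLevelPath;
};
```

The cost per call is one comparison and one substring, which is in line with the "just a wxString.Right()" estimate above.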

Offline dmoore

we've been here before  :P

http://forums.codeblocks.org/index.php/topic,6288.0.html

svn

But yes, I agree with you Jens that probably the best way to go is to replace that member with a function that does the calculation on the fly.
« Last Edit: October 17, 2012, 10:11:49 pm by dmoore »

Offline Jenna

@dmoore:
I attach a patch that combines your patch and a slightly enhanced patch of MortenMacFly (see http://forums.codeblocks.org/index.php/topic,16947.msg115392.html#msg115392).

Some measurements on linux with wxWidgets trunk (> 10.000 files):

without the patch:
Quote
/home/jens/codeblocks-build/codeblocks.git/src/sdk/cbproject.cpp::void cbProject::BuildTree(cbTreeCtrl*, const wxTreeItemId&, int, FilesGroupsAndMasks*):1008  took : 1406 ms
/home/jens/codeblocks-build/codeblocks.git/src/sdk/cbproject.cpp::void cbProject::BuildTree(cbTreeCtrl*, const wxTreeItemId&, int, FilesGroupsAndMasks*):1008  took : 1432 ms

CalculateCommonTopLevelPath() took 275 ms
iterating through all 10677 files took 1030 ms
iterating through all 10677 files took 1066 ms

with the patch:

Quote
/home/jens/codeblocks-build/codeblocks.git/src/sdk/cbproject.cpp::void cbProject::BuildTree(cbTreeCtrl*, const wxTreeItemId&, int, FilesGroupsAndMasks*):1002  took : 1123 ms
/home/jens/codeblocks-build/codeblocks.git/src/sdk/cbproject.cpp::void cbProject::BuildTree(cbTreeCtrl*, const wxTreeItemId&, int, FilesGroupsAndMasks*):1002  took : 1132 ms

CalculateCommonTopLevelPath() took 275 ms
iterating through all 10677 files took 753 ms
iterating through all 10677 files took 751 ms

Not much in absolute time, but about 25 to 30 %.

I will test it on windows soon, also with mixed volume projects.

Offline dmoore

Does this do anything to ensure relativeToCommonTopLevelPath is kept up to date? (Or is that already taken care of?)

Offline Jenna

Quote
Does this do anything to ensure relativeToCommonTopLevelPath is kept up to date? (Or is that already taken care of?)
I did not add any call to it, but if it is not kept up to date, it would be another bug.
Currently it's called in
  • cbProject::Open()
  • cbProject::AddFile()
  • ProjectFile::Rename()
  • ProjectManager::OnRemoveFileFromProject()

Offline dmoore

Quote
Not much in absolute time, but about 25 to 30 %.

I will test it on windows soon, also with mixed volume projects.

On windows without the volume stuff, I got an improvement from about 4000 to about 400 ms to open the C::B project. (Loading and the two BuildTree calls) Will test your patch tomorrow if I have time.
« Last Edit: October 18, 2012, 01:36:28 am by dmoore »

Offline rickg22

OK, I applied the patch (manually), and the project opens INSTANTLY! :D

Unfortunately, none of the files can be opened :P maybe I applied it wrong?  ???

EDIT: Oops, the extensions handler plugin was disabled :P

OK, confirmed for me. Congratulations, Jens! With your patch, loading time goes down from 22 seconds to ZERROW.

Apply the patch, Scotty! Er, I mean, Jens.

(Oh, and after this is patched, remember to do something about the double loading)
« Last Edit: October 18, 2012, 05:35:00 am by rickg22 »

Offline Jenna

To make it clear: the real solution comes from dmoore. MortenMacFly's patch fixes a problem when calculating the common top-level path with project files on different volumes; I just added a small part to skip files on different volumes when calculating the common top-level path.

The real speedup comes from dmoore's patch.

Now let's test it (especially on win with files spread over two and more volumes) and if all goes well it can be committed.
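
The attached patch is not reproduced in this thread, so the following is only a guess at what "skip files on different volumes" could look like, with plain strings. The function names are hypothetical, and using the first file's volume as the reference is an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// "C:" for "C:\\x", empty for paths without a drive letter.
std::string VolumeOf(const std::string& path)
{
    return (path.size() >= 2 && path[1] == ':') ? path.substr(0, 2) : std::string();
}

// Longest common directory prefix (with trailing separator) of all files on
// the same volume as the first file. Files on other volumes are skipped, so
// they cannot collapse the result to an empty path.
std::string CommonTopLevelPath(const std::vector<std::string>& files)
{
    if (files.empty())
        return std::string();

    const std::string volume = VolumeOf(files.front());

    // Start with the directory of the first file.
    const std::size_t firstSep = files.front().find_last_of("/\\");
    std::string common = (firstSep == std::string::npos)
                             ? std::string()
                             : files.front().substr(0, firstSep + 1);

    for (const std::string& f : files)
    {
        if (VolumeOf(f) != volume)
            continue; // different volume: ignore for the common path

        // Length of the shared character prefix.
        std::size_t n = 0;
        const std::size_t limit = std::min(common.size(), f.size());
        while (n < limit && common[n] == f[n])
            ++n;
        common.erase(n);

        // Cut back to the last separator so the result ends on a directory.
        const std::size_t sep = common.find_last_of("/\\");
        common.erase(sep == std::string::npos ? 0 : sep + 1);
    }
    return common;
}
```

Files that were skipped here would then keep their full path in the tree rather than a relative one, which is what needs testing on Windows with mixed-volume projects.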