Okay, I've been thinking....
If Life gives you lemons, don't make lemonade... :P
Now, seriously. The following lines seem to make a lot of noise:
wxFileName nodefile = f->file;
nodefile.MakeRelativeTo(m_CommonTopLevelPath);
wxString nodetext = nodefile.GetFullPath();
FileTreeData::FileTreeDataKind folders_kind = FileTreeData::ftdkFolder;
(I haven't been able to profile them yet, how do you profile that part?)
If I recall correctly, wxFileName does depend on some Operating System stuff to do its calculation. So, here's an idea: Why do we have to recalculate all these relative paths over and over for the files belonging to the same directory?
Why not storing the resulting relative filenames in a hash table? Even better, why not do the calculation first for all the directories, and THEN use the obtained values for the rest of the files? We can know if two files belong to the same directory, right? It would be matter of assigning a numeric id to each directory, and ta-da!
(This is, if most of the time spent is really in the path calculation and not in the actual widget building)
EDIT: Oh, I forgot to post the dll dependencies. Here they are.
From To Syms Read Shared Object Library
0x77b01000 0x77c65d1c Yes (*) C:\Windows\system32\ntdll.dll
0x75561000 0x7565bd58 Yes (*) C:\Windows\syswow64\kernel32.dll
0x76ca1000 0x76ce6a18 Yes (*) C:\Windows\syswow64\KernelBase.dll
0x76f81000 0x7702b2c4 Yes (*) C:\Windows\syswow64\msvcrt.dll
0x75d11000 0x76959898 Yes (*) C:\Windows\syswow64\shell32.dll
0x75831000 0x75886b60 Yes (*) C:\Windows\syswow64\shlwapi.dll
0x75bb1000 0x75c2292c Yes (*) C:\Windows\syswow64\gdi32.dll
0x758c1000 0x759a4198 Yes (*) C:\Windows\syswow64\user32.dll
0x75ae1000 0x75b7f04c Yes (*) C:\Windows\syswow64\advapi32.dll
0x76b21000 0x76b38ed8 Yes (*) C:\Windows\SysWOW64\sechost.dll
0x77151000 0x77225e04 Yes (*) C:\Windows\syswow64\rpcrt4.dll
0x751e1000 0x752221f0 Yes (*) C:\Windows\syswow64\sspicli.dll
0x751d1000 0x751db474 Yes (*) C:\Windows\syswow64\cryptbase.dll
0x75ba1000 0x75ba92f8 Yes (*) C:\Windows\syswow64\lpk.dll
0x75c41000 0x75cdc9fc Yes (*) C:\Windows\syswow64\usp10.dll
0x751a1000 0x751a40f0 Yes (*) C:\Windows\SysWOW64\shfolder.dll
0x6e941000 0x6e96373c Yes (*) C:\MinGW\bin\libgcc_s_dw2-1.dll
0x6fc41000 0x6fd3495c Yes (*) C:\MinGW\bin\libstdc++-6.dll
0x62701000 0x62ddc6c0 Yes C:\projects\wxWidgets\wxMSW-2.8.12\lib\gcc_dll\wxmsw28u_gcc_custom.dll
0x72b31000 0x72ccd18c Yes (*) C:\Windows\WinSxS\x86_microsoft.windows.common-controls_6595b64144ccf1df_6.0.7601.17514_none_41e6975e2bd6f2b2\comctl32.dll
0x76b91000 0x76c0a39c Yes (*) C:\Windows\syswow64\comdlg32.dll
0x76cf1000 0x76e4b0cc Yes (*) C:\Windows\syswow64\ole32.dll
0x756b1000 0x7573e644 Yes (*) C:\Windows\syswow64\oleaut32.dll
0x73a71000 0x73aa1264 Yes (*) C:\Windows\SysWOW64\winmm.dll
0x736f1000 0x737407a0 Yes (*) C:\Windows\SysWOW64\winspool.drv
0x73951000 0x73956108 Yes (*) C:\Windows\SysWOW64\wsock32.dll
0x75671000 0x756a4784 Yes (*) C:\Windows\syswow64\ws2_32.dll
0x77ad1000 0x77ad5058 Yes (*) C:\Windows\syswow64\nsi.dll
0x00de1000 0x01336718 Yes c:\projects\cb\src\devel\codeblocks.dll
0x6e441000 0x6e4f93c0 Yes c:\projects\cb\src\devel\wxpropgrid.dll
0x75361000 0x753a1ce0 Yes (*) C:\Windows\SysWOW64\imm32.dll
0x76eb1000 0x76f7bebc Yes (*) C:\Windows\syswow64\msctf.dll
0x73811000 0x73872a7c Yes (*) C:\Windows\SysWOW64\uxtheme.dll
0x6fe71000 0x6fe8299c Yes (*) C:\Windows\SysWOW64\dwmapi.dll
0x736d1000 0x736e57b2 Yes (*) C:\Windows\SysWOW64\cryptsp.dll
0x73691000 0x736ca244 Yes (*) C:\Windows\SysWOW64\rsaenh.dll
0x72811000 0x7281d7ac Yes (*) C:\Windows\SysWOW64\RpcRtRemote.dll
0x743d1000 0x74463f78 Yes (*) C:\Windows\SysWOW64\msftedit.dll
0x6bd81000 0x6bd99ba0 Yes c:\projects\cb\src\devel\share\codeblocks\plugins\abbreviations.dll
0x712c1000 0x71312cb4 Yes c:\projects\cb\src\devel\share\codeblocks\plugins\astyle.dll
0x64381000 0x64395a68 Yes c:\projects\cb\src\devel\share\codeblocks\plugins\autosave.dll
0x6af01000 0x6af1df1c Yes c:\projects\cb\src\devel\share\codeblocks\plugins\classwizard.dll
0x65e81000 0x65fa41f0 Yes c:\projects\cb\src\devel\share\codeblocks\plugins\codecompletion.dll
0x64b01000 0x64c6ca48 Yes c:\projects\cb\src\devel\share\codeblocks\plugins\compiler.dll
0x6d881000 0x6d8fcf18 Yes c:\projects\cb\src\devel\share\codeblocks\plugins\debugger.dll
0x649c1000 0x649dc2a8 Yes c:\projects\cb\src\devel\share\codeblocks\plugins\defaultmimehandler.dll
0x69041000 0x690515e4 Yes c:\projects\cb\src\devel\share\codeblocks\plugins\openfileslist.dll
0x70501000 0x705335f8 Yes c:\projects\cb\src\devel\share\codeblocks\plugins\projectsimporter.dll
0x63c01000 0x63c4a6e0 Yes c:\projects\cb\src\devel\share\codeblocks\plugins\scriptedwizard.dll
0x6bac1000 0x6baecacc Yes c:\projects\cb\src\devel\share\codeblocks\plugins\todo.dll
0x62301000 0x6230d474 Yes c:\projects\cb\src\devel\share\codeblocks\plugins\xpmanifest.dll
0x739f1000 0x73a3b464 Yes (*) C:\Windows\SysWOW64\apphelp.dll
0x72b21000 0x72b2404c Yes (*) C:\Windows\SysWOW64\msimg32.dll
(*): Shared library is missing debugging information.
Using wxStopWatch I can confirm that almost all of the opening time is spent on wxFileName::MakeRelativeTo. Rebuilding the tree control isn't all that expensive.
Mind showing us the cumulative stats?
Anyway, I got an idea that might just work. Here's the pseudocode.
1. wxString lastparentpath = "", curparentpath = "", lastrelativepath = "", currelativepath = "", currelativefilename = "", DS = "/"; // DS is the directory separator, adapt to OS needs.
2. for each of the filenames to be processed as curfilename: {
2.1. curparentpath = obtain_path_only(curfilename); // get the full path without the filename
2.2. basefilename = obtain_filename_only(curfilename); // filename only, plus extension
2.3. if(curparentpath == lastparentpath && lastparentpath !== "") {
currelativepath = lastrelativepath;
currelativefilename = currelativepath + DS + basefilename;
}
2.4. else {
currelativefilename = MakeRelative(filename,commontopprojectdirectory);
lastparentpath = curparentpath;
lastrelativepath = obtain_path_only(currelativefilename);
}
2.5. Do the rest of the tree node adding here.
}
This way, MakeRelative is only called for the first file of a given directory. The rest of the relative filenames are just calculated using a concatenation.
As for the obtain_path_only and obtain_filename_only functions, they are easily implemented using a reverse searching for the "/" and "\" characters on each filename, and splitting the string in two at that position. In fact, we could use a single function that splits the filename into path and basename using only one search.
@rick: The MakeRelativeTo calls were taking up >90% of the ProjectManager::RebuildTree (i.e. at least 1800 of about 2000 ms)
EDIT: Here's an example:
Done loading project in 233ms
BuildTree
MakeRelativeTo time 1807 ms
Total time 1905 ms
BuildTree
MakeRelativeTo time 1793 ms
Total time 1890 ms
Not sure the more complicated patch is needed because ProjectFile already has a member relativeToCommonTopLevelPath. This is initialized when the project is first opened, but it does not use MakeRelativeTo. Instead it does:
pf->relativeToCommonTopLevelPath = fullFilename.Right(fullFilename.Length() - m_CommonTopLevelPath.Length());
Which I'm sure breaks in the corner case where files are split across volumes.
But anyway, if we use that member we have a simple patch:
Index: C:/Users/damienm/Documents/damien/Source/codeblockssrc/trunk/src/sdk/cbproject.cpp
===================================================================
--- C:/Users/damienm/Documents/damien/Source/codeblockssrc/trunk/src/sdk/cbproject.cpp (revision 8456)
+++ C:/Users/damienm/Documents/damien/Source/codeblockssrc/trunk/src/sdk/cbproject.cpp (working copy)
@@ -907,9 +907,7 @@
ftd->SetProjectFile(f);
ftd->SetFolder(f->file.GetFullPath());
- wxFileName nodefile = f->file;
- nodefile.MakeRelativeTo(m_CommonTopLevelPath);
- wxString nodetext = nodefile.GetFullPath();
+ wxString nodetext = f->relativeToCommonTopLevelPath;
FileTreeData::FileTreeDataKind folders_kind = FileTreeData::ftdkFolder;
// by default, the parent node is the project node (in case of no grouping, no virtual folders)
If we want a patch that is robust to files split across volumes it should be a fix for the way relativeToCommonTopLevelPath is initialized.
There is still a small delay from the project loader (200ms), but that may just be parsing the xml and checking that files exist and, thus, not much can be done about.
Now let's test it (especially on win with files spread over two and more volumes)
I can do it later on. What needs to be tested:
- project / project layout file saving / loading with files on same and different volumes
- project / project layout file saving / loading with files on UNC path's
- especially saving project files below and above a certain source folder
- compilation process and UI behaviour where this path's are used
...anything else?
I haven't tested everything, but I made up a one file ConsoleProject, added another file on another volume to a new target then compiled and ran that target successfully.
-------------- Build: ExternalFileProject in ConsoleProject (compiler: GNU GCC Compiler)---------------
mingw32-g++.exe -Wall -fexceptions -O2 -c t:\CProjects\ConsoleProject\main.cpp -o obj\Release\t\CProjects\ConsoleProject\main.o
mingw32-g++.exe -o bin\Release\ConsoleProject.exe obj\Release\t\CProjects\ConsoleProject\main.o -s
Output size is 478.50 KB
Process terminated with status 0 (0 minutes, 2 seconds)
0 errors, 0 warnings (0 minutes, 2 seconds)
Also did the same for adding a file just one directory above the project and no problems there either.
No problems loading/saving the project. The project tree shows the file on a separate volume nested appropriately under Sources/t/CProjects/ConsoleProject/main.cpp, and the different project tree options work as expected. For the file one directory above, the basedir of the tree adjusts appropriately.
So far so good.
...here it comes, the final patch that works for all test cases (including different project tree viewer setups, btw):
The bottleneck: This fast method to obtain the path of a project file relative to the common top level path:
// f->relativeToCommonTopLevelPath = fileName.Right(fileName.Length() - m_CommonTopLevelPath.Length());
...does not work. I've commented why it does not work an use now also a wxFileName call. This is more expensive, but safe. I would love to see time measuring again, if it doesn't cost too much.
But I believe with this patch, we finally solved the actual issue with UNC, relative and path's on different drives all together once and for all.
Ps.: If you use a good diff-viewer, you can disable white space changes, as there are some not related to the actual fix. Sorry for that, I "a-style"d some portions for better understanding of what's going on.
+ if ( (!prjHasUNCName && fileHasUNCName)
+ || ( prjHasUNCName && !fileHasUNCName) )
Maybe I'm a little pedantic here, but couldn't that be written as...
if(prjHasUNCName != fileHasUNCName)
?
Anyway, here's the core of your patch:
@@ -358,12 +380,37 @@
for (FilesList::iterator it = m_Files.begin(); it != m_Files.end(); ++it)
{
ProjectFile* f = (*it);
-
if (!f)
continue;
wxString fileName = f->file.GetFullPath();
- f->relativeToCommonTopLevelPath = fileName.Right(fileName.Length() - m_CommonTopLevelPath.Length());
+ fileHasUNCName = fileName.StartsWith(_T("\\\\"));
+
+ if ( (prjHasUNCName && fileHasUNCName)
+ || ( !prjHasUNCName
+ && !fileHasUNCName
+ && vol.IsSameAs(f->file.GetVolume()) ) )
+ {
+ wxFileName relFileCTLP(f->file);
+ relFileCTLP.MakeRelativeTo( m_CommonTopLevelPath );
+ wxFileName relFileBase(f->file);
+ relFileBase.MakeRelativeTo( GetBasePath() );
+
+ // The commented (old) method to obtain the relativeToCommonTopLevelPath is fast, but does *not* work, if you save
+ // the project on a different drive in a sub-folder of an existing source file on that (different) drive:
+ // I.e.: Project on C:\Folder\Project.cbp has file C:\Folder\SubFolder\foo.cpp and D:\Folder\bar.cpp
+ // Saved the project under D:\Folder\SubFolder\ProjectNew.cbp would cause a wrong computation of bar.cpp otherwise!!!
+// f->relativeToCommonTopLevelPath = fileName.Right(fileName.Length() - m_CommonTopLevelPath.Length());
+ // Using wxFileName instead, although its costly:
+ f->relativeToCommonTopLevelPath = relFileCTLP.GetFullPath();
+ f->relativeFilename = relFileBase.GetFullPath();
+ }
+ else
+ {
+ f->relativeToCommonTopLevelPath = fileName;
+ f->relativeFilename = fileName;
+ }
+
It looks pretty good to me; It uses wxFilename only when necessary. Anyway, I'm thinking that these two lines
+ f->relativeToCommonTopLevelPath = relFileCTLP.GetFullPath();
+ f->relativeFilename = relFileBase.GetFullPath();
could be optimized further by adding a path "cache" and comparing the file's relative pathname (the path without the filename) with the last file's pathname, and using the last calculated relativeToCommonTopLevelPath and relativeFilename instead of just calling wxFilename right away. This was my original proposal, btw, but all the other optimizations submitted here seem pretty rad :)
EDIT: OH, WAIT A MINUTE!!! I re-read that patch again...
+ if ( (prjHasUNCName && fileHasUNCName)
+ || ( !prjHasUNCName
+ && !fileHasUNCName
+ && vol.IsSameAs(f->file.GetVolume()) ) )
+ {
This is the precondition for the fast calculation, right? And then you're just undoing all the optimizations here! If the only failing case happens when the old project's drive is different than the current one, why not add a condition that specifically says so?
if(prjHasChangedDrives) { // prjHasChangedDrives is the real issue here, isn't it?
// If it has either changed windows drives or UNC network drives, we should use the full method;
// Otherwise, the fast one.
// Calculating prjHasChangedDrives should be trivial, anyway.
wxFileName relFileCTLP(f->file);
relFileCTLP.MakeRelativeTo( m_CommonTopLevelPath );
wxFileName relFileBase(f->file);
relFileBase.MakeRelativeTo( GetBasePath() );
f->relativeToCommonTopLevelPath = relFileCTLP.GetFullPath();
f->relativeFilename = relFileBase.GetFullPath();
} else {
f->relativeToCommonTopLevelPath = fileName.Right(fileName.Length() - m_CommonTopLevelPath.Length());
}
Another possibility:
int fn_length = fileName.Length();
int prj_length = m_CommonTopLevelPath.Length();
bool is_file_above_project = fn_length <= m_CommonTopLevelPath.Length;
// This is an approximation; The correct approach would be not to compare the filenames, but the pathnames without filenames.
if(is_file_above_project) { // use wxFilename here because the file is on a directory above the project
wxFileName relFileCTLP(f->file);
relFileCTLP.MakeRelativeTo( m_CommonTopLevelPath );
wxFileName relFileBase(f->file);
relFileBase.MakeRelativeTo( GetBasePath() );
f->relativeToCommonTopLevelPath = relFileCTLP.GetFullPath();
f->relativeFilename = relFileBase.GetFullPath();
} else {
// Also, shouldn't we compare the part of the path we're removing?
f->relativeToCommonTopLevelPath = fileName.Right(fileName.Length() - m_CommonTopLevelPath.Length());
}
Approach 2:
int len_diff = fileName.Length() - m_CommonTopLevelPath.Length();
bool path_mismatch = (len_diff < 0);
if(!path_mismatch) {
// If there's a wxString function that compares from right to left, we should use it here.
path_mismatch = fileName.Left(m_CommonTopLevelPath.Length()) != m_CommonTopLevelPath;
// NOTE: I'm assuming that m_CommonTopLevelPath ends in a directory separator; otherwise,
// we could have problems if m_CommonTopLevelPath is something like "MyDirectory"
// and the file is something like "MyDirectoryFilename".
}
if(path_mismatch) {
wxFileName relFileCTLP(f->file);
relFileCTLP.MakeRelativeTo( m_CommonTopLevelPath );
wxFileName relFileBase(f->file);
relFileBase.MakeRelativeTo( GetBasePath() );
f->relativeToCommonTopLevelPath = relFileCTLP.GetFullPath();
f->relativeFilename = relFileBase.GetFullPath();
} else {
f->relativeToCommonTopLevelPath = fileName.Right(fileName.Length() - m_CommonTopLevelPath.Length());
}
GOOD NEWS EVERYONE!
I'm very close to fixing the bug!
I made a relative path cache: If a file is in an already-tested directory, its relative path is calculated based on its directory's relative path. When a file is not found in the cache, its paths are calculated using MortenMacFly's slow-but-sure path calculation via wxFileName.
Here are the preliminary results:
WITHOUT THE PATH CACHE:
C:\projects\cb\src\sdk\cbproject.cpp::void cbProject::CalculateCommonTopLevelPath():394 took : 50 ms
Project's common toplevel path: C:\htdocs\SIG\SGPG\
C:\projects\cb\src\sdk\cbproject.cpp::void cbProject::CalculateCommonTopLevelPath():460 took : 13496 ms
C:\projects\cb\src\sdk\cbproject.cpp::void cbProject::CalculateCommonTopLevelPath():461: Num Files: 1927; Num Cache Misses: 1927
WITH THE PATH CACHE:
C:\projects\cb\src\sdk\cbproject.cpp::void cbProject::CalculateCommonTopLevelPath():404 took : 55 ms
Project's common toplevel path: C:\htdocs\SIG\SGPG\
C:\projects\cb\src\sdk\cbproject.cpp::void cbProject::CalculateCommonTopLevelPath():464 took : 2437 ms
C:\projects\cb\src\sdk\cbproject.cpp::void cbProject::CalculateCommonTopLevelPath():465: Num Files: 1927; Num Cache Misses: 318
Now, since the files in the project are not in order (because we're using a HashMap and not an ordered map), many directories still trigger cache misses. Even so, execution time has disminished by 80%.
I was able to diminish the load time by a further 2% by aditionally checking the parent directory in the cache; Then again, the randomized file order doesn't help. So I'm rewriting the algorithm by using a cleaned-up recursive version of the cache so that we'll have to use wxFileName's functions only twice: Once for the working directory, and once for the project's common top level path; the rest will be mere string comparisons and concatenations. I'll keep you updated when I get the algorithm working.
UPDATE:
I finished polishing the Path Cache. There are some edge cases I haven't tested, mainly at root directories or the top directories of other drives, both for filenames and for the project file.
For standard projects with all files under the project file's directory, things work Okay.
So... ladies and gentlemen... drum rolls, please...
Loading project files...
1927 files loaded
Done loading project in 2042ms
Project's base path: C:\htdocs\SIG\SGPG\
C:\projects\cb\src\sdk\cbproject.cpp::void cbProject::CalculateCommonTopLevelPath():530 took : 50 ms
Project's common toplevel path: C:\htdocs\SIG\SGPG\
C:\projects\cb\src\sdk\cbproject.cpp::void cbProject::CalculateCommonTopLevelPath():570 took : 120 ms
C:\projects\cb\src\sdk\cbproject.cpp::void cbProject::CalculateCommonTopLevelPath():571: Num Files: 1927; Num Cache Misses: 0
Ta-da! 8)
I'm attaching the patch. It needs some styling (you know, braces, tabs, etc), but I hope you like it.
Attached is an untested refinement to Morten's patch that uses the old fast method in the more normal case where files are below the project dir, and uses MakeRelativeTo otherwise. This could be combined with Rick's patch to reduce the amount of cache usage.
Here's the changed code:
if ( (prjHasUNCName && fileHasUNCName)
|| ( !prjHasUNCName
&& !fileHasUNCName
&& vol.IsSameAs(f->file.GetVolume()) ) )
{
if (!fileName.StartsWith(m_CommonTopLevelPath))
{
wxFileName relFileCTLP(f->file);
relFileCTLP.MakeRelativeTo( m_CommonTopLevelPath );
f->relativeToCommonTopLevelPath = relFileCTLP.GetFullPath();
}
else
f->relativeToCommonTopLevelPath = fileName.Right(fileName.Length() - m_CommonTopLevelPath.Length());
if (!fileName.StartsWith(GetBasePath()))
{
wxFileName relFileBase(f->file);
relFileBase.MakeRelativeTo( GetBasePath() );
f->relativeFilename = relFileBase.GetFullPath();
}
else
f->relativeFilename = fileName.Right(fileName.Length() - GetBasePath().Length());
}
I am a little bit confused by this:
+ // The commented (old) method to obtain the relativeToCommonTopLevelPath is fast, but does *not* work, if you save
+ // the project on a different drive in a sub-folder of an existing source file on that (different) drive:
+ // I.e.: Project on C:\Folder\Project.cbp has file C:\Folder\SubFolder\foo.cpp and D:\Folder\bar.cpp
+ // Saved the project under D:\Folder\SubFolder\ProjectNew.cbp would cause a wrong computation of bar.cpp otherwise!!
In this case should D:\Folder be the new common top level path? I'm not sure why the old method doesn't work unless the common top level path isn't being calculated correctly. Maybe the problem is really that f->relativeFilename isn't being calculated correctly?
OK guys, I found another bottleneck in project loading!
ProjectFile* cbProject::AddFile(int targetIndex, const wxString& filename, bool compile, bool link, unsigned short int /*weight*/)
{
...
if (!wxFileExists(fullFilename))
pf->SetFileState(fvsMissing);
else if (!wxFile::Access(fullFilename.c_str(), wxFile::write)) // readonly
pf->SetFileState(fvsReadOnly);
In other words, C::B checks for the existence of ALL the project files on load! Just to set the little "broken" or readonly flags on the project tree. Imagine doing this for a ten-thousand-files java project where you only got to modify the configuration -_-.
I changed it to this:
namespace cbProjectAuxVariables {
bool check_for_existence_of_files = true;
};
...
ProjectFile* cbProject::AddFile(int targetIndex, const wxString& filename, bool compile, bool link, unsigned short int /*weight*/)
{
...
// Check for the file's status if the User chose to in the Environment settings dialog;
// or if the file is being added manually and not as part of the project loading.
if(!m_CurrentlyLoading || cbProjectAuxVariables::check_for_existence_of_files) {
if (!wxFileExists(fullFilename))
pf->SetFileState(fvsMissing);
else if (!wxFile::Access(fullFilename.c_str(), wxFile::write)) // readonly
pf->SetFileState(fvsReadOnly);
}
cbProjectAuxVariables::check_for_existence_of_files is modified on project load by checking a new Environment setting:
<object class="sizeritem">
<object class="wxCheckBox" name="chkMissingFilesOnLoad">
<label>Check for missing files on project load</label>
<tooltip>When checked, Code::Blocks will check that all the project files exist,
and update their status on the Project Tree.
Useful at times, but slower for very large projects.</tooltip>
<checked>1</checked>
</object>
<flag>wxTOP|wxLEFT|wxEXPAND|wxALIGN_LEFT|wxALIGN_TOP</flag>
<border>8</border>
</object>
And here's the result, with my previous patch and the file checking on load disabled:
1927 files loaded
Done loading project in 365ms
Project's base path: C:\htdocs\SIG\SGPG\
C:\projects\cb\src\sdk\cbproject.cpp::void cbProject::CalculateCommonTopLevelPath():539 took : 50 ms
Project's common toplevel path: C:\htdocs\SIG\SGPG\
C:\projects\cb\src\sdk\cbproject.cpp::void cbProject::CalculateCommonTopLevelPath():579 took : 130 ms
C:\projects\cb\src\sdk\cbproject.cpp::void cbProject::CalculateCommonTopLevelPath():580: Num Files: 1927; Num Cache Misses: 0
(Total project loading time: 2.50s) 8) ;D 8) ;D 8)
I'm attaching the full patch, including my previous changes. Enjoy.
[attachment deleted by admin]
Sorry, wasn't clear. What I meant was whenever you have a cache miss, instead of using MakeRelativeTo in all cases, use the faster method when possible.
Ah, my apologies. I forgot to explain how my Cache works, and what I consider a "cache miss".
It's not really a full path cache; it's a PARENT PATH Cache. And it's recursive :D
Let's say that your filename has a full path + name like this:
C:\myProjects\myProject1\src\plugins\myplugin\includes\myheader.h
What my cache does is searching first for the parent path, and if it doesn't find it, go to its own parent path, and so on, like this:
C:\myProjects\myProject1\src\plugins\myplugin\includes
C:\myProjects\myProject1\src\plugins\myplugin
C:\myProjects\myProject1\src\plugins
C:\myProjects\myProject1\src
C:\myProjects\myProject1
At that point, C:\myProjects\myProject1 is actually our project common top level path. So it's found (which is not considered a cache miss, because we can construct the relative path without resorting to wxFilename's functions). Then, as the cache returns from its call stack, adds the items in the reverse order in which they were searched:
C:\myProjects\myProject1\src => src
C:\myProjects\myProject1\src\plugins => src\plugins
C:\myProjects\myProject1\src\plugins\myplugin => src\plugins\myplugin
C:\myProjects\myProject1\src\plugins\myplugin\includes => src\plugins\myplugin\includes
C:\myProjects\myProject1\src\plugins\myplugin\includes\myheader.h => src\plugins\myplugin\includes\myheader.h (not added to the cache, just calculated)
The actual filename, C:\myProjects\myProject1\src\plugins\myplugin\includes\myheader.h is never added to the cache, because it will only be requested ONCE. However,
C:\myProjects\myProject1\src\plugins\myplugin\includes\myheader2.h
C:\myProjects\myProject1\src\plugins\myplugin\includes\myheader3.h
C:\myProjects\myProject1\src\plugins\myplugin\includes\myheader4.h
WILL be requested. Their common denominator is their path, which is already cached. And so are their parent directories.
This means that at most, ONE cache miss will be present: The project's common top level path itself, but that's already added by default! 8)
The real cache misses will be files outside the project's common top level path, which require using wxFilename anyway, so they're not added to the cache. BUT I might add them, I just wasn't sure if adding them would trigger some misbehavior in the cache...
I just realized something.
With minor modifications to my cache's constructor, we can add all the parent paths from the project's CTLP to the root directory, and we will cover ALL cases!
C:\projects\myProject1 => ''
C:\projects => '..'
C:\ => '..\..'
which means that
C:\somepathoutside\someotherpath\whatever\myexternalfile.cpp => ..\..\somepathoutside\someotherpath\whatever\myexternalfile.cpp
At this point, using wxFilename won't be necessary because it's already certain that the file we're searching for and our project reside in the same unit (that's the big "if" that was added in the latest SVN, for cases with different units, the absolute paths are added verbatim). And since they're in the same unit, we have no cache misses AT ALL! For ANY FILE, either inside or outside our CTLP.
(The only caveat is if we have a symlink circular reference around... I'd need to think about how to solve that)
I've been learning about test-driven development recently.... I think we can effectively build a test suite together:
a) Get a dev with *nix to open a C::B project.
b) Add a debugging to print out the result of the conversion, using exclusively the wxFileName:
i.e.
Relative Common Top Level Path: _______
Base Path: ___________
List of paths and filenames:
/somepath/someotherpath/myfilename.ext
/somepath/someotherpath/anotherfilename.ext
...
'/somepath/someotherpath/myfilename.ext (CTLP)' => 'someotherpath/myfilename.ext'
'/somepath/someotherpath (CTLP)' => 'someotherpath'
'/somepath/someotherpath/myfilename.ext (Base Path)' => 'myfilename.ext'
'/athirdpath/myfilename.ext (CTLP)' => '../athirdpath/myfilename.ext'
'/athirdpath' => '../athirdpath'
And so on. Then, we can replicate the wxFilename's conversion for said path by simply storing the results in a string map.
c) Refactor our code so that it will be able to be covered by unit tests.
// This will be our interface for invoking wxFilename::MakeRelativeTo();
// also our interface for our wxFakeFilename so that we can pass it to our Cached Make Relative function.
class cbFilenameConverterInterface {
abstract void setBasePath(const wxString& basePath);
abstract wxString MakeRelative(const wxString& path1, const wxString& path2);
virtual void setFallBack(cbFilenameConverterInterface* fallback) {} // For cache misses
virtual void setPathSeparator(wxChar sep) {}
}
// Instead of directly calling wxFilename::MakeRelative, we'll use this.
class cbSlowButSureMakeRelative : public cbFilenameConverterInterface {
void setBasePath(const wxString& basePath) {} // wxFilename doesn't work this way
void wxString MakeRelative(const wxString& path1, const wxString& path2) {
wxFilename filename(path1);
return filename.makeRelativeTo(path2); // Or whatever
}
}
class MyPathCache: public cbFilenameConverterInterface {
...
// TODO: I need to modify my Path Cache code to adapt to this interface.
}
d) Add our mock converter for test cases.
class cbMockFilename: public cbFilenameConverterInterface {
public:
void setBasePath(const wxString& basePath);
wxString MakeRelative(const wxString& path1, const wxString& path2);
// Use this to insert all the results obtained via wxFilename.
// I need to modify my path cache to output (in debug) all the times wxFilename is invoked so that we can later insert them manually.
void insertTestCase(const wxString& basePath, const wxString& path1, const wxString& path2, const wxString& result);
void clear();
protected:
wxString getTestCase(const wxString& basePath, const wxString& path1, const wxString& path2);
wxString m_BasePath;
bool m_CaseNotFound; // set or Reset everytime getTestCase is invoked.
// TODO: Insert here storage objects for our test cases
}
wxString cbMockFilename::MakeRelative(const wxString& path1, const wxString& path2) {
wxString result = this->getTestCase(this->m_BasePath, path1,path2);
if(this->m_CaseNotFound { // We should have a list of various basepaths in our test suit
// ALL test cases should have their correct results covered! (We can't test a function when we don't know what it's supposed to return)
debug("Error! '%s' -> '%s' are not covered by our test case!", path1, path2);
return wxString(_T('*ERROR*'));
} else {
return result;
}
}
e) We have everything ready to test!
(pseudocode)
void MakeRelativeUnitTest(const wxString basePath, vector<wxString>ourFilenames,map<wxString, wxString>ourExpectedResults, const wxChar sep, wxFilenameConverterInterface* func) {
bool testPassed = true;
cbFilenameConverterInterface* ourExperimentalObject = new MyPathCache();
cbFilenameConverterInterface* ourMockObject = new cbMockFilename();
ourExperimentalObject->setBasePath(basePath);
ourExperimentalObject->setPathSeparator(sep);
ourMockObject->setBasePath(basePath);
ourMockObject->setPathSeparator(sep);
ourExperimentalObject->setFallBack(ourMockObject); // Needed for files outside the CTLP.
wxString actualResult, expectedResult;
for(it = ourFilenames.begin(); it != ourFilenames.end(); ++it) {
expectedResult = cbMockFilename(*it);
actualResult = ourExperimentalClass->makeRelative(*it);
if(actualResult != expectedResult) {
debug("Test failed for: '%s'! Expected result: '%s'; Actual result: '%s',*itexpectedResult,actualResult);
testPassed = false;
break;
}
}
if(testPassed) {
debug("Test passed! Our Experimental Object is safe to use.");
}
}
IMHO, we should have done this years ago. So that the next time we get a rare bug, all we have to do is add it to our test cases; this way we can catch regression bugs much sooner.