Author Topic: small refactoring in globals.cpp (Read 18576 times)

frithjofh · « **Reply #15 on:** February 05, 2016, 11:45:35 am »

@obfuscated

the change to StartsWith() was merely visual. i tried both versions and the StartsWith() seemed to me to result in a cleaner image to the eye, making it easier and faster to grasp that there are three separate conditions...

as I said, only a personal impression... blame my eyesight

frithjofh · « **Reply #16 on:** February 05, 2016, 12:51:20 pm »

please revise and test attached corrected patch...

MortenMacFly · « **Reply #17 on:** February 05, 2016, 01:42:54 pm »

Quote from: frithjofh on February 05, 2016, 12:51:20 pm

please revise and test attached corrected patch...

Sorry, but for me, this patch does not apply on SVN head, unfortunately.

frithjofh · « **Reply #18 on:** February 05, 2016, 01:48:33 pm »

i will correct it...

oBFusCATed · « **Reply #19 on:** February 05, 2016, 09:00:33 pm »

Quote from: frithjofh on February 05, 2016, 11:41:39 am

doing three separate iterations on the array results in the code running much faster due to more cache locality.

The code runs faster because you've removed the calls to wxString::Remove.
This call is removing characters from the beginning of the string and copying the rest of the string to the front.
So the improvement is not because of better cache locality, but because there is less copying.

Quote from: frithjofh on February 05, 2016, 11:41:39 am

for instance, the original version repeated the test for the value of trimspaces in each iteration. now it is done only once.

Probably the compiler has re-written the loop and has extracted the test outside of it.
I'm sure that if you remove the check from the original version it won't change its performance!

Quote from: frithjofh on February 05, 2016, 11:41:39 am

same for the third iteration of removing empty entries.

Why do you need to add elements and then remove them?

Quote from: frithjofh on February 05, 2016, 11:41:39 am

the part of the extra whole copy I don't understand. do you mean in the return statement?

The return statement is doing a redundant copy.

p.p. this post is made before I've looked at the latest patch.

frithjofh · « **Reply #20 on:** February 05, 2016, 09:25:27 pm »

here the corrected patch. search and destroy please...

@obfuscated:

Quote

The code runs faster because you've removed the calls to wxString::Remove.
This call is removing characters from the beginning of the string and copying the rest of the string to the front.
So the improvement is not because of better cache locality, but because there is less copying.

agree about the removing of wxString::Remove. the most gain lies here in avoiding the copying and also in the avoiding of modification of the wxString in place. building the output piece by piece is faster. as to cache locality: I was just making an assumption, nothing more. I am no professional programmer, I did not measure cache misses.

Quote

Probably the compiler has re-written the loop and has extracted the test outside of it.
I'm sure that if you remove the check from the original version it won't change its performance!

I wouldn't know, as I did not measure this particular part of the code before and after. perhaps the compiler does this optimization on its own, perhaps not, I didn't look at the generated machine code. if there is a performance gain here, i think it will stand back behind the gain from avoiding the copying above anyway. by lengths. I find the new version easier to read though and easier to reason about.

Quote

Why do you need to add elements and then remove them?

because as above, I think it makes it a lot easier to read and reason about. and strictly spoken, the elements are not removed from the container locale to the function, just copied around (and in some future only moved around). the return value is an unnamed object constructed directly from the resulting new range of valid entries.

Quote

The return statement is doing a redundant copy.

correct, that is what some former version of the patch did. I took up the hint given and changed it accordingly.

@all:

thx for all the input. thx for not ignoring my efforts all together. apologies for my beginner mistakes.

please give me more bashing, so I will eventually learn and improve ...

oBFusCATed · « **Reply #21 on:** February 06, 2016, 04:47:00 pm »

I've done some benchmarking and profiling.
orig - current version in codeblocks
forum - your version + a fix when find returns npos
mine - obviously this is my implementation

The results printed by google's benchmark lib:

Code

> bin/release/wx_routine_bench
Success: 13 tests passed.
Test time: 0.00 seconds.
Run on (4 X 3599.88 MHz CPU s)
2016-02-06 17:15:06
Benchmark                    Time(ns)    CPU(ns) Iterations
-----------------------------------------------------------
BM_orig/1                         742        741     958904
BM_orig/8                        7456       7457      93333
BM_orig/64                      85563      85504       8140
BM_orig/512                   2997799    3000000        233
BM_orig/4k                  173072815  173000000          4
BM_orig/32k                28505832434 28504000000          1
BM_orig/64k                139696512699 139686000000          1
BM_forum/1                        774        774     886076
BM_forum/8                       5031       5029     140000
BM_forum/64                     33001      32993      21277
BM_forum/512                   256982     256831       2745
BM_forum/4k                   2011854    2014368        348
BM_forum/32k                 19614743   19611111         36
BM_forum/64k                 43754816   43764706         17
BM_mine/1                         246        246    2834008
BM_mine/8                        1321       1320     530303
BM_mine/64                       9476       9481      74468
BM_mine/512                     69313      69295      10145
BM_mine/4k                     537183     536715       1321
BM_mine/32k                   6133614    6135135        111
BM_mine/64k                  12912220   12907407         54
BM_mine_vector/1                  217        217    3225806
BM_mine_vector/8                 1447       1445     479452
BM_mine_vector/64                9060       9064      77778
BM_mine_vector/512              68974      68972      10294
BM_mine_vector/4k              539424     538986       1321
BM_mine_vector/32k            6113907    6115044        113
BM_mine_vector/64k           13649380   13666667         51
BM_no_trim_orig/1                 383        383    1861702
BM_no_trim_orig/8                4165       4166     168269
BM_no_trim_orig/64              61783      61825      11290
BM_no_trim_orig/512           2833573    2829960        247
BM_no_trim_orig/4k          167948902  168000000          4
BM_no_trim_orig/32k        29410523891 29410000000          1
BM_no_trim_orig/64k        137438740492 137432000000          1
BM_no_trim_forum/1                357        357    1949861
BM_no_trim_forum/8               1751       1753     400000
BM_no_trim_forum/64             11392      11384      61404
BM_no_trim_forum/512            85726      85749       8140
BM_no_trim_forum/4k            676759     676385       1029
BM_no_trim_forum/32k          8598947    8597561         82
BM_no_trim_forum/64k         19704975   19714286         35
BM_no_trim_mine/1                 250        250    2788845
BM_no_trim_mine/8                1527       1528     457516
BM_no_trim_mine/64              10476      10470      66667
BM_no_trim_mine/512             78441      78449       8974
BM_no_trim_mine/4k             606494     606684       1167
BM_no_trim_mine/32k           7209706    7218750         96
BM_no_trim_mine/64k          15034847   15021739         46
BM_no_trim_mine_vector/1          224        224    3139013
BM_no_trim_mine_vector/8         1649       1649     419162
BM_no_trim_mine_vector/64        9681       9672      72165
BM_no_trim_mine_vector/512      74088      74147       9589
BM_no_trim_mine_vector/4k      585973     585750       1207
BM_no_trim_mine_vector/32k    7304332    7305263         95
BM_no_trim_mine_vector/64k   15878234   15883721         43

See the attached file for the source code.
Notes:
1. There are cases where the original version is faster than the forum version.
2. Mine version is always faster.
3. When doing trimming it is 3x or so faster.

4. Obviously faster code is uglier

frithjofh · « **Reply #22 on:** February 06, 2016, 05:43:46 pm »

wow, you really took this to heart...

well, yes, you are right, there is one case, where the original version is faster... slipped me...

and yes, your version is faster, in every regard. it is nice to see what a real programmer can do. a little bit of competitive spirit always works wonders... three iterations over the vector == three times slower than only one

I tried to think about the sense of my changes to this particular function when benchmarking. I wondered, if testing the function for 5.000 entries would really make sense, or even be realistic. perhaps the interesting range is from 64 to 4.000 entries, at least if the string that gets tokenized is some normal file. but I don't really know.

I like your version. post a patch. it is a bit a pain in the neck to read, and it replaces 2 functions with 6, but the speed gain is great and those helper functions surely could find use in other functions too...

have you checked the user functions too which I changed with the last patch?

oBFusCATed · « **Reply #23 on:** February 06, 2016, 06:18:08 pm »

Quote from: frithjofh on February 06, 2016, 05:43:46 pm

wow, you really took this to heart...

Not really, I'm just playing with perf and also wanted to play with google.benchmark a bit.
And for sure I wanted to verify that what I'm talking about is correct.
And it turned out that the difference is not that big that I thought it will be.
The reason is that most of the time is wasted constructing wxString/std::string objects.
BTW the wx3.0 version is quite a lot slower, because it uses std::string

Quote from: frithjofh on February 06, 2016, 05:43:46 pm

have you checked the user functions too which I changed with the last patch?

Not yet.
For the getstringfromarray one my guess is that there won't be any difference:)

oBFusCATed · « **Reply #24 on:** February 06, 2016, 06:33:37 pm »

Heh, it turned out that your version of GetStringFromArray is a lot slower than the original, because you're using operator+.

frithjofh · « **Reply #25 on:** February 06, 2016, 07:03:39 pm »

Quote

Heh, it turned out that your version of GetStringFromArray is a lot slower than the original, because you're using operator+.

instead of two times append or += ?

corrected...

Quote

And it turned out that the difference is not that big that I thought it will be.

the difference between which versions? original and my version? my version and your version? I mean between your version and my version there is a factor of three, I think that is much, although in absolute numbers it is only msec scale.

frithjofh · « **Reply #26 on:** February 07, 2016, 12:19:05 pm »

corrected patch. still contains my version of GetStringFromArray() until better version from obfuscated arrives, or what ever the decision may be...

please test and comment on EscapeSpaces()

oBFusCATed · « **Reply #27 on:** February 07, 2016, 02:43:42 pm »

I'm talking about the non-trimming case. With my version it is slightly faster than your version.
The trimming version is way faster, because it doesn't do trimming, which is slow.

I've done more testing on the EscapeSpaces function and it seems that your fix is an improvement only in wx2.8. With wx3.0 it doesn't make a difference.

frithjofh · « **Reply #28 on:** February 07, 2016, 04:34:04 pm »

ok. so just leave it as is, right?

frithjofh · « **Reply #29 on:** March 14, 2016, 11:54:33 am »

@obfuscated

made another attempt on GetArrayFromString(). if you don't mind, could you give it a run in your testing harness to see its performance?

Code

wxArrayString GetArrayFromString_3(const wxString& text, const wxString& separator, bool trimSpaces)
{
    if ( text.empty() )
        return wxArrayString();

    const size_t seplen = separator.length();
    size_t start_pos = 0;
    size_t sep_pos = 0;
    wxArrayString out;

    if (trimSpaces)
    {
        while (sep_pos != wxString::npos)
        {
            start_pos = text.find_first_not_of(_T(" \t\n\r"), start_pos);
            if (start_pos == wxString::npos)
                break;
            sep_pos = text.find(separator, start_pos);
            if (sep_pos != start_pos)
            {
                const size_t end_pos = text.find_last_not_of(_T(" \t\n\r"), (sep_pos == wxString::npos) ? wxString::npos : sep_pos - 1);
                out.push_back(text.substr(start_pos, end_pos - start_pos + 1));
            }
            start_pos = sep_pos + seplen;
        }
    }
    else
    {
        while (sep_pos != wxString::npos)
        {
            sep_pos = text.find(separator, start_pos);
            if ( sep_pos != start_pos )
                out.push_back(text.substr(start_pos, (sep_pos == wxString::npos) ? wxString::npos : sep_pos - start_pos));
            start_pos = sep_pos + seplen;
            if (start_pos >= text.length())
                break;
        }
    }

    return out;
}

Code::Blocks Forums

News:

Author Topic: small refactoring in globals.cpp (Read 18576 times)

frithjofh

Re: small refactoring in globals.cpp

frithjofh

Re: small refactoring in globals.cpp

MortenMacFly

Re: small refactoring in globals.cpp

frithjofh

Re: small refactoring in globals.cpp

oBFusCATed

Re: small refactoring in globals.cpp

frithjofh

Re: small refactoring in globals.cpp

oBFusCATed

Re: small refactoring in globals.cpp

frithjofh

Re: small refactoring in globals.cpp

oBFusCATed

Re: small refactoring in globals.cpp

oBFusCATed

Re: small refactoring in globals.cpp

frithjofh

Re: small refactoring in globals.cpp

frithjofh

Re: small refactoring in globals.cpp

oBFusCATed

Re: small refactoring in globals.cpp

frithjofh

Re: small refactoring in globals.cpp

frithjofh

Re: small refactoring in globals.cpp