Developer forums (C::B DEVELOPMENT STRICTLY!) > Development

small refactoring in globals.cpp

<< < (5/7) > >>

frithjofh:
here the corrected patch. search and destroy please...  :)

@obfuscated:


--- Quote ---The code runs faster because you've removed the calls to wxString::Remove.
This call is removing characters from the beginning of the string and copying the rest of the string to the front.
So the improvement is not because of better cache locality, but because there is less copying.
--- End quote ---

agree about the removing of wxString::Remove. the most gain lies here in avoiding the copying and also in the avoiding of modification of the wxString in place. building the output piece by piece is faster. as to cache locality: I was just making an assumption, nothing more. I am no professional programmer, I did not measure cache misses.


--- Quote ---Probably the compiler has re-written the loop and has extracted the test outside of it.
I'm sure that if you remove the check from the original version it won't change its performance!
--- End quote ---

I wouldn't know, as I did not measure this particular part of the code before and after. perhaps the compiler does this optimization on its own, perhaps not, I didn't look at the generated machine code. if there is a performance gain here, i think it will stand back behind the gain from avoiding the copying above anyway. by lengths. I find the new version easier to read though and easier to reason about.


--- Quote ---Why do you need to add elements and then remove them?
--- End quote ---

because as above, I think it makes it a lot easier to read and reason about. and strictly spoken, the elements are not removed from the container locale to the function, just copied around (and in some future only moved around). the return value is an unnamed object constructed directly from the resulting new range of valid entries.


--- Quote ---The return statement is doing a redundant copy.
--- End quote ---

correct, that is what some former version of the patch did. I took up the hint given and changed it accordingly.


@all:

thx for all the input. thx for not ignoring my efforts all together. apologies for my beginner mistakes.

please give me more bashing, so I will eventually learn and improve ... ;)

oBFusCATed:
I've done some benchmarking and profiling.
orig - current version in codeblocks
forum - your version + a fix when find returns npos
mine - obviously this is my implementation

The results printed by google's benchmark lib:

--- Code: ---> bin/release/wx_routine_bench
Success: 13 tests passed.
Test time: 0.00 seconds.
Run on (4 X 3599.88 MHz CPU s)
2016-02-06 17:15:06
Benchmark                    Time(ns)    CPU(ns) Iterations
-----------------------------------------------------------
BM_orig/1                         742        741     958904
BM_orig/8                        7456       7457      93333
BM_orig/64                      85563      85504       8140
BM_orig/512                   2997799    3000000        233
BM_orig/4k                  173072815  173000000          4
BM_orig/32k                28505832434 28504000000          1
BM_orig/64k                139696512699 139686000000          1
BM_forum/1                        774        774     886076
BM_forum/8                       5031       5029     140000
BM_forum/64                     33001      32993      21277
BM_forum/512                   256982     256831       2745
BM_forum/4k                   2011854    2014368        348
BM_forum/32k                 19614743   19611111         36
BM_forum/64k                 43754816   43764706         17
BM_mine/1                         246        246    2834008
BM_mine/8                        1321       1320     530303
BM_mine/64                       9476       9481      74468
BM_mine/512                     69313      69295      10145
BM_mine/4k                     537183     536715       1321
BM_mine/32k                   6133614    6135135        111
BM_mine/64k                  12912220   12907407         54
BM_mine_vector/1                  217        217    3225806
BM_mine_vector/8                 1447       1445     479452
BM_mine_vector/64                9060       9064      77778
BM_mine_vector/512              68974      68972      10294
BM_mine_vector/4k              539424     538986       1321
BM_mine_vector/32k            6113907    6115044        113
BM_mine_vector/64k           13649380   13666667         51
BM_no_trim_orig/1                 383        383    1861702
BM_no_trim_orig/8                4165       4166     168269
BM_no_trim_orig/64              61783      61825      11290
BM_no_trim_orig/512           2833573    2829960        247
BM_no_trim_orig/4k          167948902  168000000          4
BM_no_trim_orig/32k        29410523891 29410000000          1
BM_no_trim_orig/64k        137438740492 137432000000          1
BM_no_trim_forum/1                357        357    1949861
BM_no_trim_forum/8               1751       1753     400000
BM_no_trim_forum/64             11392      11384      61404
BM_no_trim_forum/512            85726      85749       8140
BM_no_trim_forum/4k            676759     676385       1029
BM_no_trim_forum/32k          8598947    8597561         82
BM_no_trim_forum/64k         19704975   19714286         35
BM_no_trim_mine/1                 250        250    2788845
BM_no_trim_mine/8                1527       1528     457516
BM_no_trim_mine/64              10476      10470      66667
BM_no_trim_mine/512             78441      78449       8974
BM_no_trim_mine/4k             606494     606684       1167
BM_no_trim_mine/32k           7209706    7218750         96
BM_no_trim_mine/64k          15034847   15021739         46
BM_no_trim_mine_vector/1          224        224    3139013
BM_no_trim_mine_vector/8         1649       1649     419162
BM_no_trim_mine_vector/64        9681       9672      72165
BM_no_trim_mine_vector/512      74088      74147       9589
BM_no_trim_mine_vector/4k      585973     585750       1207
BM_no_trim_mine_vector/32k    7304332    7305263         95
BM_no_trim_mine_vector/64k   15878234   15883721         43

--- End code ---

See the attached file for the source code.
Notes:
1. There are cases where the original version is faster than the forum version.
2. Mine version is always faster.
3. When doing trimming it is 3x or so faster. :)
4. Obviously faster code is uglier :)

frithjofh:
wow, you really took this to heart...

well, yes, you are right, there is one case, where the original version is faster... slipped me...  ;)

and yes, your version is faster, in every regard. it is nice to see what a real programmer can do. a little bit of competitive spirit always works wonders...  three iterations over the vector == three times slower than only one :o

I tried to think about the sense of my changes to this particular function when benchmarking. I wondered, if testing the function for 5.000 entries would really make sense, or even be realistic. perhaps the interesting range is from 64 to 4.000 entries, at least if the string that gets tokenized is some normal file. but I don't really know.

I like your version. post a patch. it is a bit a pain in the neck to read, and it replaces 2 functions with 6, but the speed gain is great and those helper functions surely could find use in other functions too...

have you checked the user functions too which I changed with the last patch?



oBFusCATed:

--- Quote from: frithjofh on February 06, 2016, 05:43:46 pm ---wow, you really took this to heart...

--- End quote ---
Not really, I'm just playing with perf and also wanted to play with google.benchmark a bit.
And for sure I wanted to verify that what I'm talking about is correct.
And it turned out that the difference is not that big that I thought it will be.
The reason is that most of the time is wasted constructing wxString/std::string objects.
BTW the wx3.0 version is quite a lot slower, because it uses std::string


--- Quote from: frithjofh on February 06, 2016, 05:43:46 pm ---have you checked the user functions too which I changed with the last patch?

--- End quote ---
Not yet.
For the getstringfromarray one my guess is that there won't be any difference:)

oBFusCATed:
Heh, it turned out that your version of GetStringFromArray is a lot slower than the original, because you're using operator+.  ::)

Navigation

[0] Message Index

[#] Next page

[*] Previous page

Go to full version