@obfuscated
doing three separate passes over the array makes the code run much faster because of better cache locality, and my measurements confirmed the speedup. the original version kept jumping from one task to another inside a single loop. putting each action in its own loop is simply faster.
for instance, the original version tested the value of trimspaces in every iteration. now the test is done only once.
the same goes for the third pass, which removes empty entries.
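
to make the idea concrete, here is a minimal sketch of the second and third passes, not the actual patch code. the name postProcess and the flag removeEmpty are made up for illustration, and I am assuming trimming means stripping spaces and tabs. the point is only that each flag is tested once, outside its loop:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not the patch's real function): runs the trim pass
// and the remove-empty pass over an already-split vector.
void postProcess(std::vector<std::string>& parts, bool trimSpaces, bool removeEmpty)
{
    // Pass 2: the flag is tested once, not once per element.
    if (trimSpaces) {
        for (std::string& part : parts) {
            const std::string::size_type first = part.find_first_not_of(" \t");
            if (first == std::string::npos) {
                part.clear();  // only whitespace: becomes an empty entry
                continue;
            }
            const std::string::size_type last = part.find_last_not_of(" \t");
            part = part.substr(first, last - first + 1);
        }
    }

    // Pass 3: again a single test, then one erase-remove sweep.
    if (removeEmpty) {
        parts.erase(std::remove_if(parts.begin(), parts.end(),
                                   [](const std::string& s) { return s.empty(); }),
                    parts.end());
    }
}
```

with both flags off, neither loop runs at all, which the single-loop version could not avoid.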
I don't understand the part about the extra full copy. do you mean in the return statement? if so, that would happen in the original version too. or is there another extra copy that slipped past me? are you looking at the latest version of the patch? the one before had the problem of the extra copy in the std::vector version of the function, but I already changed that.
and if there were an extra copy, wouldn't the speed gain justify it?
as to your last point: hit and sunk. my functions have a problem when there is no separator at the end: they omit the part after the last separator. I will correct it. it slipped past me, thx for the find.
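
roughly, the fix amounts to adding whatever remains after the last separator once the loop ends. a minimal sketch, assuming a single-character separator. splitKeepTail is a hypothetical name, not the function from the patch:

```cpp
#include <string>
#include <vector>

// Hypothetical split-only pass: breaks text at every sep and also keeps
// the part after the last separator.
std::vector<std::string> splitKeepTail(const std::string& text, char sep)
{
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    std::string::size_type pos;

    // Emit everything that is followed by a separator.
    while ((pos = text.find(sep, start)) != std::string::npos) {
        parts.push_back(text.substr(start, pos - start));
        start = pos + 1;
    }

    // The remainder after the last separator. Leaving this line out is
    // what drops "c" from "a,b,c".
    parts.push_back(text.substr(start));
    return parts;
}
```

note that in this sketch an input ending in a separator gets a trailing empty entry, which a later remove-empty pass can drop if that is wanted.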