disclaimer: This is NOT intended to be inflammatory, nor particularly defensive, just informative. Please read it with that mindset.
Have you profiled the code?
No, not in this context.
What improvement in performance do we get by this change?
Probably less than 15%, yes, but it is combined unnecessary actions like this that add up throughout a program to make it slower, and profiling can't really reveal their effect, because they're so spread out. Additional heap fragmentation also tends to slow down everybody's allocations over time. (I've worked on a machine where c::b/GDB can take approximately 20-30 minutes to load c::b for debugging; after a reboot it takes <= 30 seconds. As best I can tell, it's probably OS virtual memory address space fragmentation, which a reboot eliminates.)
While the total disassembly time improvement is likely less than 15%, I'd estimate the local gain of the original string concat vs. .insert() to be possibly 50%. If you like, I'll consider* trying to cobble together something to demonstrate this, although it'll be a little harder to prove in a single-threaded context; the effect is more pronounced in multi-threaded contexts where multiple threads may be thrashing the heap with allocations. (c::b does seem to have multiple threads doing something.) (*Not that I really want to spend my time proving something I know is a wise practice, when the product itself could benefit from further changes.)
I might also note that the original addition of the check for the leading '0x' could have been avoided by modifying the code and formats for all environments. Of course the trade-offs are consistency across environments (windows vs linux vs ?), vs faster performance vs code maintainability vs the risk of changes beyond the original problem.

The current code is probably 'faster' in linux, because it doesn't have to perform the extra object allocation (both environments pay the overhead of checking whether the '0x' is there to begin with), slower in windows where the object construction and concat are performed, and those lines are less 'maintainable' because there is an 'exception' path ["Is there a leading 0x? No, add one"] whose reason nothing in the code currently indicates. (It's also dependent on output that is obviously different in different environments today, and thus probably subject to future change, since it might reasonably be expected to be the same.) One modification might have been to eliminate the use of %p and just use %x with a leading literal '0x', as in '0x%08x', but that could be a problem for 64-bit environs, if that's an issue.

I saw what I thought was an opportunity for 'low-hanging fruit', albeit small in the larger context, and took it.
If it is less than 15% (I'm sure it is less), please leave it as it is.
OK, if you like, but see the previous note in this response. Also, in c++0x this won't be an issue (I think).
Why not, what does c++0x provide to help this?
When will c::b be distributed in c++0x (in all environs it serves)?
p.s. There is one general rule: premature optimization is the root of all evil.
Scattered failures to avoid unnecessary actions can result in 'fat' that cannot be easily detected or optimized away, short of a rewrite that avoids the unnecessary actions.
p.p.s. memory allocations are not a problem on the PC for most GUI applications
The disassembly process itself is really a 'batch' operation within a GUI. Try stepping into a very 'large' routine: the pause waiting for disassembly is noticeable, and a bit too long for pleasant GUI responsiveness.
p.p.p.s. I don't need to look into the assembly to fix most of the problems I encounter and I don't do much hardcore optimization work.
likewise, but I do at times find it indispensable, particularly when trying to debug optimized code that has failed. I tend to prefer mostly working with code as it will be shipped, because I've chased too many problems (multi-threaded, timing-related problems) that simply would not show up in the non-optimized ('debug') code versions. (And no, I've not generally been responsible for creating most of those problems, merely the one stuck with correcting them.) Figuring out what's happening without looking at the disassembly of the optimized code has not been something I could accomplish.