wxString support in wxWidgets 3.0 problem?

ollydbg:
FYI:
I found this message on the wx forum:

--- Quote ---DL> It's not really thread-safe since it uses reference counting - I think,

This was true for 2.8 but this question is explicitly about 2.9 and by
default in wx 2.9 (i.e. unless you set wxUSE_STD_STRING to 0) wxString uses
std::basic_string for implementation and so doesn't use reference counting
if the standard class doesn't -- and most, if not all, of them don't use it
any more. So the thread safety of wxString is the same as the thread-safety
of the underlying standard library string class.

Regards,
VZ

--- End quote ---

So it seems that wxString in 2.9.x/3.x mostly does NOT use reference counting, just like the STL string it is built on.
The current CodeCompletion plugin's source has a lot of functions like:

--- Code: ---wxString GetToken();
wxString PeekToken();

--- End code ---
This code will do a deep copy of the string data, so I'm concerned about the performance.

PS: Under wxWidgets 2.8.x's implementation, wxString uses reference counting, so returning a wxString object is much faster (it does not do a deep copy of the string data).

So, what do you think?

oBFusCATed:
RValue references to the rescue :)

And as always, performance optimizations should be done only when there is evidence that something is slow!
So profile first, then optimize, then profile again to confirm it is faster.
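
To illustrate the rvalue-reference point, here is a minimal sketch, not the plugin's actual code. It assumes a C++11 compiler and uses std::wstring as a stand-in for a wxString backed by std::basic_string<wchar_t>; the DemoTokenizer class and its Buffer() helper are hypothetical names made up for the example. It shows that a consuming GetToken() can hand its buffer to the caller instead of deep-copying it, while a PeekToken() that keeps the token still has to copy.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for the plugin's tokenizer. std::wstring plays the
// role of a wxString backed by std::basic_string<wchar_t> (an assumption).
class DemoTokenizer
{
public:
    explicit DemoTokenizer(std::wstring token) : m_Token(std::move(token)) {}

    // Peek keeps the token, so the caller gets a copy.
    std::wstring PeekToken() const { return m_Token; }

    // Get consumes the token, so the buffer is moved out instead of copied.
    // Afterwards m_Token is valid but its contents are unspecified.
    std::wstring GetToken() { return std::move(m_Token); }

    // Address of the internal character buffer, used only for the check below.
    const wchar_t* Buffer() const { return m_Token.data(); }

private:
    std::wstring m_Token;
};

// Returns true if GetToken() handed over the existing heap buffer.
bool GetTokenReusesBuffer()
{
    // 64 characters: long enough to defeat the small-string optimisation.
    DemoTokenizer tk(std::wstring(64, L'x'));
    const wchar_t* before = tk.Buffer();
    const std::wstring token = tk.GetToken();
    return token.data() == before && token.size() == 64;
}

// Usage example.
void DemoMove()
{
    DemoTokenizer tk(std::wstring(64, L'y'));
    assert(tk.PeekToken() == std::wstring(64, L'y')); // copy, tokenizer unchanged
    assert(GetTokenReusesBuffer());                   // move, no deep copy
}
```

Whether this matters for CodeCompletion in practice is exactly what profiling would have to show.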

ollydbg:
I just searched Google for a while and found that
GCC's libstdc++ string is COW (copy on write), see
http://stackoverflow.com/questions/1594803/is-stdstring-thead-safe-with-gcc-4-3

This code shows the COW behaviour (with COW, the first two addresses printed are the same):

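--- Code: ---#include <string>
#include <cstdio>

int main()
{
    std::string orig = "I'm the original!";
    std::string copy_cow = orig;         // shares orig's buffer if the library uses COW
    std::string copy_mem = orig.c_str(); // always a fresh copy of the characters
    // %p expects a void pointer, so cast explicitly.
    std::printf("%p %p %p\n",
                static_cast<const void*>(orig.data()),
                static_cast<const void*>(copy_cow.data()),
                static_cast<const void*>(copy_mem.data()));
}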

--- End code ---

So I think that although wx itself does not use reference counting, std::string does.

Am I right? Can someone confirm this?

ollydbg:
Oh, it seems COW will go away in the upcoming C++0x,
see:
http://stackoverflow.com/questions/4067395/gnu-stl-string-is-copy-on-write-involved-here

--- Quote ---Just wanted to note that copy on write is probably going to fade away in C++0x with the introduction of move semantics (makes COW obsolete for many typical use cases) and concurrency (makes COW potentially very inefficient due to synchronization issues).

--- End quote ---

and, for just how bad CoW can be in a multithreaded environment, even if there's only one thread, see:

N2668: "Concurrency Modifications to Basic String"

ollydbg:
FYI:

I see that under Linux, wxString in wxWidgets 2.9.x now uses std::basic_string<wchar_t>. The change happened around 2012-05-13; see this commit to wxWidgets' svn repo:
SVN:(VZ)[71424] Disable the use of UTF-8 by default in Unix builds. - Google Groups
It was using UTF-8 by default before this commit.

I think this is good news, because wxString should perform better when parsing. Also, using the wchar_t pointer directly should be safe on both Windows and Linux, because every character occupies the same number of bytes (fixed-width encoding). Strictly, this holds on Linux, where wchar_t is 4 bytes; on Windows wchar_t is 2 bytes (UTF-16), so it holds only for characters in the Basic Multilingual Plane.

So never mind the issue reported in: unsafe memory copy in CC's macro replacement
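
A small sketch of the fixed-width question, assuming only standard C++11; the helper name WCharUnitsFor is made up for this example. It computes how many wchar_t code units one Unicode code point needs on the platform it is compiled for: always one where wchar_t is 4 bytes, but two (a surrogate pair) for code points above U+FFFF where wchar_t is 2 bytes.

```cpp
#include <cassert>
#include <cstddef>

// Number of wchar_t code units needed to store one Unicode code point,
// given the size of wchar_t on the platform this is compiled for.
// Hypothetical helper, not part of wxWidgets or Code::Blocks.
constexpr std::size_t WCharUnitsFor(char32_t codePoint)
{
    return (sizeof(wchar_t) >= 4 || codePoint < 0x10000) ? 1 : 2;
}

// Usage example.
void DemoWCharUnits()
{
    // Basic Multilingual Plane: one unit everywhere.
    assert(WCharUnitsFor(U'A') == 1u);
    assert(WCharUnitsFor(0x4E2D) == 1u);
    // Outside the BMP: one unit with 4-byte wchar_t, two with 2-byte wchar_t.
    assert(WCharUnitsFor(0x1F600) == (sizeof(wchar_t) >= 4 ? 1u : 2u));
}
```

So pointer arithmetic on wchar_t data maps one-to-one to characters only as long as the text stays inside the BMP or wchar_t is 4 bytes wide.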

EDIT: this is the current documentation on wxString performance: http://docs.wxwidgets.org/trunk/classwx_string.html

--- Quote ---Performance characteristics

wxString uses std::basic_string internally to store its content (unless this is not supported by the compiler or disabled specifically when building wxWidgets) and it therefore inherits many features from std::basic_string. In particular, most modern implementations of std::basic_string are thread-safe and don't use reference counting (making copying large strings potentially expensive) and so wxString has the same characteristics.

By default, wxString uses std::basic_string specialized for the platform-dependent wchar_t type, meaning that it is not memory-efficient for ASCII strings, especially under Unix platforms where every ASCII character, normally fitting in a byte, is represented by a 4 byte wchar_t.

It is possible to build wxWidgets with wxUSE_UNICODE_UTF8 set to 1 in which case an UTF-8-encoded string representation is stored in std::basic_string specialized for char, i.e. the usual std::string. In this case the memory efficiency problem mentioned above doesn't arise but run-time performance of many wxString methods changes dramatically, in particular accessing the N-th character of the string becomes an operation taking O(N) time instead of O(1), i.e. constant, time by default. Thus, if you do use this so called UTF-8 build, you should avoid using indices to access the strings whenever possible and use the iterators instead. As an example, traversing the string using iterators is an O(N), where N is the string length, operation in both the normal ("wchar_t") and UTF-8 builds but doing it using indices becomes O(N^2) in UTF-8 case meaning that simply checking every character of a reasonably long (e.g. a couple of millions elements) string can take an unreasonably long time.

However, if you do use iterators, UTF-8 build can be a better choice than the default build, especially for the memory-constrained embedded systems. Notice also that GTK+ and DirectFB use UTF-8 internally, so using this build not only saves memory for ASCII strings but also avoids conversions between wxWidgets and the underlying toolkit.

--- End quote ---
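
To make the O(N) versus O(N^2) point from the quote concrete, here is a rough sketch on a plain UTF-8 std::string. This is not wxString's actual implementation; the function names are invented for the illustration. Finding the N-th character needs a scan from the start of the string, so doing that once per character is quadratic, while a single pass over the bytes visits every character in linear time.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// True if the byte starts a code point, i.e. it is not a 10xxxxxx
// continuation byte.
inline bool IsLeadByte(unsigned char b) { return (b & 0xC0) != 0x80; }

// Byte offset of the n-th code point (0-based), or npos if there is none.
// This is the "index access" case: an O(n) scan from the start every time.
std::size_t OffsetOfCodePoint(const std::string& utf8, std::size_t n)
{
    std::size_t seen = 0;
    for (std::size_t i = 0; i < utf8.size(); ++i)
    {
        if (IsLeadByte(static_cast<unsigned char>(utf8[i])))
        {
            if (seen == n)
                return i;
            ++seen;
        }
    }
    return std::string::npos;
}

// Number of code points, found in a single pass. This is the "iterator"
// case: O(N) for the whole string.
std::size_t CountCodePoints(const std::string& utf8)
{
    std::size_t count = 0;
    for (unsigned char b : utf8)
        if (IsLeadByte(b))
            ++count;
    return count;
}

// Usage example: "a", e-acute (2 bytes in UTF-8), "z".
void DemoUtf8Indexing()
{
    const std::string s = std::string("a") + "\xC3\xA9" + "z";
    assert(CountCodePoints(s) == 3);
    assert(OffsetOfCodePoint(s, 2) == 3); // 'z' sits at byte 3, not byte 2
}
```

Calling OffsetOfCodePoint(s, i) for every i in a loop is the quadratic pattern the documentation warns about.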
