Author Topic: F() function is not thread safe?  (Read 28957 times)

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5906
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
F() function is not thread safe?
« on: January 09, 2014, 04:39:03 am »
When tracking the dirty lockers in CC, I hit a crash, which has the following back-trace

Code
[debug]> bt 30
[debug]#0  0x77c47fd4 in wcslen () from C:\WINDOWS\system32\msvcrt.dll
[debug]#1  0x01d37e3f in wxPrintfConvSpec<wchar_t>::Process(wchar_t*, unsigned int, wxPrintfArg*, unsigned int) () from E:\code\wx\wxWidgets-3.0.0\lib\gcc_dll\wxmsw30u_gcc_cb.dll
[debug]#2  0x0148362e in int wxDoVsnprintf<wchar_t>(wchar_t*, unsigned int, wchar_t const*, char*) () from E:\code\wx\wxWidgets-3.0.0\lib\gcc_dll\wxmsw30u_gcc_cb.dll
[debug]#3  0x01481de1 in wxVsnprintf(wchar_t*, unsigned int, wxString const&, char*) () from E:\code\wx\wxWidgets-3.0.0\lib\gcc_dll\wxmsw30u_gcc_cb.dll
[debug]#4  0x0144f9cc in DoStringPrintfV(wxString&, wxString const&, char*) () from E:\code\wx\wxWidgets-3.0.0\lib\gcc_dll\wxmsw30u_gcc_cb.dll
[debug]#5  0x0144fd02 in wxString::PrintfV(wxString const&, char*) () from E:\code\wx\wxWidgets-3.0.0\lib\gcc_dll\wxmsw30u_gcc_cb.dll
[debug]#6  0x0144fd54 in wxString::FormatV(wxString const&, char*) () from E:\code\wx\wxWidgets-3.0.0\lib\gcc_dll\wxmsw30u_gcc_cb.dll
[debug]#7  0x051ecd47 in F (msg=0x9178d44 L"%lss.Lock().Success() :%lsls()%ls%ls, %d") at F:\cb_sf_git\trunk\src\include\logmanager.h:26
[debug]#8  0x051c4484 in Parser::IsFileParsed (this=0x88c8b70, filename=...) at F:\cb_sf_git\trunk\src\plugins\codecompletion\parser\parser.cpp:1168
[debug]#9  0x05174dc7 in CodeCompletion::ParseFunctionsAndFillToolbar (this=0x4400320) at F:\cb_sf_git\trunk\src\plugins\codecompletion\codecompletion.cpp:3444
[debug]#10 0x0517782d in CodeCompletion::OnToolbarTimer (this=0x4400320, event=...) at F:\cb_sf_git\trunk\src\plugins\codecompletion\codecompletion.cpp:3836
[debug]#11 0x013a2e62 in wxAppConsoleBase::HandleEvent(wxEvtHandler*, void (wxEvtHandler::*)(wxEvent&), wxEvent&) const () from E:\code\wx\wxWidgets-3.0.0\lib\gcc_dll\wxmsw30u_gcc_cb.dll
[debug]#12 0x013a31cd in wxAppConsoleBase::CallEventHandler(wxEvtHandler*, wxEventFunctor&, wxEvent&) const () from E:\code\wx\wxWidgets-3.0.0\lib\gcc_dll\wxmsw30u_gcc_cb.dll
[debug]#13 0x014e1b21 in wxEvtHandler::ProcessEventIfMatchesId(wxEventTableEntryBase const&, wxEvtHandler*, wxEvent&) () from E:\code\wx\wxWidgets-3.0.0\lib\gcc_dll\wxmsw30u_gcc_cb.dll
[debug]#14 0x014e1e86 in wxEvtHandler::SearchDynamicEventTable(wxEvent&) () from E:\code\wx\wxWidgets-3.0.0\lib\gcc_dll\wxmsw30u_gcc_cb.dll
[debug]#15 0x014e1f26 in wxEvtHandler::TryHereOnly(wxEvent&) () from E:\code\wx\wxWidgets-3.0.0\lib\gcc_dll\wxmsw30u_gcc_cb.dll
[debug]#16 0x014e1fcd in wxEvtHandler::ProcessEventLocally(wxEvent&) () from E:\code\wx\wxWidgets-3.0.0\lib\gcc_dll\wxmsw30u_gcc_cb.dll
[debug]#17 0x014e2023 in wxEvtHandler::ProcessEvent(wxEvent&) () from E:\code\wx\wxWidgets-3.0.0\lib\gcc_dll\wxmsw30u_gcc_cb.dll
[debug]#18 0x014e1db0 in wxEvtHandler::SafelyProcessEvent(wxEvent&) () from E:\code\wx\wxWidgets-3.0.0\lib\gcc_dll\wxmsw30u_gcc_cb.dll
[debug]#19 0x01462df8 in wxTimerImpl::SendEvent() () from E:\code\wx\wxWidgets-3.0.0\lib\gcc_dll\wxmsw30u_gcc_cb.dll
[debug]#20 0x014ca912 in wxTimerWndProc(HWND__*, unsigned int, unsigned int, long)@16 () from E:\code\wx\wxWidgets-3.0.0\lib\gcc_dll\wxmsw30u_gcc_cb.dll
[debug]#21 0x7e418734 in USER32!GetDC () from C:\WINDOWS\system32\user32.dll
[debug]#22 0x000e0a12 in ?? ()
[debug]#23 0x7e418816 in USER32!GetDC () from C:\WINDOWS\system32\user32.dll
[debug]#24 0x014ca880 in wxProcessTimer(wxMSWTimerImpl&) () from E:\code\wx\wxWidgets-3.0.0\lib\gcc_dll\wxmsw30u_gcc_cb.dll
[debug]#25 0x00000113 in ?? ()
[debug]#26 0x7e4189cd in USER32!GetWindowLongW () from C:\WINDOWS\system32\user32.dll
[debug]#27 0x00000000 in ?? ()

Track down to the file here:
Code
/*
 * This file is part of the Code::Blocks IDE and licensed under the GNU Lesser General Public License, version 3
 * http://www.gnu.org/licenses/lgpl-3.0.html
 */

#ifndef LOGMGR_H
#define LOGMGR_H

#include "manager.h"
#include "logger.h"
#include <map>

//namespace cb
//{
    inline wxString F(const wxChar* msg, ...)
    {
        va_list arg_list;
        va_start(arg_list, msg);
#if wxCHECK_VERSION(2,9,0) && wxUSE_UNICODE
// in wx >=  2.9 unicode-build (default) we need the %ls here, or the strings get
// cut after the first character
        ::temp_string = msg;
        ::temp_string.Replace(_T("%s"), _T("%ls"));
        msg = ::temp_string.wx_str();
#endif
        ::temp_string = wxString::FormatV(msg, arg_list); //******************************************
        va_end(arg_list);

        return ::temp_string;
    }
//} // namespace cb

It use a global wxString, this is too bad as I can see, especially in a multiply thread application.

I see that one thread is calling here:
Code
bool Parser::IsFileParsed(const wxString& filename)
{
    bool isParsed = false;

    CC_LOCKER_TRACK_TT_MTX_LOCK(s_TokenTreeMutex) //***********************************here

    isParsed = m_TokenTree->IsFileParsed(filename);

Another thread is calling here:
Code
bool Parser::Parse(const wxString& filename, bool isLocal, bool locked, LoaderBase* loader)
{
    ParserThreadOptions opts;

    opts.useBuffer             = false;
    opts.bufferSkipBlocks      = false;
    opts.bufferSkipOuterBlocks = false;

    opts.followLocalIncludes   = m_Options.followLocalIncludes;
    opts.followGlobalIncludes  = m_Options.followGlobalIncludes;
    opts.wantPreprocessor      = m_Options.wantPreprocessor;
    opts.parseComplexMacros    = m_Options.parseComplexMacros;

    opts.storeDocumentation    = m_Options.storeDocumentation;

    opts.loader                = loader; // maybe 0 at this point

    bool result = false;
    do
    {
        bool canparse = false;
        {
            if (!locked)
                CC_LOCKER_TRACK_TT_MTX_LOCK(s_TokenTreeMutex)

            canparse = !m_TokenTree->IsFileParsed(filename);
            if (canparse)
                canparse = m_TokenTree->ReserveFileForParsing(filename, true) != 0;

            if (!locked)
                CC_LOCKER_TRACK_TT_MTX_UNLOCK(s_TokenTreeMutex) //**********************************here
        }
So, they are both calling the same function.

Maybe a wx native log function is better? like wxLogMessage(), which support the printf style format string.

What's your opinion?

« Last Edit: January 09, 2014, 07:12:24 am by ollydbg »
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5906
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: F() function is not thread safe?
« Reply #1 on: January 09, 2014, 06:34:09 am »
Found another issue, static variables in header files, see, the logger.h has those contents.

Code
/*
 * This file is part of the Code::Blocks IDE and licensed under the GNU Lesser General Public License, version 3
 * http://www.gnu.org/licenses/lgpl-3.0.html
 */

#ifndef LOGGER_H
#define LOGGER_H

#include "prep.h"
#include <wx/string.h>
#include "settings.h" // DLLIMPORT

class wxMenu;
class wxWindow;

namespace
{
    static wxString temp_string(_T('\0'), 250);
    static wxString newline_string(_T("\n"));
}

So, mostly, F() from  different translation units have their own temp_string.

But look at my crash issue, two different thread call functions in a single translation unit (both in parser.cpp), so it crashed.
« Last Edit: January 09, 2014, 07:12:43 am by ollydbg »
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
Re: F() function is not thread safe?
« Reply #2 on: January 09, 2014, 01:11:00 pm »
Function named F found in a file that contains the string LOGGER looks typically like something I may have written some years ago as most for internal use. Though #if wxCHECK_VERSION(...) suggests that some edits have been made.

Though I don't remember now what the intent was, need to look (and try to understand) first. I think the whole purpose of this function was for formatting debug messages with a timestamp or something before they are appended to the log.

I don't remember why this function was designed in such an obtuse way, I don't think anything returned by F should need to be modified afterwards. But quite possibly, that was the very reason for this design (I seem to remember that the use of one static per translation unit was a deliberate choice to make it "quasi-threadsafe").
If the returned strings are not needed to be modifiable, one should be able to simply return a const& to a local object, which would make F threadsafe.
« Last Edit: January 09, 2014, 01:26:49 pm by thomas »
"We should forget about small efficiencies, say about 97% of the time: Premature quotation is the root of public humiliation."

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
Re: F() function is not thread safe?
« Reply #3 on: January 09, 2014, 06:00:02 pm »
This seems to work fine (at least it doesn't work worse than the pristine sources for me, no compile error and no crash):

Code
Index: include/logmanager.h
===================================================================
--- include/logmanager.h (revision 9550)
+++ include/logmanager.h (working copy)
@@ -10,25 +10,32 @@
 #include "logger.h"
 #include <map>
 
-//namespace cb
-//{
-    inline wxString F(const wxChar* msg, ...)
+ namespace pfffffffrt
+ {
+ // avoid calling wxString::wxString(const char*) twice for every single message
+ static const wxString  s(_T("%s"));
+ static const wxString ls(_T("%ls"));
+ }
+
+    inline wxString const& F(const wxChar* msg, ...)
     {
         va_list arg_list;
         va_start(arg_list, msg);
+
 #if wxCHECK_VERSION(2,9,0) && wxUSE_UNICODE
-// in wx >=  2.9 unicode-build (default) we need the %ls here, or the strings get
-// cut after the first character
-        ::temp_string = msg;
-        ::temp_string.Replace(_T("%s"), _T("%ls"));
-        msg = ::temp_string.wx_str();
+ // in wx >=  2.9 unicode-build (default) we need the %ls here, or the strings get
+ // cut after the first character
+ using namespace pfffffffrt;
+        wxString tmp(msg);
+        tmp.Replace(s, ls);
+        tmp = wxString::FormatV(tmp.wx_str(), arg_list)
+#else
+        wxString tmp(wxString::FormatV(msg, arg_list));
 #endif
-        ::temp_string = wxString::FormatV(msg, arg_list);
+
         va_end(arg_list);
-
-        return ::temp_string;
+        return tmp;
     }
-//} // namespace cb
 
 
 struct LogSlot

There is however another, more serious issue insofar as the text control loggers (which I didn't write, luckily, so I can blame Yiannis!) seem to use those globals all over the place, too. For no apparent reason as well, no idea why it's done that way. We'll probably need to search through those too.
"We should forget about small efficiencies, say about 97% of the time: Premature quotation is the root of public humiliation."

Offline MortenMacFly

  • Administrator
  • Lives here!
  • *****
  • Posts: 9694
Re: F() function is not thread safe?
« Reply #4 on: January 09, 2014, 06:37:30 pm »
This seems to work fine (at least it doesn't work worse than the pristine sources for me, no compile error and no crash):
Its always fun to see how you manage to make the code more trustfully and readable... pfffffffrt... ;-)
Compiler logging: Settings->Compiler & Debugger->tab "Other"->Compiler logging="Full command line"
C::B Manual: https://www.codeblocks.org/docs/main_codeblocks_en.html
C::B FAQ: https://wiki.codeblocks.org/index.php?title=FAQ

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
Re: F() function is not thread safe?
« Reply #5 on: January 09, 2014, 07:49:07 pm »
Well, that's a namespace which you're not supposed to use, ever. And sure I wouldn't want a variable called s in global namespace.  :D
It's only there at all to provide those two static strings for the function that follows, so we're not constructing these strings a thousand times per second.

Though of course we do a few other things in the logger which aren't precisely high-perforance, such as outputting single newlines and changing style twice in between... that generates about a dozen window messages inside wxWidgets per line of output (yuck). But I guess we'll leave that for another day.

Calling the namespace pfffffffrt seems discouraging enough, doesn't it?  I was tempted to call it _________ for a while, at the risk of being yelled at because double-underscore names are reserved ;D
« Last Edit: January 09, 2014, 07:51:16 pm by thomas »
"We should forget about small efficiencies, say about 97% of the time: Premature quotation is the root of public humiliation."

Offline oBFusCATed

  • Developer
  • Lives here!
  • *****
  • Posts: 13413
    • Travis build status
Re: F() function is not thread safe?
« Reply #6 on: January 09, 2014, 09:16:26 pm »
Some C++-devs use detail for such namespaces.
Although, why would you want to have s and sl as global static consts in the first place?

Another problem with your patch is that you're using tabs instead of spaces  :P
And the final problem, the F should be changed to return wxString instead of const wxString&.

p.s. this function is perfect candidate for RVO, so probably the compiler omits the copying.
(most of the time I ignore long posts)
[strangers don't send me private messages, I'll ignore them; post a topic in the forum, but first read the rules!]

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
Re: F() function is not thread safe?
« Reply #7 on: January 09, 2014, 09:56:47 pm »
As explained, I've put in these variables because someone is using wxString::Replace with a const char* argument for some weird wx 2.9 compatibility hack. I don't even get why that hack is necessary, but obviously the person who thought of it had a good reason for it.

Unluckily, wxString::Replace takes wxString arguments, not const char*, which means that for every string you format, you create two unnecessary temporaries just to do this replacement. Of course you can easily do without these, and construct/destruct even a few more objects. But seeing how it's happening many thousands of times, I deemed it worthwhile. We can of course do without if you don't like it.

So what name other than s would better describe a constant string with the contents "s" :)

As for tabs... yeah, sorry... I have the editor set to use tabs, keep forgetting that C::B sources have the silly "use spaces" notation. About RVO I'm not sure if it works. Unluckily we can't return an anonymous temporary because of var_end which must appear before return but after any var args are consumed (i.e. to construct the returned object). Not sure if the compiler is smart enough to elide the copy anyway.
"We should forget about small efficiencies, say about 97% of the time: Premature quotation is the root of public humiliation."

Offline oBFusCATed

  • Developer
  • Lives here!
  • *****
  • Posts: 13413
    • Travis build status
Re: F() function is not thread safe?
« Reply #8 on: January 10, 2014, 12:50:15 am »
So what name other than s would better describe a constant string with the contents "s" :)
You can make them local functions statics, I think...

Not sure if the compiler is smart enough to elide the copy anyway.
Compiler writers, tells us that they are smart enough... http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/
Do you agree that the return type should be changed?
(most of the time I ignore long posts)
[strangers don't send me private messages, I'll ignore them; post a topic in the forum, but first read the rules!]

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
Re: F() function is not thread safe?
« Reply #9 on: January 10, 2014, 12:18:34 pm »
You can make them local functions statics, I think...
Yes, except function local statics effectively result in a sort of if(initialized) foo = initializer; code sequence, and in C++11 these are required to be initialized thread-safe, and GCC since about version 4.6 or 4.7 does that regardless of C++11 anyway (unless you compile with -fno-threadsafe-statics), using a mutex. Now of course, locking a mutex makes the point of trying to avoid initilizing a string somewhat... redundant.

Quote
Not sure if the compiler is smart enough to elide the copy anyway.
Compiler writers, tells us that they are smart enough... http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/
Do you agree that the return type should be changed?
I'm know this article, but I'm not convinced that it applies to us. For example, we don't follow the guideline of not copying arguments (obviously) and your return value is not anonymous, so at best NRVO can be used, not RVO. Also, var-args are a kind of ugly hack, and I'm not sure inhowfar it influences optimizations such as NRVO.

But I'm OK if you think that returning by value is a better choice, even if it does create a copy. A verbatim copy on a wxString is only incrementing a reference counter.

Feel free to choose whatever makes you feel most comfortable.

What worries me more than those micro-optimizations is code like this:
Code
void TextCtrlLogger::Append(const wxString& msg, Logger::level lv)
{
    if (!control)
        return;

    ::temp_string.assign(msg);
    ::temp_string.append(_T("\n"));

    if (lv == caption)
    {
        control->SetDefaultStyle(style[info]);
        control->AppendText(::newline_string);

        control->SetDefaultStyle(style[lv]);
        control->AppendText(::temp_string);

        control->SetDefaultStyle(style[spacer]);
        control->AppendText(::newline_string);
    }
    else
    {
        control->SetDefaultStyle(style[lv]);
        control->AppendText(::temp_string);
    }
}

The F function as it was is kind of "semi threadsafe" insofar as all modules that were supposed to use it (mostly compiler, and maybe project mgr) run at most one thread per module. Now the TextCtrlLogger such as being used for the debug log on the other hand is definitely thread-unsafe, but I don't even understand why it uses the global here at all. Appending a newline would make much more sense on a plain normal local variable, no?
"We should forget about small efficiencies, say about 97% of the time: Premature quotation is the root of public humiliation."

Offline oBFusCATed

  • Developer
  • Lives here!
  • *****
  • Posts: 13413
    • Travis build status
Re: F() function is not thread safe?
« Reply #10 on: January 10, 2014, 09:53:43 pm »
But I'm OK if you think that returning by value is a better choice, even if it does create a copy.
As far as I'm concerned (and probably the standard) returning by value is the only option if you're not using a static or global caching variable.

Feel free to choose whatever makes you feel most comfortable.
I don't intend to do anything about this function, I'm just commenting your patch:)
(most of the time I ignore long posts)
[strangers don't send me private messages, I'll ignore them; post a topic in the forum, but first read the rules!]

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
Re: F() function is not thread safe?
« Reply #11 on: January 10, 2014, 10:29:38 pm »
Well, the standard is totally fine with returning a const reference to a local variable, since this extends the lifetime of the local to the lifetime of the reference.

But like I said, I'm fine with anything.

I seem to be kind of unable to commit at this time (or using wrong password/account/whatever -- in any case it doesn't work?), so someone else needs to decide what to do and commit :)
"We should forget about small efficiencies, say about 97% of the time: Premature quotation is the root of public humiliation."

Offline oBFusCATed

  • Developer
  • Lives here!
  • *****
  • Posts: 13413
    • Travis build status
Re: F() function is not thread safe?
« Reply #12 on: January 11, 2014, 01:38:20 am »
Well, the standard is totally fine with returning a const reference to a local variable, since this extends the lifetime of the local to the lifetime of the reference.
You are so wrong here....

See an example:
Code
#include <stdio.h>
#include <string>

const std::string & func()
{
std::string temp("value");
return temp;
}

int main()
{
std::string s=func();
return 0;
}

Code
 g++ -Wall -Wextra return_ref.cpp
return_ref.cpp: In function ‘const string& func()’:
return_ref.cpp:6:14: warning: reference to local variable ‘temp’ returned [enabled by default]

http://herbsutter.com/2008/01/01/gotw-88-a-candidate-for-the-most-important-const/
(most of the time I ignore long posts)
[strangers don't send me private messages, I'll ignore them; post a topic in the forum, but first read the rules!]

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5906
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: F() function is not thread safe?
« Reply #13 on: January 11, 2014, 09:05:13 am »

What worries me more than those micro-optimizations is code like this:
Code
void TextCtrlLogger::Append(const wxString& msg, Logger::level lv)
{
    if (!control)
        return;

    ::temp_string.assign(msg);
    ::temp_string.append(_T("\n"));

    if (lv == caption)
    {
        control->SetDefaultStyle(style[info]);
        control->AppendText(::newline_string);

        control->SetDefaultStyle(style[lv]);
        control->AppendText(::temp_string);

        control->SetDefaultStyle(style[spacer]);
        control->AppendText(::newline_string);
    }
    else
    {
        control->SetDefaultStyle(style[lv]);
        control->AppendText(::temp_string);
    }
}

The F function as it was is kind of "semi threadsafe" insofar as all modules that were supposed to use it (mostly compiler, and maybe project mgr) run at most one thread per module. Now the TextCtrlLogger such as being used for the debug log on the other hand is definitely thread-unsafe, but I don't even understand why it uses the global here at all. Appending a newline would make much more sense on a plain normal local variable, no?

The function:
Code
Manager::Get()->GetLogManager()->DebugLog()
Manager::Get()->GetLogManager()->Log()
are totally thread un-safe, so they can only be used in the main GUI thread.

So, if you have debug messages to be sent from a worker thread, you need to implement your own thread safe log function.

BTW: Is it possible to use wxLog functions in wxWidgets, such as wxLogMessage, since they can redirect the messages to a wxTextCtrl, and I see in wx document, they are thread safe.
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5906
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: F() function is not thread safe?
« Reply #14 on: June 10, 2014, 07:50:45 am »
Let me give some conclusion:
1,  F() function can be optimized to becomes thread safe by returning a const wxString (returning a const wxString reference gives build error on my system just as OBF said)

2,  Our DebugLog() and Log() functions are not thread safe, because it will directly operate on the wxTextCtrl.

Here is my solution:
1, I don't intend to fix the F() function, instead I will add some comments around the F(). Reuse the global wxString should be much faster if we only call F() and DebugLog() in the main GUI thread.
2, I will create an alternative SAFE_F() function (which return a const wxString). It is used only in CodeCompletion plugin.
3, CodeCompletion plugin already implements its own thread safe logger functions (those functions can be called in multiply threads) by posting the log events from any threads to the main GUI.

EDIT: I will commits the changes if there is no further objections.
« Last Edit: January 21, 2015, 06:44:38 am by ollydbg »
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.