Author Topic: vector<int> is OK, but string or wstring no-work.  (Read 91491 times)

Offline MortenMacFly

  • Administrator
  • Lives here!
  • *****
  • Posts: 9694
Re: vector<int> is OK, but string or wstring no-work.
« Reply #15 on: January 06, 2010, 05:03:38 pm »
But in the long run, I think adding a state is convenient :D
I know, that's why I applied the patch. ;-) However, there seems to be something wrong with the new implementation, where you can surely help... :P
Compiler logging: Settings->Compiler & Debugger->tab "Other"->Compiler logging="Full command line"
C::B Manual: https://www.codeblocks.org/docs/main_codeblocks_en.html
C::B FAQ: https://wiki.codeblocks.org/index.php?title=FAQ

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5915
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: vector<int> is OK, but string or wstring no-work.
« Reply #16 on: January 07, 2010, 01:13:12 pm »
I'm confused by your discussion. Does this bug come from the "real-time CC" or from my patch that modified the Tokenizer? :?
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Offline MortenMacFly

  • Administrator
  • Lives here!
  • *****
  • Posts: 9694
Re: vector<int> is OK, but string or wstring no-work.
« Reply #17 on: January 07, 2010, 01:22:41 pm »
Does this bug come from the "real-time CC" or from my patch that modified the Tokenizer? :?
From the new Tokenizer states. Have another look at my previous post. The two examples in the code obviously do the opposite of what was done previously. Is this behaviour intended?

Meaning, shouldn't we apply this patch:
Code
Index: src/plugins/codecompletion/parser/parserthread.cpp
===================================================================
--- src/plugins/codecompletion/parser/parserthread.cpp (revision 6058)
+++ src/plugins/codecompletion/parser/parserthread.cpp (working copy)
@@ -403,7 +403,7 @@
     // need to reset tokenizer's behavior
     // don't forget to reset that if you add any early exit condition!
     TokenizerState oldState = m_Tokenizer.GetState();
-    m_Tokenizer.SetState(tsSkipUnWanted);
+    m_Tokenizer.SetState(tsSkipNone);
 
     m_Str.Clear();
     m_LastToken.Clear();
@@ -1385,7 +1385,7 @@
         wxString next = m_Tokenizer.PeekToken(); // named namespace
         if (next==ParserConsts::opbrace)
         {
-            m_Tokenizer.SetState(tsSkipNone);
+            m_Tokenizer.SetState(tsSkipUnWanted);
 
             // use the existing copy (if any)
             Token* newToken = TokenExists(ns, m_pLastParent, tkNamespace);

However, you changed quite a lot related to skipping, so I might be wrong...

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5915
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: vector<int> is OK, but string or wstring no-work.
« Reply #18 on: January 07, 2010, 02:08:26 pm »
No. The first place is on entering the DoParse function; there we should set the state to "

Code
    // need to reset tokenizer's behavior
    // don't forget to reset that if you add any early exit condition!
    TokenizerState oldState = m_Tokenizer.GetState();
    m_Tokenizer.SetState(tsSkipUnWanted);
    m_Str.Clear();
    ....

This is the default setting.
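The comment in the snippet above warns that the saved state must be restored on every early exit. One way to make that automatic is a small scope guard. The following is only a sketch, assuming C++11: MockTokenizer is a stand-in for the real Tokenizer, the two-value enum models only the states named in this thread, and ScopedTokenizerState is a hypothetical helper that is not part of the Code::Blocks sources.

```cpp
#include <cassert>

// Simplified stand-in: only the two states named in this thread are modelled.
enum TokenizerState { tsSkipNone, tsSkipUnWanted };

// Stand-in for the real Tokenizer; it models nothing but the state.
class MockTokenizer
{
public:
    TokenizerState GetState() const { return m_State; }
    void SetState(TokenizerState state) { m_State = state; }
private:
    TokenizerState m_State = tsSkipUnWanted; // default, as described in the thread
};

// Hypothetical helper: sets a state on construction and restores the
// previous state on destruction, whichever way the scope is left.
class ScopedTokenizerState
{
public:
    ScopedTokenizerState(MockTokenizer& tokenizer, TokenizerState state)
        : m_Tokenizer(tokenizer), m_OldState(tokenizer.GetState())
    {
        m_Tokenizer.SetState(state);
    }
    ~ScopedTokenizerState() { m_Tokenizer.SetState(m_OldState); }
    ScopedTokenizerState(const ScopedTokenizerState&) = delete;
    ScopedTokenizerState& operator=(const ScopedTokenizerState&) = delete;
private:
    MockTokenizer& m_Tokenizer;
    TokenizerState m_OldState;
};

// Models a parse function with an early exit; returns the state seen inside.
TokenizerState ParseWithEarlyExit(MockTokenizer& tokenizer, bool exitEarly)
{
    ScopedTokenizerState guard(tokenizer, tsSkipUnWanted);
    assert(tokenizer.GetState() == tsSkipUnWanted);
    if (exitEarly)
        return tokenizer.GetState(); // the guard still restores the old state
    // ... the normal parsing work would go here ...
    return tokenizer.GetState();
}
```

With such a guard, adding a new early return would not require a matching manual SetState(oldState) call.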

Offline MortenMacFly

  • Administrator
  • Lives here!
  • *****
  • Posts: 9694
Re: vector<int> is OK, but string or wstring no-work.
« Reply #19 on: January 07, 2010, 02:35:01 pm »
No, the first place when entering the DoParse function. we should set the state to "
Could you explain '' a little more? Set to what exactly? And what's with the second one?

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5915
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: vector<int> is OK, but string or wstring no-work.
« Reply #20 on: January 07, 2010, 02:55:58 pm »
No, the first place when entering the DoParse function. we should set the state to "
Could you explain '' a little more? Set to what exactly? And what's with the second one?

OK, let me explain the old way (before the patch to the Tokenizer state related code).
There is a variable named m_SkipUnwantedTokens, which is initialised to true in the constructor of Tokenizer. So by default, in the normal situation, the Tokenizer skips some tokens. Only in rare cases is this variable set to "false", so that the Tokenizer does not skip anything.

So in the new way (after applying my patch), I set the state to "tsSkipUnWanted" there, which gives the same behaviour as the old default.
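The correspondence between the old boolean flag and the new state can be sketched as a thin adapter. This is an illustration under assumptions, not the actual Tokenizer: MockTokenizer is a stand-in, and the enum contains only the two values mentioned in this thread.

```cpp
#include <cassert>

// Simplified stand-in: only the two states named in this thread are modelled.
enum TokenizerState { tsSkipNone, tsSkipUnWanted };

class MockTokenizer
{
public:
    // New-style interface.
    TokenizerState GetState() const { return m_State; }
    void SetState(TokenizerState state) { m_State = state; }

    // Old-style interface, expressed through the new state:
    // true  corresponds to tsSkipUnWanted (the default),
    // false corresponds to tsSkipNone (skip nothing).
    bool IsSkippingUnwantedTokens() const { return m_State == tsSkipUnWanted; }
    void SetSkipUnwantedTokens(bool skip)
    {
        m_State = skip ? tsSkipUnWanted : tsSkipNone;
    }

private:
    TokenizerState m_State = tsSkipUnWanted; // same default as the old "true"
};
```

Read this way, the old call SetSkipUnwantedTokens(false) and the new call SetState(tsSkipNone) are meant to have the same effect.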


For the second place:
I'm sorry, that's my mistake, you are right!

« Last Edit: January 07, 2010, 03:11:53 pm by ollydbg »

Offline MortenMacFly

  • Administrator
  • Lives here!
  • *****
  • Posts: 9694
Re: vector<int> is OK, but string or wstring no-work.
« Reply #21 on: January 07, 2010, 03:05:07 pm »
OK, let me explain the old way( before the patch of Tokenizer state related code). [...]
Thanks, that made things clear to me.

So finally, what happened to:
Code
m_Tokenizer.SetOperatorState(true);
This is not applied anymore if I got the new implementation right. Look again:
For example (void ParserThread::DoParse()):
Before:
Code
        else if (token==ParserConsts::kw_operator)
        {
            bool oldState = m_Tokenizer.IsSkippingUnwantedTokens();
            m_Tokenizer.SetSkipUnwantedTokens(false);
            m_Tokenizer.SetOperatorState(true);
After:
Code
            TokenizerState oldState = m_Tokenizer.GetState();
            m_Tokenizer.SetState(tsSkipNone);
Any comments on that...?!

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5915
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: vector<int> is OK, but string or wstring no-work.
« Reply #22 on: January 07, 2010, 03:18:14 pm »
So finally, what happened to:
Code
m_Tokenizer.SetOperatorState(true);
This is not applied anymore if I got the new implementation right. Look again:
For example (void ParserThread::DoParse()):
Before:
Code
        else if (token==ParserConsts::kw_operator)
        {
            bool oldState = m_Tokenizer.IsSkippingUnwantedTokens();
            m_Tokenizer.SetSkipUnwantedTokens(false);
            m_Tokenizer.SetOperatorState(true);
After:
Code
            TokenizerState oldState = m_Tokenizer.GetState();
            m_Tokenizer.SetState(tsSkipNone);
Any comments on that...?!


Personally, I don't think we need another variable to record that we are after the operator keyword.

for example:

Code
void AAA::operator + (XXXX)
or
void AAA::operator = (XXXX)

In such situations, the token after "operator" should be returned from the GetToken() function (it must not be skipped). So I think that once we meet the "operator" keyword, we need to set the TokenizerState to "tsSkipNone".

Is that OK?  :D
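The effect described here can be tried out with a toy model. Everything in this sketch is an assumption made for illustration: ToyTokenizer splits on whitespace only, and its "skip unwanted" rule (drop an "=" and everything up to the next ";" or ",") is a simplification, not the real Tokenizer logic.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

enum TokenizerState { tsSkipNone, tsSkipUnWanted };

// Toy tokenizer: whitespace-separated tokens, hypothetical skipping rule.
class ToyTokenizer
{
public:
    explicit ToyTokenizer(const std::string& text) : m_In(text) {}
    void SetState(TokenizerState state) { m_State = state; }

    // Returns the next token, or an empty string at the end of the input.
    std::string GetToken()
    {
        std::string word;
        bool skipping = false;
        while (m_In >> word)
        {
            const bool terminator = (word == ";" || word == ",");
            if (skipping && !terminator)
                continue;            // still inside the dropped part
            skipping = false;
            if (m_State == tsSkipUnWanted && word == "=")
            {
                skipping = true;     // drop "=" and what follows it
                continue;
            }
            return word;
        }
        return std::string();
    }

private:
    std::istringstream m_In;
    TokenizerState m_State = tsSkipUnWanted;
};

// Collects the tokens a parser loop would see. When switchOnOperator is
// true, the loop switches to tsSkipNone as soon as it meets "operator".
std::vector<std::string> Collect(const std::string& text, bool switchOnOperator)
{
    ToyTokenizer tokenizer(text);
    std::vector<std::string> seen;
    for (std::string tok = tokenizer.GetToken(); !tok.empty(); tok = tokenizer.GetToken())
    {
        seen.push_back(tok);
        if (switchOnOperator && tok == "operator")
            tokenizer.SetState(tsSkipNone);
    }
    return seen;
}
```

For the input "void AAA :: operator = ( XXXX ) ;", the loop that switches state sees the "=" right after "operator", while the loop that stays in tsSkipUnWanted never sees which operator was declared. The real code also restores the old state afterwards, which this toy omits.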

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5915
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: vector<int> is OK, but string or wstring no-work.
« Reply #23 on: January 11, 2010, 03:12:16 am »
Test report:
rev 6067 trunk windows.
CodeCompletion testing workspace:
function_args.cpp does not fully pass.
Code
//    i_integer = from.i_;
//    f_float   = from.f_;
stl.cpp does not fully pass.
Code
  std::string ss;
  ss.
stl_namespace.cpp does not fully pass.
Code
  string s;
  s.
structs_typedefs.cpp does not fully pass.
Code
    std::string ss;
    my_string   ms;
    ss.

I'm trying to find the bug, but I'm a little busy these days :?

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5915
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: vector<int> is OK, but string or wstring no-work.
« Reply #24 on: January 11, 2010, 08:41:02 am »
OK, after further testing, I fully agree with blueshake: there are no ancestors in the current "string" Token. Since "string" has no ancestors, we don't get the member list for string in auto-completion.
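Why a missing ancestor leaves the completion list empty can be shown with a toy token tree. This is a sketch under assumptions: ToyToken, ToyTree and CompletionList are hypothetical names, and the real Token class stores far more than this.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, much simplified token record.
struct ToyToken
{
    std::string ancestor;             // type this token aliases or derives from; empty if none
    std::vector<std::string> members; // names declared directly in this token
};

typedef std::map<std::string, ToyToken> ToyTree;

// Collects the members of 'name' plus those reached through its ancestor chain.
std::vector<std::string> CompletionList(const ToyTree& tree, const std::string& name)
{
    std::vector<std::string> result;
    std::string current = name;
    // The depth bound guards against cyclic ancestor chains.
    for (std::size_t depth = 0; depth <= tree.size() && !current.empty(); ++depth)
    {
        ToyTree::const_iterator it = tree.find(current);
        if (it == tree.end())
            break;
        result.insert(result.end(), it->second.members.begin(), it->second.members.end());
        current = it->second.ancestor;
    }
    return result;
}
```

If the "string" entry names "basic_string" as its ancestor, the list contains the basic_string members. If the ancestor is empty, as observed in the thread, the list for "string" is empty.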

Offline MortenMacFly

  • Administrator
  • Lives here!
  • *****
  • Posts: 9694
Re: vector<int> is OK, but string or wstring no-work.
« Reply #25 on: January 11, 2010, 08:49:24 am »
Ok, have a further test, [...]
Just for the record: I've implemented some more debugging facilities with my last commit. You can now save the tokens tree to an ASCII file from the CC debug window.

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5915
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: vector<int> is OK, but string or wstring no-work.
« Reply #26 on: January 11, 2010, 08:57:21 am »
Ok, have a further test, [...]
Just for the record: I've implemented some more debugging facilities with my last commit. You can now save the tokens tree to an ASCII file from the CC debug window.

Yes, I noticed that. It is a great improvement!

Offline MortenMacFly

  • Administrator
  • Lives here!
  • *****
  • Posts: 9694
Re: vector<int> is OK, but string or wstring no-work.
« Reply #27 on: January 11, 2010, 12:22:06 pm »
OK, after further testing, I fully agree with blueshake: there are no ancestors in the current "string" Token. Since "string" has no ancestors, we don't get the member list for string in auto-completion.
I wonder if the modification you made in void ParserThread::ReadClsNames(wxString& ancestor) is 100% correct. Notice this code snippet:
Before:
Code
        else if (   wxIsalpha(current.GetChar(0))
                 && (   (m_Tokenizer.PeekToken() == ParserConsts::semicolon)
                     || (m_Tokenizer.PeekToken() == ParserConsts::comma)) )
        {
            TRACE(_T("ReadClsNames() : Adding variable '%s' as '%s' to '%s'"),
                  current.wx_str(),
                  m_Str.wx_str(),
                  (m_pLastParent ? m_pLastParent->m_Name.wx_str():_T("<no-parent>")));

            Token* newToken = DoAddToken(tkTypedef, current, m_Tokenizer.GetLineNumber());
            if (!newToken)
                break;
            else
            {
                wxString tempAncestor       = ancestor;
                newToken->m_AncestorsString = tempAncestor;
                newToken->m_ActualType      = tempAncestor;
                newToken->m_Type            = tempAncestor;
            }
        }
After:
Code
        else if (   wxIsalpha(current.GetChar(0))
                 && (   (m_Tokenizer.PeekToken() == ParserConsts::semicolon)
                     || (m_Tokenizer.PeekToken() == ParserConsts::comma)) )
        {
            TRACE(_T("ReadClsNames() : Adding variable '%s' as '%s' to '%s'"),
                  current.wx_str(),
                  m_Str.wx_str(),
                  (m_pLastParent ? m_pLastParent->m_Name.wx_str():_T("<no-parent>")));

            m_Str.clear();
            wxString tempAncestor = ancestor;
            m_Str = tempAncestor;
            Token* newToken = DoAddToken(tkTypedef, current, m_Tokenizer.GetLineNumber());
            if (!newToken)
                break;
            else
            {
                newToken->m_AncestorsString = tempAncestor;
                //newToken->m_ActualType      = tempAncestor;
                //if (m_IsPointer)
                //{
                //    newToken->m_Type = tempAncestor + _T("*");
                //}
                //else
                //newToken->m_Type            = tempAncestor;
            }
        }
You are not setting m_Type etc. any more; that code is commented out. Is this intended (and if so, why)?

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5915
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: vector<int> is OK, but string or wstring no-work.
« Reply #28 on: January 11, 2010, 01:49:59 pm »
I just did a test: I created a simple cpp file from basic_string.

It seems the ParserThread fails to parse these statements:

Code
  template<typename _CharT, typename _Traits, typename _Alloc>
    inline basic_string<_CharT, _Traits, _Alloc>::
    basic_string()
#ifndef _GLIBCXX_FULLY_DYNAMIC_STRING
    : _M_dataplus(_S_empty_rep()._M_refdata(), _Alloc()) { }
#else
    : _M_dataplus(_S_construct(size_type(), _CharT(), _Alloc()), _Alloc()) { }
#endif

  // operator+
  /**
   *  @brief  Concatenate two strings.
   *  @param lhs  First string.
   *  @param rhs  Last string.
   *  @return  New string with value of @a lhs followed by @a rhs.
   */
  template<typename _CharT, typename _Traits, typename _Alloc>
    basic_string<_CharT, _Traits, _Alloc>
    operator+(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
     const basic_string<_CharT, _Traits, _Alloc>& __rhs)
    {
      basic_string<_CharT, _Traits, _Alloc> __str(__lhs);
      __str.append(__rhs);
      return __str;
    }

I say this because, when I read the TRACE debug output, this code failed to parse.
You can copy the content of testString.txt (see the attachment to this post) into an empty project and set
#define PARSERTHREAD_DEBUG_OUTPUT 1

Then view the debug log output.

Note: testString.txt is just a modified copy of the STL header file for basic_string.
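For readers unfamiliar with this kind of switch, here is a minimal sketch of how a compile-time debug flag can turn TRACE calls on and off. It is not the actual parserthread.cpp code: g_DebugLog is a hypothetical in-memory stand-in for the debug log, the TRACE definition here is an assumption, and C++11 is assumed for std::snprintf.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

#define PARSERTHREAD_DEBUG_OUTPUT 1

// Hypothetical in-memory log standing in for the debug log window.
static std::vector<std::string> g_DebugLog;

#if PARSERTHREAD_DEBUG_OUTPUT
    // Format the message printf-style and append it to the log.
    #define TRACE(...)                                              \
        do {                                                        \
            char buffer[256];                                       \
            std::snprintf(buffer, sizeof(buffer), __VA_ARGS__);     \
            g_DebugLog.push_back(buffer);                           \
        } while (false)
#else
    // With the flag set to 0, TRACE calls compile to nothing.
    #define TRACE(...) do {} while (false)
#endif

// Models one iteration of a parse loop that reports the current token.
void HandleToken(const char* token)
{
    TRACE("DoParse() : Loop:m_Str='%s', token='%s'", "", token);
}
```

With the flag set to 1, each HandleToken call adds one line to the log in the same style as the debug output quoted in this thread.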
« Last Edit: January 12, 2010, 02:55:27 am by ollydbg »

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5915
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: vector<int> is OK, but string or wstring no-work.
« Reply #29 on: January 11, 2010, 02:04:21 pm »
Here are the last lines of the debug log:
Code
DoAddToken() : Found token (parent).
DoAddToken() : Created token='compare', file_idx=1, line=2016
GetActualTokenType() : Searching within m_Str='int'
GetActualTokenType() : Compensated m_Str='int'
GetActualTokenType() : Found 'int'
DoAddToken() : Prepending ''
DoAddToken() : Added/updated token 'compare' (172), type 'int', actual 'int'. Parent is basic_string (1)
DoParse() : Loop:m_Str='', token=';'
DoParse() : Loop:m_Str='', token='int'
DoParse() : Loop:m_Str='int ', token='compare'
HandleFunction() : Adding function 'compare': m_Str='int '
HandleFunction() : name='compare', args='(size_type __pos, size_type __n1, const _CharT* __s, size_type __n2)', peek='const'
HandleFunction() : !(Ctor/Dtor) 'compare', m_Str='int ', localParent='<none>'
HandleFunction() : Adding function 'compare', ': m_Str='int ', enc_ns='nil'.
HandleFunction() : Add token name='compare', args='(size_type __pos, size_type __n1, const _CharT* __s, size_type __n2)', return type='int '
GetStrippedArgs() : args='(size_type __pos, size_type __n1, const _CharT* __s, size_type __n2)'.
GetStrippedArgs() : stripped_args='(size_type,size_type,const _CharT*,size_type)'.
DoAddToken() : Found token (parent).
DoAddToken() : Created token='compare', file_idx=1, line=2042
GetActualTokenType() : Searching within m_Str='int'
GetActualTokenType() : Compensated m_Str='int'
GetActualTokenType() : Found 'int'
DoAddToken() : Prepending ''
DoAddToken() : Added/updated token 'compare' (173), type 'int', actual 'int'. Parent is basic_string (1)
DoParse() : Loop:m_Str='', token=';'
DoParse() : Loop:m_Str='', token='}'
DoParse() : Loop:m_Str='', token='template'
DoParse() : template argument='<typename _CharT, typename _Traits, typename _Alloc>', token ='inline'
DoParse() : Loop:m_Str='', token='#'
HandlePreprocessorBlocks() : Saving nesting level: 1
HandlePreprocessorBlocks() : Restoring nesting level: 1 (was 1)
DoParse() : Loop:m_Str='', token='template'
DoParse() : template argument='<typename _CharT, typename _Traits, typename _Alloc>', token ='basic_string'
DoParse() : Loop:m_Str='', token='template'
DoParse() : template argument='<typename _CharT, typename _Traits, typename _Alloc>', token ='basic_string'
DoParse() : Loop:m_Str='', token='template'
DoParse() : template argument='<typename _CharT, typename _Traits, typename _Alloc>', token ='basic_string'
DoParse() : Loop:m_Str='', token='template'
DoParse() : template argument='<typename _CharT, typename _Traits, typename _Alloc>', token ='inline'
DoParse() : Loop:m_Str='', token='template'
DoParse() : template argument='<typename _CharT, typename _Traits, typename _Alloc>', token ='inline'
DoParse() : Loop:m_Str='', token='template'
DoParse() : template argument='<typename _CharT, typename _Traits, typename _Alloc>', token ='inline'
DoParse() : Loop:m_Str='', token='}'
DoParse() : Loop:m_Str='', token='template'
DoParse() : template argument='<typename _CharT, typename _Traits, typename _Alloc>', token ='inline'
Parsing stage done (1 total parsed files, 174 tokens in 0 minute(s), 8.377 seconds).
Updating class browser...
Class browser updated.