Author Topic: wxString support in wxWidgets 3.0 problem? (Read 32526 times)

ollydbg · « **Reply #15 on:** October 18, 2013, 04:49:21 am »

Quote from: BlueHazzard on October 18, 2013, 04:34:42 am

In C++ times the pointer way is the bad way. Better would be to use iterators...
but I think wx2.8 has no support for string iterators -.-

Yes, I agree.

I think the better way is this:
Use std::string, not wxString, to hold the source file buffers internally in the CC plugins. Even the wxWidgets documentation suggests this:
http://docs.wxwidgets.org/trunk/classwx_string.html

Quote

String class for passing textual data to or receiving it from wxWidgets.

Note
While the use of wxString is unavoidable in wxWidgets program, you are encouraged to use the standard string classes std::string or std::wstring in your applications and convert them to and from wxString only when interacting with wxWidgets.

Then all the internal source code would be encoded in UTF-8 (stored in std::string). I have already created a simple, faster lexer with Quex; see the post "Quex lexer grammar, probably can make our tokenizer much faster" for details. The generated lexer is all C/C++ code, and it is about three times faster than a Flex-generated lexer.

Note: ctags internally uses the char type too.

ollydbg · « **Reply #16 on:** October 18, 2013, 05:13:36 am »

Another issue is string construction. As you know, all token strings are in fact substrings of the source file. (In some special cases the token is replaced by a macro expansion, but we can create an auxiliary source string to hold all the expanded strings.)

What a lexer does is locate the start point and the end point of the lexeme, for example in this source code:

Code

int main ( ) { int a; .....
    ^   $

Note that when a lexeme is found, the lexer (the Quex lexer) knows the start position "^" and the end position "$"; it also has the type as an enum value, in this case "identifier". It is up to the user to handle this information, so if you have a Token class like the one below:

Code

class CCToken
{
    std::string name;   // copy of the lexeme text
    TokenType   type;   // kind of token, e.g. identifier
};

The user has to construct the CCToken instance by copying memory from the source code into the name member variable, and then set the type member variable.

I think a better way is:

Code

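class CCToken
{
    int  source_index;    // which source buffer the lexeme lives in
    int  lexeme_start;    // start position inside that buffer
    int  lexeme_length;   // number of characters in the lexeme
    TokenType  type;
};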

Here, the first member is the index of the source buffer; the other two record the start position and the length of the lexeme.

Maybe we can supply a member function like "std::string CCToken::ToStdString()", which returns a newly constructed std::string. In most cases, I think we don't need to use lexeme_start and lexeme_length, because we only need to know the TokenType. For example, there are TokenTypes like "keyword_class", "keyword_public", ...

Code::Blocks Forums
