
wxString support in wxWidgets 3.0 problem?


ollydbg:

--- Quote from: BlueHazzard on October 18, 2013, 04:34:42 am ---In c++ times the pointer-way is the bad way ;) Better would be to use iterators...
but i think wx2.8 has no support for string iterators -.-

--- End quote ---
Yes, I agree.

What I think is the better way:
Use std::string, not wxString, to hold the source file buffers internally in the CC plugin. Even the wxWidgets documentation suggests this:
http://docs.wxwidgets.org/trunk/classwx_string.html

--- Quote ---String class for passing textual data to or receiving it from wxWidgets.

Note
    While the use of wxString is unavoidable in wxWidgets program, you are encouraged to use the standard string classes std::string or std::wstring in your applications and convert them to and from wxString only when interacting with wxWidgets.

--- End quote ---
With that, all the source code is held internally as UTF-8 (stored in std::string). I have already created a simple, faster lexer with Quex; see the post "Quex lexer grammar, probably can make our tokenizer much faster" for details. The generated lexer is all C/C++ code, and it is about three times faster than a Flex-generated lexer.

Note: ctags internally uses the char type too.

ollydbg:
Another issue is string construction. As you know, every token string is in fact a substring of the source file. (In some special cases the token is replaced by a macro expansion, but we can create an auxiliary source string to hold all the expanded strings.)

What a lexer does is locate the start point and the end point of each lexeme. For example, in the source code

--- Code: ---int main ( ) { int a; .....
    ^   $

--- End code ---
Note that when a lexeme is found, the lexer (the Quex lexer) knows the start position "^" and the end position "$", and it also has a Type enum value, which in this case is "identifier". It is up to the user to handle this information, so if you have a Token class like the one below:

--- Code: ---class CCToken
{
public:
    std::string name; // copy of the lexeme text
    TokenType   type; // e.g. identifier, keyword
};

--- End code ---
The user has to construct the CCToken instance by copying memory from the source buffer into the name member variable, and then set the type member variable.

I think a better way is:

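--- Code: ---class CCToken
{
public:
    int  source_index;   // which source buffer the lexeme lives in
    int  lexeme_start;   // offset of the first character in that buffer
    int  lexeme_length;  // number of characters in the lexeme
    TokenType  type;
};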
--- End code ---
Here the first member is the index of the source buffer, and the other two members record the start position and the length of the lexeme within that buffer.

Maybe we can supply a member function like "std::string CCToken::ToStdString()", which returns a newly constructed std::string. In most cases, I think we don't need lexeme_start and lexeme_length at all, because we only need to know the TokenType. For example, there are TokenTypes like "keyword_class", "keyword_public", ...

