Author Topic: std::string_view is a good string candidate for Token class in our Lexer  (Read 129 times)

ollydbg

I once asked a question on Stack Overflow, "How to pass token kind with its associated information from lexer to preprocessor, then to parser", but unfortunately it never got an answer.

Today I noticed basic_string_view on cppreference (it arrives in C++17). It may be a good candidate for the string type in the Token class, since a view refers to a slice of the source buffer instead of owning a copy of the text.
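To make the idea concrete, here is a minimal sketch of a token that holds a std::string_view into the source buffer. It is only an illustration under assumptions: TokenKind, Token and tokenize are hypothetical names, not code from our lexer, and the scanning rules are deliberately simplistic.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical token kinds, for illustration only.
enum class TokenKind { Identifier, Number, Punct };

// A token stores a view into the source buffer instead of owning a copy.
struct Token {
    TokenKind kind;
    std::string_view text;  // valid only while the source buffer is alive
};

// Returns true for characters allowed inside an identifier.
inline bool isIdentChar(char ch) {
    return std::isalnum(static_cast<unsigned char>(ch)) || ch == '_';
}

// Splits `src` into tokens; no per-token string is allocated.
inline std::vector<Token> tokenize(std::string_view src) {
    std::vector<Token> tokens;
    std::size_t pos = 0;
    while (pos < src.size()) {
        const unsigned char c = static_cast<unsigned char>(src[pos]);
        if (std::isspace(c)) { ++pos; continue; }

        std::size_t end = pos;
        TokenKind kind;
        if (std::isalpha(c) || c == '_') {
            kind = TokenKind::Identifier;
            while (end < src.size() && isIdentChar(src[end])) ++end;
        } else if (std::isdigit(c)) {
            kind = TokenKind::Number;
            while (end < src.size() &&
                   std::isdigit(static_cast<unsigned char>(src[end]))) ++end;
        } else {
            kind = TokenKind::Punct;  // single-character punctuation
            end = pos + 1;
        }
        assert(end > pos);  // every iteration must consume input
        tokens.push_back(Token{kind, src.substr(pos, end - pos)});
        pos = end;
    }
    return tokens;
}
```

Each Token::text points into the buffer passed to tokenize, so the buffer has to outlive the tokens. That lifetime rule is the price for avoiding one std::string per token.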

I also found a project, ninja-build/ninja: a small build system with a focus on speed, which uses a similar class (see string_piece.h) in its lexer. The class carries this comment:
/// StringPiece represents a slice of a string whose memory is managed
/// externally.  It is useful for reducing the number of std::strings
/// we need to allocate.
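For readers without C++17, the same idea can be sketched in C++11 as a pointer plus a length. The StringSlice type below is a hypothetical illustration written for this post; it is not ninja's actual StringPiece, and its member names are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical non-owning slice of a string whose memory is managed elsewhere.
// NOT ninja's real StringPiece; names and members are assumptions.
struct StringSlice {
    const char* str;
    std::size_t len;

    StringSlice() : str(nullptr), len(0) {}
    StringSlice(const std::string& s) : str(s.data()), len(s.size()) {}
    StringSlice(const char* s, std::size_t n) : str(s), len(n) {
        assert(s != nullptr || n == 0);  // a null pointer is only valid for an empty slice
    }

    // Compares the referenced bytes, not the pointers.
    bool operator==(const StringSlice& other) const {
        return len == other.len &&
               (len == 0 || std::memcmp(str, other.str, len) == 0);
    }

    // Copies the bytes out; this is the only place an allocation happens.
    std::string AsString() const {
        return len ? std::string(str, len) : std::string();
    }
};
```

A lexer can hand out such slices for every token and call AsString() only where an owning copy is really needed, for example when a name is stored in a symbol table.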

Another project (by the author of cppcheck), danmar/simplecpp: C++ preprocessor, has this in its source file:
Code:
    /**
     * token class.
     * @todo don't use std::string representation - for both memory and performance reasons
     */
    class SIMPLECPP_LIB Token {
        Token(const TokenString &s, const Location &loc) :
            str(string), location(loc), previous(NULL), next(NULL), string(s) {
            // ... (body elided in this excerpt)
        }

        Token(const Token &tok) :
            str(string), macro(tok.macro), location(tok.location), previous(NULL), next(NULL), string(tok.str) {
            // ... (body elided in this excerpt)
        }
        // ... (remaining members elided in this excerpt)
    };

Luckily, there are back-ported string_view classes for C++11, for example bitwizeshift/string_view-standalone: A custom implementation of the C++17 'string_view' back-ported to c++11. So maybe we can try this before C++17 support arrives; let's hope there will be some news.  :)