Code::Blocks Forums

Developer forums (C::B DEVELOPMENT STRICTLY!) => Development => CodeCompletion redesign => Topic started by: ollydbg on March 10, 2017, 08:09:07 am

Title: Predictive recursive decent parser question about our build-in parser
Post by: ollydbg on March 10, 2017, 08:09:07 am
We have such code in our parser:

    while (m_Tokenizer.NotEOF() && IS_ALIVE)
        wxString token = m_Tokenizer.GetToken();
        if (token == ParserConsts::kw_class)
            if (m_Options.handleClasses)
                SkipToOneOfChars(ParserConsts::semicolonclbrace, true, true);

While, the function "GetToken" just consume the Token from the Token Stream, and advance the Token position.

But I see for a standard implementation, it usually use "PeekToken", and consume the Token later. See here: Recursive descent parser implementation (

Any ideas about the difference?

BTW: I have post a question in parsing - Are the peeked token or current token the same thing in recursive decent parser implementation? - Stack Overflow (
Title: Re: Predictive recursive decent parser question about our build-in parser
Post by: ollydbg on March 11, 2017, 03:48:06 pm
OK, in this clang based code browser: Parser.h source code [clang/include/clang/Parse/Parser.h] - Woboq Code Browser (, I see such code and comments:

class Parser : public CodeCompletionHandler {

  Preprocessor &PP;
  /// Tok - The current token we are peeking ahead.  All parsing methods assume
  /// that this is valid.
  Token Tok;
  // PrevTokLocation - The location of the token we previously
  // consumed. This token is used for diagnostics where we expected to
  // see a token following another token (e.g., the ';' at the end of
  // a statement).
SourceLocation PrevTokLocation;

  /// GetLookAheadToken - This peeks ahead N tokens and returns that token
  /// without consuming any tokens.  LookAhead(0) returns 'Tok', LookAhead(1)
  /// returns the token after Tok, etc.
  /// Note that this differs from the Preprocessor's LookAhead method, because
  /// the Parser always has one token lexed that the preprocessor doesn't.
  const Token &GetLookAheadToken(unsigned N) {
    if (N == 0 || return Tok;
    return PP.LookAhead(N-1);
  /// NextToken - This peeks ahead one token and returns it without
  /// consuming it.
  const Token &NextToken() {
    return PP.LookAhead(0);

Especially the comments for the Parser::GetLookAheadToken() function, and I see that the Tok member variable is very important, and it stands for the "peeked token", this is the cache. This value is the same thing as I can see from the wikipedia's code: Recursive_descent_parser's C implementation (

But in the Preprocessor(see the PP variable), the peeked token is the one in the Token queue, but this is the one we returned as next token (see const Token &NextToken()  in the above code snippet).

Now, back to our build-in parser, I think we may need a cache variable which store the value from the "m_Tokenizer.GetToken()", while we call "m_Tokenizer.PeekToken()", this actually means the "Token &NextToken()" as in clang's Parser implementation.

EDIT: this tutorial also shows the same method(cached Tok), see here: 2. Kaleidoscope: Implementing a Parser and AST — LLVM 5 documentation (