This evening, when I have more free time, I will write a better overview of the parser (wxWidgets project) and post it tomorrow (I do not have Internet access at home).
I have done some more work on the C++ parser of the Mini C++ interpreter.
The Mini C++ interpreter supports the following features:
- Parameterized functions with local variables
- Nested scopes
- Recursion
- The if, switch and break statements
- The do-while, while and for loops
- Function parameters of type int and char
- Integer and character constants
- String constants (limited implementation)
- The return statement, both with and without a value
- A handful of standard library functions
- The operators +,-,*,/,%,<,>,<=,>=,==,!=,++,--,unary - and unary +
- Functions returning integers
- /* and // comments
- Console I/O via cin and cout
Classes are not supported. The default statement for switch is not supported. The targets of if, while, do and for must be blocks of code surrounded by opening and closing braces.
The (expression) parser for C++ is a recursive-descent parser and not, as many commercial parsers are, a table-driven parser. Table-driven parsers are faster, but also harder to implement. A recursive-descent parser is essentially a collection of mutually recursive functions that process an expression.
The parser for C++ is not implemented as a class, but as a set of functions.
The most important function of the parser is get_token(). The get_token() function returns tokens from the source code. The function begins by skipping over all white space (including carriage returns and line feeds) and comments. Then the next token in the program is read (each category is handled separately). For example, if the next character in the program is a digit, a number is read; if the next character is a letter, an identifier or keyword is obtained, and so on.
The string representation of the token is placed into token. Once read, the token's type (as enumerated by the tok_types enumeration) is put into token_type and, if the token is a keyword, its internal representation (as enumerated by token_ireps) is assigned to tok via the look_up() function.
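To make the description above more concrete, here is a minimal sketch of such a tokenizer. It is not the code from minicpp.cpp: the Lexer struct, the skip_blanks() helper and the enumerator names are my own assumptions for illustration (the real parser may well use global variables and has many more token kinds). Only the names token, token_type, tok, tok_types, token_ireps, look_up() and get_token() are taken from the description.

```cpp
#include <cctype>
#include <string>

// Hypothetical enumerators; the real tok_types/token_ireps lists are longer.
enum tok_types { UNDEFTT, DELIMITER, IDENTIFIER, NUMBER, KEYWORD, END_OF_PROG };
enum token_ireps { UNDEFTOK, IF, WHILE, RETURN };

// Assumption: the lexer state is bundled in a struct to keep the sketch self-contained.
struct Lexer {
    const char* prog = nullptr;      // current position in the source text
    std::string token;               // string representation of the current token
    tok_types token_type = UNDEFTT;  // category of the current token
    token_ireps tok = UNDEFTOK;      // internal representation if it is a keyword
};

// Map a keyword to its internal representation, or UNDEFTOK if it is not a keyword.
token_ireps look_up(const std::string& s) {
    if (s == "if") return IF;
    if (s == "while") return WHILE;
    if (s == "return") return RETURN;
    return UNDEFTOK;
}

// Skip white space (including CR/LF) and both comment styles.
void skip_blanks(Lexer& lx) {
    for (;;) {
        while (std::isspace(static_cast<unsigned char>(*lx.prog))) ++lx.prog;
        if (lx.prog[0] == '/' && lx.prog[1] == '/') {
            while (*lx.prog && *lx.prog != '\n') ++lx.prog;
        } else if (lx.prog[0] == '/' && lx.prog[1] == '*') {
            lx.prog += 2;
            while (*lx.prog && !(lx.prog[0] == '*' && lx.prog[1] == '/')) ++lx.prog;
            if (*lx.prog) lx.prog += 2;  // step over the closing "*/"
        } else {
            return;
        }
    }
}

// Read the next token; each category is handled separately.
tok_types get_token(Lexer& lx) {
    lx.token.clear();
    lx.tok = UNDEFTOK;
    skip_blanks(lx);
    unsigned char c = static_cast<unsigned char>(*lx.prog);
    if (c == '\0') {
        lx.token_type = END_OF_PROG;
    } else if (std::isdigit(c)) {                  // digit: read a number
        while (std::isdigit(static_cast<unsigned char>(*lx.prog)))
            lx.token += *lx.prog++;
        lx.token_type = NUMBER;
    } else if (std::isalpha(c) || c == '_') {      // letter: identifier or keyword
        while (std::isalnum(static_cast<unsigned char>(*lx.prog)) || *lx.prog == '_')
            lx.token += *lx.prog++;
        lx.tok = look_up(lx.token);
        lx.token_type = (lx.tok != UNDEFTOK) ? KEYWORD : IDENTIFIER;
    } else {                                       // anything else: one-character delimiter
        lx.token += *lx.prog++;
        lx.token_type = DELIMITER;
    }
    return lx.token_type;
}
```

To use it, point lx.prog at a NUL-terminated source string and call get_token(lx) repeatedly until it returns END_OF_PROG. Two-character operators such as <= or ++, character constants and string constants are left out of this sketch.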
An interesting function implemented in the parser is eval_exp(), which is used to evaluate a C++ expression, e.g., 10-3*2.
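For readers who have not seen a recursive-descent evaluator before, the following sketch shows the general technique on integer expressions. It is not the actual eval_exp() from the interpreter: the function names (eval_expression, parse_additive, parse_term, parse_unary, parse_primary) and the global cursor p are hypothetical, and it only covers +, -, *, /, %, unary signs and parentheses. Each precedence level gets its own function, and the functions call each other recursively.

```cpp
#include <cctype>
#include <stdexcept>

static const char* p;  // hypothetical cursor into the expression text

static void skip_spaces() {
    while (std::isspace(static_cast<unsigned char>(*p))) ++p;
}

int parse_additive();  // forward declaration: the functions are mutually recursive

// primary := number | '(' additive ')'
int parse_primary() {
    skip_spaces();
    if (*p == '(') {
        ++p;
        int v = parse_additive();
        skip_spaces();
        if (*p != ')') throw std::runtime_error("expected ')'");
        ++p;
        return v;
    }
    if (!std::isdigit(static_cast<unsigned char>(*p)))
        throw std::runtime_error("expected a number");
    int v = 0;
    while (std::isdigit(static_cast<unsigned char>(*p)))
        v = v * 10 + (*p++ - '0');
    return v;
}

// unary := ('-' | '+') unary | primary
int parse_unary() {
    skip_spaces();
    if (*p == '-') { ++p; return -parse_unary(); }
    if (*p == '+') { ++p; return parse_unary(); }
    return parse_primary();
}

// term := unary (('*' | '/' | '%') unary)*
int parse_term() {
    int v = parse_unary();
    for (;;) {
        skip_spaces();
        char op = *p;
        if (op != '*' && op != '/' && op != '%') return v;
        ++p;
        int rhs = parse_unary();
        if (op == '*') {
            v *= rhs;
        } else {
            if (rhs == 0) throw std::runtime_error("division by zero");
            v = (op == '/') ? v / rhs : v % rhs;
        }
    }
}

// additive := term (('+' | '-') term)*
int parse_additive() {
    int v = parse_term();
    for (;;) {
        skip_spaces();
        char op = *p;
        if (op != '+' && op != '-') return v;
        ++p;
        int rhs = parse_term();
        v = (op == '+') ? v + rhs : v - rhs;
    }
}

// Entry point of the sketch: evaluate a complete expression string.
int eval_expression(const char* text) {
    p = text;
    int v = parse_additive();
    skip_spaces();
    if (*p != '\0') throw std::runtime_error("unexpected trailing input");
    return v;
}
```

With this sketch, eval_expression("10-3*2") yields 4, because parse_additive() hands 3*2 to parse_term() before it performs the subtraction. That is how operator precedence falls out of the call structure without any table.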
The main() function (located in the minicpp.cpp file) is also a useful source of information. The main() function begins by allocating memory to hold the program being interpreted. The largest program that can be interpreted is specified by the constant PROG_SIZE (arbitrarily set at 10,000, but can be modified). Next, the program is loaded by calling load_program(). After the program has been loaded, main() performs three actions:
- It calls prescan(), the interpreter prescanner
- It readies the interpreter for the call to main() by finding its location in the program
- It executes call(), which begins the execution of the program at the start of main()
The interpreter prescanner performs two important tasks:
- All global variables must be found and initialized
- The location of each function defined in the program must be found
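The two prescanner tasks listed above can be sketched roughly as follows. This is only an illustration of the idea, not the prescan() from minicpp.cpp: the PrescanResult struct, read_word() and prescan_sketch() are hypothetical names, and I assume here that a function's location is recorded as the offset of its opening parenthesis and that globals are initialized to 0. The sketch tracks the brace depth so that only declarations outside of any function body are considered.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical result type for the sketch.
struct PrescanResult {
    std::map<std::string, int> globals;            // global name -> initial value (0 here)
    std::map<std::string, std::size_t> functions;  // function name -> offset of its '('
};

// Skip white space, then read an identifier starting at src[i]; advances i.
static std::string read_word(const std::string& src, std::size_t& i) {
    while (i < src.size() && std::isspace(static_cast<unsigned char>(src[i]))) ++i;
    std::string w;
    while (i < src.size() &&
           (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_'))
        w += src[i++];
    return w;
}

PrescanResult prescan_sketch(const std::string& src) {
    PrescanResult r;
    int depth = 0;  // current brace nesting; 0 means global scope
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (c == '{') { ++depth; ++i; continue; }
        if (c == '}') { --depth; ++i; continue; }
        if (depth == 0 && (std::isalpha(c) || c == '_')) {
            std::string type = read_word(src, i);
            if (type == "int" || type == "char") {
                std::string name = read_word(src, i);
                while (i < src.size() && std::isspace(static_cast<unsigned char>(src[i]))) ++i;
                if (i < src.size() && src[i] == '(') {
                    r.functions[name] = i;  // remember where the function starts
                    // Skip the parameter list so parameters are not taken for globals.
                    while (i < src.size() && src[i] != ')') ++i;
                } else {
                    r.globals[name] = 0;    // found a global variable; initialize it
                }
            }
            continue;
        }
        ++i;
    }
    return r;
}
```

After such a pass, the interpreter can look up "main" in the function table to find where execution should begin. The sketch ignores comments, string constants, explicit initializers and comma-separated declarations such as "int a, b;" (only a would be recorded), which a real prescanner has to handle.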
I have attached to this post a TestParser.cpp file, which shows how the function get_token() works. I have also provided a new version of the minicpp.cpp file, because some code has to be commented out in order to get TestParser to work.
Michael
[attachment deleted by admin]