Topic: parser library: ELL is quite simple and useful

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5910
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
parser library: ELL is quite simple and useful
« on: April 29, 2010, 04:59:44 am »
Hi, I have just found a parser library:
ell - Project Hosting on Google Code
It is an Embedded LL library: a very light C++ library to embed EBNF grammars in your code.

You can download the source code from Downloads; it contains a simple Calculator sample and a more complex XML parser sample.

All the grammars can be coded in C++ style. :D

I did a simple test (I tried to parse a variable definition or a function declaration) by modifying the Calculator sample. Here is my own grammar:

Code
#include <iostream>   // std::cout, used by the semantic actions below
#include <stack>
#include <ell/Grammar.h>
#include <ell/Parser.h>

struct Calc : ell::Parser<char>, ell::Grammar<char>
{
    Calc()
      : ell::Parser<char>(& root, & blank)
    {
        flags.look_ahead = false;

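        // state: one or more identifiers, then either ';' (variable)
        // or '(' argument ')' ';' (function declaration)
        state =  (+ident) >>(  ch(';') [& Calc::Variable]
                              |ch('(') >> argument >> ch(')')>> ch(';')[& Calc::Function] );
        // argument: comma-separated groups of one or more identifiers
        argument = (+ident) >> *( ch(',') >> (+ident));
        // root: a single statement followed by end of input
        root  = state >> ell::Grammar<char>::end;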

        ELL_NAME_RULE(state);
        ELL_NAME_RULE(root);
        ELL_NAME_RULE(argument);
    }

    double eval(const char * expr)
    {
        parse(expr);
        return 1.0;
    }

protected:
    void Variable()
    {
        std::cout<< "This is a variable" << std::endl ;
        //return 1.0;
    }

    void Function()
    {
        std::cout<< "This is a Function" << std::endl ;
        //return 1.0;
    }
    ell::Rule<char> state, root, argument;

};

As you can see, I have defined three production rules:

Code
statement rule: a statement like "AAA BBB CCC;"
function declaration rule: a statement like "AAA BBB CCC (argument);"
argument rule: comma-separated token groups like "DDD, EEE FFF, GGG"

These grammar patterns are the ones used in the current CC parser (CC uses a hand-written, heuristic parser :D)
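For comparison, here is a minimal hand-written sketch of the same three rules in plain C++, without ELL. It is not the CC parser's actual code; the names `Kind`, `tokenize`, `identList`, `argument` and `classify` are made up for this illustration, and blanks are simply skipped by the tokenizer.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical result type for this sketch; not part of ELL or Code::Blocks.
enum class Kind { Variable, Function, Error };

// Split the input into identifiers and single-character punctuation tokens.
static std::vector<std::string> tokenize(const std::string& text)
{
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < text.size()) {
        unsigned char c = static_cast<unsigned char>(text[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (std::isalpha(c) || c == '_') {
            std::size_t start = i;
            while (i < text.size() &&
                   (std::isalnum(static_cast<unsigned char>(text[i])) || text[i] == '_'))
                ++i;
            tokens.push_back(text.substr(start, i - start));
        } else {
            tokens.push_back(std::string(1, text[i]));
            ++i;
        }
    }
    return tokens;
}

static bool isIdent(const std::string& t)
{
    return !t.empty() && (std::isalpha(static_cast<unsigned char>(t[0])) || t[0] == '_');
}

// Consume one or more identifiers, mirroring (+ident); false if none was found.
static bool identList(const std::vector<std::string>& t, std::size_t& pos)
{
    std::size_t start = pos;
    while (pos < t.size() && isIdent(t[pos])) ++pos;
    return pos > start;
}

// argument = (+ident) >> *( ',' >> (+ident) )
static bool argument(const std::vector<std::string>& t, std::size_t& pos)
{
    if (!identList(t, pos)) return false;
    while (pos < t.size() && t[pos] == ",") {
        ++pos;
        if (!identList(t, pos)) return false;
    }
    return true;
}

// state = (+ident) >> ( ';' | '(' argument ')' ';' ), followed by end of input
Kind classify(const std::string& text)
{
    std::vector<std::string> t = tokenize(text);
    std::size_t pos = 0;
    if (!identList(t, pos) || pos >= t.size()) return Kind::Error;
    if (t[pos] == ";")
        return pos + 1 == t.size() ? Kind::Variable : Kind::Error;
    if (t[pos] != "(") return Kind::Error;
    ++pos;
    if (!argument(t, pos)) return Kind::Error;
    if (pos + 2 != t.size() || t[pos] != ")" || t[pos + 1] != ";") return Kind::Error;
    return Kind::Function;
}
```

Calling `classify("aaa bbb ccc;")` gives `Kind::Variable`, and `classify("aaa bbb ccc ddd(int i);")` gives `Kind::Function`. Like the grammar above, this sketch rejects an empty argument list.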

Here is a test result:

Code
> aaa bbb ccc;
This is a variable
> aaa bbb ccc ddd(int i)
1: before end: expecting [;]

> aaa bbb ccc ddd(int i);
This is a Function
>


So, as you can see, it works quite well!

I'm not sure whether ELL can handle a more complex parser, but I think it can at least parse a "one line statement", which is the basis for CC's "suggestion list".

If you have any interest or comments, please let me know. :D

BTW: Codelite uses a similar grammar-based mechanism (Yacc and Lex) to analyze the "one line statement".
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Offline ollydbg

« Reply #1 on: April 29, 2010, 05:18:36 am »
Here is a modified grammar that supports simple class declarations:
Code
struct Calc : ell::Parser<char>, ell::Grammar<char>
{
    Calc()
      : ell::Parser<char>(& root, & blank)
    {
        flags.look_ahead = false;

        // state: a class declaration, a variable, or a function declaration
        state = ( str("class") >> ident >> block >> ch(';') [& Calc::ClassDecl])
                | (+ident) >>(  ch(';') [& Calc::Variable]
                              |ch('(') >> argument >> ch(')')>> ch(';')[& Calc::Function] );
        argument = (+ident) >> *( ch(',') >> (+ident));
        // block: '{', zero or more nested statements, '}'
        block = ch('{')>> (*state) >> ch('}');
        root  = state >> ell::Grammar<char>::end;

        ELL_NAME_RULE(state);
        ELL_NAME_RULE(root);
        ELL_NAME_RULE(argument);
        ELL_NAME_RULE(block);
    }

    double eval(const char * expr)
    {
        parse(expr);
        return 1.0;
    }

protected:
    void Variable()
    {
        std::cout<< "This is a variable" << std::endl ;
        //return 1.0;
    }

    void Function()
    {
        std::cout<< "This is a Function" << std::endl ;
        //return 1.0;
    }
    void ClassDecl()
    {
        std::cout<< "This is a Class Decl" << std::endl ;
    }

    ell::Rule<char> state, root, argument, block;

};

And here is the test command output:
Code
> class A;
1: before ";": expecting block

> class A {int a;};
This is a variable
This is a Class Decl
> int aaa;
This is a variable
> class A {int a; int b; int c};
This is a variable
This is a variable
1: before "};": expecting [;] or ([(] argument [)] [;])

> class A {int a; int b; int c;};
This is a variable
This is a variable
This is a variable
This is a Class Decl
>
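
To show why the inner "variable" lines appear before the "Class Decl" line, here is a small recursive-descent sketch of the same grammar in plain C++. It is only an illustration under my own assumptions: the class `DeclParser` and its `events` list are hypothetical names, not ELL API, and it gives up as soon as a statement fails part-way instead of reporting an error message.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical hand-written parser for this sketch; it does not use ELL.
class DeclParser
{
public:
    explicit DeclParser(const std::string& text) : m_text(text), m_pos(0) {}

    // root = state >> end
    bool parse()
    {
        if (!state()) return false;
        skipBlanks();
        return m_pos == m_text.size();
    }

    std::vector<std::string> events;  // actions that fired, in order

private:
    bool isIdentChar(std::size_t i, bool first) const
    {
        unsigned char c = static_cast<unsigned char>(m_text[i]);
        return c == '_' || (first ? std::isalpha(c) : std::isalnum(c)) != 0;
    }

    void skipBlanks()
    {
        while (m_pos < m_text.size() &&
               std::isspace(static_cast<unsigned char>(m_text[m_pos])))
            ++m_pos;
    }

    bool peek(char c) { skipBlanks(); return m_pos < m_text.size() && m_text[m_pos] == c; }

    bool ch(char c)
    {
        if (!peek(c)) return false;
        ++m_pos;
        return true;
    }

    bool ident(std::string& out)
    {
        skipBlanks();
        if (m_pos >= m_text.size() || !isIdentChar(m_pos, true)) return false;
        std::size_t start = m_pos;
        while (m_pos < m_text.size() && isIdentChar(m_pos, false)) ++m_pos;
        out = m_text.substr(start, m_pos - start);
        return true;
    }

    // (+ident)
    bool identList()
    {
        std::string word;
        if (!ident(word)) return false;
        while (ident(word)) {}
        return true;
    }

    // argument = (+ident) >> *( ',' >> (+ident) )
    bool argument()
    {
        if (!identList()) return false;
        while (ch(','))
            if (!identList()) return false;
        return true;
    }

    // block = '{' >> *state >> '}'
    bool block()
    {
        if (!ch('{')) return false;
        while (!peek('}'))
            if (!state()) return false;
        return ch('}');
    }

    // state = "class" ident block ';'  |  (+ident) ( ';' | '(' argument ')' ';' )
    bool state()
    {
        std::string word;
        if (!ident(word)) return false;
        if (word == "class") {
            if (!(ident(word) && block() && ch(';'))) return false;
            events.push_back("class");      // fires only after the whole block
            return true;
        }
        while (ident(word)) {}
        if (ch(';')) { events.push_back("variable"); return true; }
        if (ch('(') && argument() && ch(')') && ch(';')) {
            events.push_back("function");
            return true;
        }
        return false;
    }

    std::string m_text;
    std::size_t m_pos;
};
```

The "class" event is pushed only after `block()` has returned, so the members inside the braces are reported first, in the same order as in the output above.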


 :D

Offline oBFusCATed

  • Developer
  • Lives here!
  • *****
  • Posts: 13413
    • Travis build status
« Reply #2 on: April 29, 2010, 01:12:18 pm »
If you need a C/C++ parsing library, look in clang's sources.
Its parser library should be the best open-source one for C and C++ (one day)...
(most of the time I ignore long posts)
[strangers don't send me private messages, I'll ignore them; post a topic in the forum, but first read the rules!]

Offline ollydbg

« Reply #3 on: April 29, 2010, 02:20:39 pm »
Quote
If you need a C/C++ parsing library, look in clang's sources.
Its parser library should be the best open-source one for C and C++ (one day)...

Thanks for the hint. I just remembered that we discussed clang some months ago. Eran (the author of the Codelite IDE) said that Clang does not fully support C++. See here: Re: modify codecompletion plugin to macro parser.

So, what I care about is a "fast and heuristic parser". For example, if the parser sees a statement like the one below:

Code
aaa bbb ccc;

It will just regard "ccc" as a variable name.

So, we don't need a strict type checking system.
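
A minimal sketch of that heuristic in C++ could look like the following. The names `VariableGuess` and `guessVariable` are hypothetical, and the sketch assumes the statement is only blank-separated words ending in ';'. It does no type checking at all: the last word is taken as the name and everything before it as the type. Requiring at least two words is my own choice here.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type for this sketch.
struct VariableGuess
{
    std::string type;  // everything before the last word, e.g. "aaa bbb"
    std::string name;  // the last word, e.g. "ccc"
};

// Returns false unless the text looks like "word word ... word;".
bool guessVariable(const std::string& statement, VariableGuess& out)
{
    std::size_t semi = statement.find(';');
    if (semi == std::string::npos) return false;

    // Split everything before the ';' on blanks.
    std::istringstream words(statement.substr(0, semi));
    std::vector<std::string> parts;
    std::string word;
    while (words >> word) parts.push_back(word);

    // Assumption of this sketch: a variable needs a type part and a name part.
    if (parts.size() < 2) return false;

    out.name = parts.back();
    parts.pop_back();
    out.type.clear();
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0) out.type += ' ';
        out.type += parts[i];
    }
    return true;
}
```

For "aaa bbb ccc;" this fills in the type "aaa bbb" and the name "ccc", without knowing whether "aaa" or "bbb" are real types.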

Offline oBFusCATed

« Reply #4 on: April 29, 2010, 02:40:25 pm »
Quote
Thanks for the hint. I just remembered that we discussed clang some months ago. Eran (the author of the Codelite IDE) said that Clang does not fully support C++. See here: Re: modify codecompletion plugin to macro parser.

Yes, it doesn't have full support, but there is great progress...
See here: http://clang.llvm.org/cxx_status.html

Quote
So, what I care about is a "fast and heuristic parser". For example, if the parser sees a statement like the one below:

Code
aaa bbb ccc;

It will just regard "ccc" as a variable name.

So, we don't need a strict type checking system.

clang has a modular structure -> it is split into libraries.
Also, clang is meant to be used by IDEs... for code completion and refactoring... Xcode uses it (not 100% sure).


Offline ollydbg

« Reply #5 on: April 29, 2010, 03:43:24 pm »
Quote
Thanks for the hint. I just remembered that we discussed clang some months ago. Eran (the author of the Codelite IDE) said that Clang does not fully support C++. See here: Re: modify codecompletion plugin to macro parser.
Yes, it doesn't have full support, but there is great progress...
See here: http://clang.llvm.org/cxx_status.html

OK, I have seen that webpage. It seems the C++ grammar is quite complex, and for Clang there is still a lot to do. I just learned that Clang is a "full C/C++ compiler front end", so I'm not sure whether its parsing performance is good enough (though the official clang site says it is 3X faster than GCC/G++).

Also, to parse a C++ source, Clang still needs a separate "preprocessor" stage (macro expansion, header file inclusion). So, this is still not the best parser for an IDE, because in my mind a good parser for the CodeCompletion plugin should have "simultaneous preprocessor handling when parsing".


Offline oBFusCATed

« Reply #6 on: April 29, 2010, 08:28:33 pm »
Quote
OK, I have seen that webpage. It seems the C++ grammar is quite complex, and for Clang there is still a lot to do. I just learned that Clang is a "full C/C++ compiler front end", so I'm not sure whether its parsing performance is good enough (though the official clang site says it is 3X faster than GCC/G++).

The only way to tell if it is fast enough is to test it...

Quote
Also, to parse a C++ source, Clang still needs a separate "preprocessor" stage (macro expansion, header file inclusion). So, this is still not the best parser for an IDE, because in my mind a good parser for the CodeCompletion plugin should have "simultaneous preprocessor handling when parsing".

Why?
Also, you can contact the clang guys and they can improve things here and there... or explain to you why they've done something the way it is done.

Offline ollydbg

« Reply #7 on: April 30, 2010, 02:56:47 am »
Quote
Also, to parse a C++ source, Clang still needs a separate "preprocessor" stage (macro expansion, header file inclusion). So, this is still not the best parser for an IDE, because in my mind a good parser for the CodeCompletion plugin should have "simultaneous preprocessor handling when parsing".
Why?
Also, you can contact the clang guys and they can improve things here and there... or explain to you why they've done something the way it is done.

Aha, yesterday I contacted the author of the ELL library. We had a discussion about parsers for CodeCompletion. He also mentioned using clang to parse C++ source code.

The author of the ELL library said:
Quote
Yes, or perhaps have a look at LLVM C++ frontend. But, the problem of parsing for an IDE is suitable different than the one of parsing for a compiler, because unparsing must be non-intrusive. The main problem at parsing C/C++ is the presence of C preprocessor. I'm convinced that parsing the result of preprocessing is not the right approach for code-completion or code-refactoring tools.

To work around this problem, last year I plan to implement a C/C++ event-oriented parser (using libELL) whose grammar would embed the grammar of C preprocessor. Maybe this is what you need, and we could work together on it.

These days Loaden and I have been discussing adding "conditional preprocessor handling" to the CC plugin. Even though it is not a FULL preprocessor, it can still solve a lot of problems. I think the testing stage should begin after the "stable release".
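
To give a rough idea of what handling conditional directives while reading the source can mean, here is a small C++ sketch. It is not the CC plugin's implementation; the function `activeLines` is a hypothetical helper. It only understands `#define NAME`, `#ifdef`, `#ifndef`, `#else` and `#endif` written without extra blanks after the '#', and it silently drops every other directive.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper for this sketch; not the Code::Blocks CC implementation.
// Returns the source lines that are active for the given set of defined macros.
std::vector<std::string> activeLines(const std::string& source,
                                     std::set<std::string> defined)
{
    // One frame per open #ifdef/#ifndef.
    struct Frame
    {
        bool parentActive;  // were we emitting lines before this conditional?
        bool condTrue;      // did the #ifdef/#ifndef condition hold?
    };

    std::vector<std::string> result;
    std::vector<Frame> stack;
    bool active = true;

    std::istringstream in(source);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream words(line);
        std::string directive, name;
        words >> directive >> name;

        if (directive == "#ifdef" || directive == "#ifndef") {
            bool isDefined = defined.count(name) != 0;
            bool cond = (directive == "#ifdef") ? isDefined : !isDefined;
            stack.push_back(Frame{active, cond});
            active = active && cond;
        } else if (directive == "#else") {
            if (!stack.empty())
                active = stack.back().parentActive && !stack.back().condTrue;
        } else if (directive == "#endif") {
            if (!stack.empty()) {
                active = stack.back().parentActive;
                stack.pop_back();
            }
        } else if (directive == "#define") {
            // Macros defined in an active region affect later conditionals.
            if (active && !name.empty()) defined.insert(name);
        } else if (!directive.empty() && directive[0] == '#') {
            // Other directives (#include, #if, ...) are ignored in this sketch.
        } else if (active) {
            result.push_back(line);
        }
    }
    return result;
}
```

Because `#define` lines are processed in the same pass, a macro defined earlier in the file already changes which branch of a later `#ifdef` is kept, with no separate preprocessing stage.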

Also, the ELL library is light and fast compared to the Boost Spirit library. So, if we use ELL, we don't need to make CB depend on the Boost libraries.
« Last Edit: April 30, 2010, 09:49:56 am by ollydbg »

Offline ollydbg

« Reply #8 on: October 25, 2010, 07:44:39 am »
Reminder:
It seems all the C++ stuff in the "parser" stage has been finished.

see :

http://clang.llvm.org/cxx_status.html

I'm not sure how to use the Clang library with MinGW... At the least, I need to build Clang under msys/mingw.

Offline oBFusCATed

« Reply #9 on: October 25, 2010, 10:17:03 am »
Yes you need to do that, at least :)

You can install some linux that has it as a package :)

Offline ollydbg

« Reply #10 on: November 02, 2010, 04:52:12 am »
I just found another similar parser. It is a recursive descent parser covering a lot of the C++ grammar, written as a parsing expression grammar, just like boost::spirit.
(ELL uses a similar parsing expression grammar too.)
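
For readers who have not seen this style, here is a toy sketch of how a grammar can be embedded in C++ with operator overloading. It is not how ELL or boost::spirit are actually implemented; the type `P` and the helpers `ch`, `ident`, `tok` and `matchAll` are hypothetical names for this illustration only.

```cpp
#include <cctype>
#include <functional>
#include <string>

// A parser tries to match at 'pos'; on success it advances 'pos' and returns true.
struct P
{
    std::function<bool(const std::string&, std::size_t&)> run;
};

// Match one specific character.
P ch(char c)
{
    return P{[c](const std::string& s, std::size_t& pos) {
        if (pos < s.size() && s[pos] == c) { ++pos; return true; }
        return false;
    }};
}

// Match one identifier: a letter or '_' followed by letters, digits or '_'.
P ident()
{
    return P{[](const std::string& s, std::size_t& pos) {
        auto at = [&s](std::size_t i) { return static_cast<unsigned char>(s[i]); };
        if (pos >= s.size() || !(std::isalpha(at(pos)) || s[pos] == '_')) return false;
        while (pos < s.size() && (std::isalnum(at(pos)) || s[pos] == '_')) ++pos;
        return true;
    }};
}

// Skip leading blanks, then run a (a stand-in for a library's blank skipper).
P tok(P a)
{
    return P{[a](const std::string& s, std::size_t& pos) {
        std::size_t save = pos;
        while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos]))) ++pos;
        if (a.run(s, pos)) return true;
        pos = save;
        return false;
    }};
}

// Sequence: a then b; restores the position if either part fails.
P operator>>(P a, P b)
{
    return P{[a, b](const std::string& s, std::size_t& pos) {
        std::size_t save = pos;
        if (a.run(s, pos) && b.run(s, pos)) return true;
        pos = save;
        return false;
    }};
}

// Ordered choice: try a, and only if it fails try b.
P operator|(P a, P b)
{
    return P{[a, b](const std::string& s, std::size_t& pos) {
        return a.run(s, pos) || b.run(s, pos);
    }};
}

// Zero or more repetitions; stops if a matches without consuming input.
P operator*(P a)
{
    return P{[a](const std::string& s, std::size_t& pos) {
        for (;;) {
            std::size_t before = pos;
            if (!a.run(s, pos) || pos == before) break;
        }
        return true;
    }};
}

// One or more repetitions.
P operator+(P a) { return a >> *a; }

// True if 'grammar' matches the whole input.
bool matchAll(const P& grammar, const std::string& input)
{
    std::size_t pos = 0;
    return grammar.run(input, pos) && pos == input.size();
}
```

With these pieces, a rule such as `P id = tok(ident()); P decl = +id >> tok(ch(';'));` reads almost like the EBNF it stands for, and `matchAll(decl, "aaa bbb ccc;")` returns true.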

see:
http://42ndart.org/scalpel/


Offline ollydbg

« Reply #11 on: November 02, 2010, 05:10:33 am »
By the way, I just found an old post by Tdragon about choosing a parser framework.
See here:
http://www.gamedev.net/community/forums/topic.asp?topic_id=349261