Developer forums (C::B DEVELOPMENT STRICTLY!) > Development

Python lexer


sethjackson:
I have made a Python lexer for C::B. I know zero Python, but mosfet asked whether scripting languages,
and Python specifically, were going to be supported, so I wrote a little sample code and (most of) a lexer.
The sample code may be wrong (I peeked at the Python website), and some Python geek will probably find
something wrong with it. :( The forum thread I was talking about is here.

http://forums.codeblocks.org/index.php?topic=1980.0

Download the patch from SF.net

http://sourceforge.net/tracker/index.php?func=detail&aid=1407815&group_id=126998&atid=707418

Constructive criticism is always welcome. :)

BTW no offense intended about the python geek part.  :D

Stevo:
I've just tried your lexer.  Looks good.

My only comment is: can you add "*SConstruct" and "*SConscript" to the filemasks?  These are Python files used by the SCons project (www.scons.org), which is a make replacement.  I've just switched to it from jam, and this lexer syntax-highlights these build scripts fine.
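
As a rough illustration of what the extended filemasks would cover, here is a minimal Python sketch. It assumes the masks behave like shell-style wildcards, which is an assumption about Code::Blocks rather than something confirmed in this thread; `matches_lexer` is a hypothetical helper name.

```python
from fnmatch import fnmatchcase

# Filemask string as it appears in the lexer file's filemasks attribute.
FILEMASKS = "*.py,*SConstruct,*SConscript"


def matches_lexer(filename, filemasks=FILEMASKS):
    """Return True if filename matches any comma-separated mask.

    Assumption: masks are treated as shell-style wildcards.
    """
    return any(fnmatchcase(filename, mask) for mask in filemasks.split(","))


for name in ("build.py", "SConstruct", "SConscript", "main.cpp"):
    print(name, matches_lexer(name))
```

Because `*` can match an empty string, a bare `SConstruct` file is covered by `*SConstruct`.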

Is there any chance this can be added to the repo as a standard lexer?  It looks as good as the others to me, and Python is a pretty popular language.

Stevo

Actually, I've attempted to enhance the lexer and the sample Python file.  They are attached.  Unfortunately, my enhancements didn't do what I expected, and I'm a bit lost as to what to do.  I've added a bunch of definitions to keywords, for builtins, etc.  I've added all the standard Python modules to user, and I've added the standard exceptions to documentation (so they can be highlighted differently from keywords).  The problem is that I don't see these new sets of keywords when I copy these files into /share/codeblocks/lexers/.  If anyone can look at them and provide any advice, I'd appreciate it.

lexer_python.xml:

--- Code: ---<?xml version="1.0"?>
<!DOCTYPE CodeBlocks_lexer_properties>
<CodeBlocks_lexer_properties>
        <Lexer name="Python"
                index="2"
                filemasks="*.py,*SConstruct,*SConscript">
                <Style name="Default"
                        index="0"
                        fg="0,0,0"
                        bg="255,255,255"
                        bold="0"
                        italics="0"
                        underlined="0"/>
                <Style name="Comment"
                        index="1"
                        fg="160,160,160"/>
                <Style name="Number"
                        index="2"
                        fg="240,0,240"/>
                <Style name="String"
                        index="3"
                        fg="0,0,255"/>
                <Style name="Character"
                        index="4"
                        fg="224,160,0"/>
                <Style name="Keyword"
                        index="5"
                        fg="0,0,160"
                        bold="1"/>
                <Style name="Triple qutoes"
                        index="6"
                        fg="128,0,0"/>
                <Style name="Triple double quotes"
                        index="7"
                        fg="128,0,128"/>
                <Style name="Class name"
                        index="8"
                        fg="0,0,0"/>
                <Style name="Definition name"
                        index="9"
                        fg="0,160,0"
                        bold="1"/>
                <Style name="Operator"
                        index="10"
                        fg="255,0,0"/>
                <Style name="Identifier"
                        index="11"/>
                <Style name="Comment block"
                        index="12"
                        fg="128,128,255"
                        bold="1"/>
                <Style name="String EOL"
                        index="13"/>
                <Style name="User Keyword"
                        index="14"/>
                <Style name="Decorator"
                        index="15"/>
                <Keywords>
                        <Language index="0"
                                value="and assert break class continue def del elif else except
                                       exec finally for from global if import in is lambda None
                                       not or pass print raise return try while yield

                                       __import__ abs basestring bool callable chr classmethod
                                       cmp compile complex delattr dict dir divmod enumerate
                                       eval execfile file filter float frozenset getattr globals
                                       hasattr hash help hex id input int isinstance issubclass
                                       iter len list locals long map max min object oct open
                                       ord pow property range raw_input reduce reload repr
                                       reversed round set setattr slice sorted staticmethod
                                       str sum super tuple type unichr unicode vars xrange
                                       zip

                                       apply buffer coerce intern

                                       __dict__ Ellipsis False True NotImplemented
                                       __class__ __bases__ __name__
                                      "/>
                        <User index="1"
                                value="sys gc weakref fpectl atexit types UserDict UserList UserString
                                       operator inspect traceback linecache pickle cPickle copy_reg
                                       shelve copy marshal warnings imp zipimport pkgutil modulefinder
                                       code codeop pprint repr new site user __builtin__ __main__
                                       __future__

                                       string re struct difflib fpformat StringIO cStringIO textwrap
                                       codecs encodings.idna unicodedata stringprep

                                       pydoc doctest unittest test test.test_support decimal math
                                       cmath random whrandom bisect collections heapq array sets
                                       itertools ConfigParser fileinput calendar cmd shlex

                                       os os.path dircache stat statcache statvfs filecmp subprocess
                                       popen2 datetime time sched mutex getpass curses curses.textpad
                                       curses.wrapper curses.ascii curses.panel getopt optparse tempfile
                                       errno glob fnmatch shutil locale gettext logging platform

                                       signal socket select thread threading dummy_thread dummy_threading
                                       Queue mmap anydbm dbhash whichdb bsddb dumbdbm zlib gzip bz2
                                       zipfile tarfile readline rlcompleter

                                       posix pwd grp crypt dl dbm gdbm termios tty pty fcntl pipes
                                       posixfile resource nis syslog commands

                                       hotshot timeit

                                       webbrowser cgi cgitb urllib urllib2 httplib ftplib gopherlib
                                       poplib imaplib nntplib smtplib smtpd telnetlib urlparse
                                       SocketServer BaseHTTPServer SimpleHTTPServer CGIHTTPServer
                                       cookielib Cookie xmlrpclib SimpleXMLRPCServer DocXMLRPCServer
                                       asyncore asynchat

                                       formatter email email.Message email.Parser email.Generator
                                       email.Header email.Charset email.Encoders email.Errors
                                       email.Utils email.Iterators mailcap mailbox mhlib mimetools
                                       mimetypes MimeWriter mimify multifile rfc822 base64 binascii
                                       binhex quopri uu xdrlib netrc robotparser csv

                                       HTMLParser sgmllib htmllib htmlentitydefs xml.parsers.expat
                                       xml.dom xml.dom.minidom xml.dom.pulldom xml.sax
                                       xml.sax.handler xml.sax.saxutils xml.sax.xmlreader xmllib

                                       audioop imageop aifc sunau wave chunk colorsys rgbimg imghdr
                                       sndhdr ossaudiodev

                                       hmac md5 sha

                                       Tkinter Tix ScrolledText turtle

                                       parser symbol token keyword tokenize tabnanny pyclbr
                                       py_compile compileall dis pickletools distutils

                                      "/>
                        <Documentation index="2"
                                value="exception Exception StandardError ArithmeticError
                                       LookupError EnvironmentError AssertionError
                                       AttributeError EOFError FloatingPointError IOError
                                       ImportError IndexError KeyError KeyboardInterrupt
                                       MemoryError NameError NotImplementedError OSError
                                       OverflowError ReferenceError RuntimeError
                                       StopIteration SyntaxError SystemError SystemExit
                                       TypeError UnboundLocalError UnicodeError
                                       UnicodeEncodeError UnicodeDecodeError
                                       UnicodeTranslateError ValueError WindowsError
                                       ZeroDivisionError Warning UserWarning
                                       DeprecationWarning PendingDeprecationWarning
                                       SyntaxWarning RuntimeWarning FutureWarning
                                      "/>
                </Keywords>
                <SampleCode value="lexer_python.sample"/>
        </Lexer>
</CodeBlocks_lexer_properties>

--- End code ---
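
One way to double-check which keyword sets a lexer file actually defines, independent of the editor, is to parse it and count the words in each set. This is a minimal sketch using Python's standard `xml.etree.ElementTree`. The embedded XML is a trimmed stand-in for the file above, and `keyword_sets` is a hypothetical helper, not part of Code::Blocks.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for lexer_python.xml; the real file has far more words.
LEXER_XML = """<?xml version="1.0"?>
<!DOCTYPE CodeBlocks_lexer_properties>
<CodeBlocks_lexer_properties>
  <Lexer name="Python" index="2" filemasks="*.py,*SConstruct,*SConscript">
    <Keywords>
      <Language index="0" value="and assert break
                                 class continue def"/>
      <User index="1" value="sys gc weakref"/>
      <Documentation index="2" value="Exception StandardError"/>
    </Keywords>
  </Lexer>
</CodeBlocks_lexer_properties>
"""


def keyword_sets(xml_text):
    """Map (element name, index) to the list of words in its value attribute."""
    root = ET.fromstring(xml_text)
    sets = {}
    for node in root.find("Lexer/Keywords"):
        # split() with no argument also handles the line breaks inside value="..."
        sets[(node.tag, node.get("index"))] = node.get("value", "").split()
    return sets


for (tag, index), words in keyword_sets(LEXER_XML).items():
    print(tag, index, len(words), "words")
```

If all three sets show up here but not in the editor, the file itself is probably fine and the editor is still using previously loaded settings.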

lexer_python.sample

--- Code: ---# This is a comment
## This is a comment block

>>> "Hello World!"
>>> 'Test'
>>> 2 + 2
>>> '''Triple quotes!'''
>>> """Triple double quotes!"""

month_names = ['Januari', 'Februari', 'Maart',      # These are the
               'April',   'Mei',      'Juni',       # Dutch names
               'Juli',    'Augustus', 'September',  # for the months
               'Oktober', 'November', 'December']   # of the year

class ClassName:
  def perm(l):
  # Compute the list of all permutations of l
    if len(l) <= 1:
      return [l]
    r = []
    for i in range(len(l)):
      s = l[:i] + l[i+1:]
      p = perm(s)
      for x in p:
        r.append(l[i:i+1] + x)
    return r

  def func(self):
    return  'A string\n'

--- End code ---

thomas:

--- Quote from: Stevo on January 19, 2006, 07:10:13 am ---I've just tried your lexer.  Looks good.
[...]
I've added a bunch of definitions to keywords, for builtins, etc.  I've added all the standard Python modules
--- End quote ---
Are you a python geek then? :)
I am asking because I have no clue regarding python, all I can say is your thingie does a lot of nice colours in the sample code. Looks good to me, so if somebody tells me that this is really good python, I'll commit it... :)


--- Quote ---The problem is that I don't see these new sets of keywords when I copy these files into /share/codeblocks/lexers/.  If anyone can look at them and provide any advice, I'd appreciate it.
--- End quote ---
Have you tried clicking on "Reset Defaults" to force the editor to reload them? Otherwise, changes are not visible.

EDIT:
Found one insignificant typo, it says "qutoe" where it should be "quote". Corrected that in my copy, now just waiting for somebody to tell me if this is "good python".

Stevo:

--- Quote from: thomas on January 19, 2006, 04:24:09 pm ---Are you a python geek then? :)
--- End quote ---

No, I'm an aspirant Python geek.  I've only just started using Python, because I've started using SCons.  But so far, I'm not hating it.


--- Quote from: thomas on January 19, 2006, 04:24:09 pm ---I am asking because I have no clue regarding python, all I can say is your thingie does a lot of nice colours in the sample code. Looks good to me, so if somebody tells me that this is really good python, I'll commit it... :)
--- End quote ---

I'm sure the example could be beefed up.  I have copied some of the examples from the Python tutorial and language reference at www.python.org.  Below, I've included a beefier example, with code snippets taken from various places.  All of the syntactic elements (I believe) are represented in the example.  I got all of the names for the keywords, user keywords and documentation (which I'm using for exception keywords) from the documents on the Python site.

The decorator points out a minor bug (what I believe is a bug, anyway) in the underlying Scintilla lexer.  It highlights the comment following a decorator as part of the decorator.  I think it should be highlighted as a comment (i.e., the decorator should stop where the comment starts), but it isn't a big issue for me.
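
For comparison, Python's own tokenizer treats the trailing comment as a separate token from the decorator. The sketch below uses the standard `tokenize` module from Python 3 (newer than the Python this thread was written against) and only shows how the language itself splits the line; it says nothing about how Scintilla works internally.

```python
import io
import tokenize

# One decorator line with a trailing comment, as in the sample file.
source = "@classmethod           # This is a decorator\ndef func(cls):\n    pass\n"

tokens = tokenize.generate_tokens(io.StringIO(source).readline)

# Keep only the tokens that start on the first line (the decorator line).
first_line = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokens
    if tok.start[0] == 1
]

for kind, text in first_line:
    print(kind, repr(text))
```

The `@` and the decorator name come out as operator and name tokens, and the `# ...` part is reported as its own COMMENT token.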


--- Quote from: thomas on January 19, 2006, 04:24:09 pm ---
--- Quote ---The problem is that I don't see these new sets of keywords when I copy these files into /share/codeblocks/lexers/.  If anyone can look at them and provide any advice, I'd appreciate it.
--- End quote ---
Have you tried clicking on "Reset Defaults" to force the editor to reload them? Otherwise, changes are not visible.
--- End quote ---
That did the trick, thanks.


--- Quote from: thomas on January 19, 2006, 04:24:09 pm ---EDIT:
Found one insignificant typo, it says "qutoe" where it should be "quote". Corrected that in my copy, now just waiting for somebody to tell me if this is "good python".

--- End quote ---

As I said, I'm not a Python geek, so I'd be more than happy for anyone to second it, but I feel it is pretty good, based on my research.  Thanks for starting this, BTW.  There are still a couple of issues:

1. I think the following colour elements should be renamed:
String -> Double Quote String
Character -> Single Quote String
Triple Quotes -> Triple Single Quoted String
Triple Double Quotes -> Triple Double Quoted String

String and Character are the wrong names (I think).  'aaa' == "aaa": they are both strings and are interchangeable, so they should both be listed as strings, I think.  The change to the other two is just for consistency.

2. Documentation highlighting:
I don't know how to get the words in the documentation area to highlight.  There doesn't seem to be a lexer item for them, so if they can't be independently highlighted, I think they should be incorporated into Keywords.
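
To back up point 1, here is a tiny sketch showing that all four quoting styles produce the same kind of object in Python. The variable names are made up for illustration.

```python
single = 'aaa'
double = "aaa"
triple_single = '''aaa'''
triple_double = """aaa"""

# All four literals compare equal and share one type: there is no
# separate "character" type in Python.
print(single == double == triple_single == triple_double)
print(type(single) is type(double) is type(triple_single) is type(triple_double))
```

The quoting style only affects how the literal is written (for example, which quote characters need escaping), not the resulting value.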

Anyway, here is the revised example:

--- Code: ---# This is a comment
## This is a comment block

import sys, time, string

month_names = ['Januari', 'Februari', 'Maart',      # These are the
               'April',   'Mei',      'Juni',       # Dutch names
               "Juli",    "Augustus", "September",  # for the months
               "Oktober", "November", "December"]   # of the year

if len(sys.argv)!=2:
    print '''Usage: This is a 'Code::Blocks' Example'''
    print "This String goes to EOL
    sys.exit(0)

class ClassName:
  def perm(l):
  # Compute the list of all permutations of l
    if len(l) <= 1:
      return [l]
    r = []
    for i in range(len(l)):
      s = l[:i] + l[i+1:]
      p = perm(s)
      for x in p:
        r.append(l[i:i+1] + x)
    return r

@classmethod           # This is a decorator
@synchronized(lock)    # And so is this
def func(self):
  try:
    return  """A "Triple Double Quote" String\n"""
  except SystemExit:
    pass


--- End code ---

sethjackson:

--- Quote from: Stevo on January 20, 2006, 12:55:34 am ---1. I think the following colour elements should be renamed:
String -> Double Quote String
Character -> Single Quote String
Triple Quotes -> Triple Single Quoted String
Triple Double Quotes -> Triple Double Quoted String

String and Character are the wrong names (I think).  'aaa' == "aaa": they are both strings and are interchangeable, so they should both be listed as strings, I think.  The change to the other two is just for consistency.

2. Documentation highlighting:
I don't know how to get the words in the documentation area to highlight.  There doesn't seem to be a lexer item for them, so if they can't be independently highlighted, I think they should be incorporated into Keywords.


--- End quote ---

1. Rename them. :)
2. About the documentation keywords.


--- Code: (xml) ---<Documentation index="2"
                  value=""/>

--- End code ---

Put all the doc keywords in value="" (separate each item with a space).
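
If the word list is long, the space-separated value can be generated rather than typed by hand. This is a minimal sketch using Python's standard `xml.etree.ElementTree`. The short list of names is only an illustrative subset, and the variable names are made up.

```python
import xml.etree.ElementTree as ET

# Illustrative subset of the exception names from the lexer file above.
exception_names = ["Exception", "StandardError", "ArithmeticError", "LookupError"]

# The value attribute is simply the words joined by single spaces.
value = " ".join(exception_names)

element = ET.Element("Documentation", {"index": "2", "value": value})
xml_line = ET.tostring(element, encoding="unicode")
print(xml_line)
```

The printed element can then be pasted inside the `<Keywords>` section of the lexer file.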

EDIT:

This will help you understand how the lexers work. :)

http://forums.codeblocks.org/index.php?topic=519.0
