Topic: Pseudo semantic highlighting

Offline Alpha

  • Developer
  • Lives here!
  • *****
  • Posts: 1513
Pseudo semantic highlighting
« on: January 16, 2013, 03:25:21 am »
CC already has a fairly decent idea of what is going on in the source, so I decided to make CC talk to Scintilla.  This implementation is fairly crude, but I would appreciate feedback from anyone who would like to see a little more color in their code.
Code
Index: src/plugins/codecompletion/codecompletion.cpp
===================================================================
--- src/plugins/codecompletion/codecompletion.cpp (revision 8788)
+++ src/plugins/codecompletion/codecompletion.cpp (working copy)
@@ -3641,4 +3641,39 @@
     TRACE(_T("CodeCompletion::OnEditorActivatedTimer: Starting m_TimerToolbar."));
     m_TimerToolbar.Start(TOOLBAR_REFRESH_DELAY, wxTIMER_ONE_SHOT);
     TRACE(_T("OnEditorActivatedTimer() : Current activated file is %s"), curFile.wx_str());
+
+    cbEditor* ed = Manager::Get()->GetEditorManager()->GetBuiltinEditor(editor);
+    if (!ed || ed->GetControl()->GetLexer() != wxSCI_LEX_CPP)
+        return;
+    TokenIdxSet result;
+    m_NativeParser.GetParser().FindTokensInFile(curFile, result, tkAnyContainer | tkAnyFunction);
+    TokenTree* tree = m_NativeParser.GetParser().GetTokenTree();
+    wxArrayString varList;
+    for (TokenIdxSet::const_iterator it = result.begin(); it != result.end(); ++it)
+    {
+        Token* token = tree->at(*it);
+        if (!token)
+            continue;
+        if (token->m_TokenKind & tkAnyFunction)
+        {
+            if (token->m_ParentIndex == wxNOT_FOUND)
+                continue;
+            else
+                token = tree->at(token->m_ParentIndex);
+        }
+        if (token && token->HasChildren())
+        {
+            for (TokenIdxSet::const_iterator chIt = token->m_Children.begin();
+                 chIt != token->m_Children.end(); ++chIt)
+            {
+                const Token* chToken = tree->at(*chIt);
+                if (   chToken && chToken->m_TokenKind == tkVariable
+                    && varList.Index(chToken->m_Name) == wxNOT_FOUND )
+                {
+                    varList.Add(chToken->m_Name);
+                }
+            }
+        }
+    }
+    ed->GetControl()->SetKeyWords(3, GetStringFromArray(varList, wxT(" "), false));
 }

Offline daniloz

  • Regular
  • ***
  • Posts: 268
Re: Pseudo semantic highlighting
« Reply #1 on: January 16, 2013, 08:13:20 am »
@Alpha: your patch didn't apply on my working copy, so I had to apply it "by hand". I'm not sure whether the differences are on my side or yours; I haven't had time to check...

However, I like the new colors, thx!  ;D

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5905
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: Pseudo semantic highlighting
« Reply #2 on: January 16, 2013, 11:03:31 am »
CC already has a fairly decent idea of what is going on in the source, so I decided to make CC talk to Scintilla.  This implementation is fairly crude, but I would appreciate feedback from anyone who would like to see a little more color in their code.
I like this feature, thanks.
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Offline MortenMacFly

  • Administrator
  • Lives here!
  • *****
  • Posts: 9694
Re: Pseudo semantic highlighting
« Reply #3 on: January 16, 2013, 04:25:57 pm »
I like this feature, thanks.
Yeah, it's nice. It also "highlights" where work needs to be done. For example, add a new member variable to the class in a header file, then create an inline method that uses this member variable (e.g. a getter method). This variable will be the only one not highlighted like the others until CC scans this file again... :D
Compiler logging: Settings->Compiler & Debugger->tab "Other"->Compiler logging="Full command line"
C::B Manual: https://www.codeblocks.org/docs/main_codeblocks_en.html
C::B FAQ: https://wiki.codeblocks.org/index.php?title=FAQ

Offline oBFusCATed

  • Developer
  • Lives here!
  • *****
  • Posts: 13413
    • Travis build status
Re: Pseudo semantic highlighting
« Reply #4 on: January 16, 2013, 11:55:41 pm »
How do I test this?
I've applied it and I see no change.
(most of the time I ignore long posts)
[strangers don't send me private messages, I'll ignore them; post a topic in the forum, but first read the rules!]

Offline Alpha

Re: Pseudo semantic highlighting
« Reply #5 on: January 17, 2013, 12:02:49 am »
This patch currently only hooks into the editor activated event, so you need to switch editors or close and reopen editors (after initial parsing has finished) for colors to show up.

Offline oBFusCATed

Re: Pseudo semantic highlighting
« Reply #6 on: January 17, 2013, 01:26:21 am »
OK, but it doesn't work for C code...

Probably you have to look in this topic: http://forums.codeblocks.org/index.php/topic,16249.0.html

Is it possible to extract this code in a separate plugin or in core?

Offline Alpha

Re: Pseudo semantic highlighting
« Reply #7 on: January 17, 2013, 02:43:55 am »
OK, but it doesn't work for C code...
... hence the "pseudo" ;).
The logic used is fairly simplistic:
  • List functions in the current file
    • Collect the classes they are from
  • List the classes in the current file
  • Iterate through the member variables of both lists of classes
  • Put this list of variables in Scintilla's (previously unused by Code::Blocks) keyword set for "Global classes and typedefs"
This makes a lot of assumptions about coding style, but these assumptions generally yield decent results on C++ code.  However, searching for a class in C code will fail for obvious reasons.
In C code, what would your expectations be for choosing the highlighted set?
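
The steps listed above can be sketched in plain C++ as follows. This is a minimal illustration, not the patch itself: `Token`, `TokenTree`, `AddMemberVariables`, and `CollectKeywords` are simplified, hypothetical stand-ins for CC's real classes, and a function's parent is assumed to be a class.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for CC's token types; the real Token and
// TokenTree classes in Code::Blocks carry far more state.
enum TokenKind { tkClass = 1, tkFunction = 2, tkVariable = 4 };

struct Token
{
    TokenKind        kind;
    std::string      name;
    int              parent;    // index into the tree, -1 if none
    std::vector<int> children;  // indices into the tree
};

using TokenTree = std::vector<Token>;

// Appends the member variables of the class at classIdx, skipping duplicates.
static void AddMemberVariables(const TokenTree& tree, int classIdx,
                               std::vector<std::string>& vars)
{
    for (int childIdx : tree[classIdx].children)
    {
        const Token& child = tree[childIdx];
        if (child.kind != tkVariable)
            continue;
        bool known = false;
        for (const std::string& v : vars)
            if (v == child.name) { known = true; break; }
        if (!known)
            vars.push_back(child.name);
    }
}

// fileTokens: indices of the classes and functions found in the current file.
// Returns the space-separated keyword list that would be handed to Scintilla.
std::string CollectKeywords(const TokenTree& tree, const std::vector<int>& fileTokens)
{
    std::vector<std::string> vars;
    for (int idx : fileTokens)
    {
        const Token& token = tree[idx];
        if (token.kind == tkFunction)
        {
            if (token.parent != -1)  // free functions have no owning class
                AddMemberVariables(tree, token.parent, vars);
        }
        else if (token.kind == tkClass)
            AddMemberVariables(tree, idx, vars);
    }
    std::string keywords;
    for (const std::string& v : vars)
    {
        if (!keywords.empty())
            keywords += ' ';
        keywords += v;
    }
    return keywords;
}
```

A file that contains only free functions yields an empty keyword list in this sketch, which matches the observation that nothing is highlighted in C code.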

Probably you have to look in this topic: http://forums.codeblocks.org/index.php/topic,16249.0.html
Yes... the plugin there is a lot more ambitious than what I am attempting.
Is it possible to extract this code in a separate plugin or in core?
Probably not; this makes use of the token tree that CC builds to decide on the set of keywords to highlight.

For example, add a new member variable to the class in a header file, then create an inline method that uses this member variable (e.g. a getter method). This variable will be the only one not highlighted like the others until CC scans this file again... :D
Although the code is not necessarily expensive, I would prefer that it run as few times as necessary.  Do you have a recommended selection of events I should attach it to?
« Last Edit: January 17, 2013, 03:20:07 am by Alpha »

Offline Alpha

Re: Pseudo semantic highlighting
« Reply #8 on: January 17, 2013, 03:19:28 am »
This should look a little nicer on C code (highlight global vars in C), and also deals with inherited members (for C++).
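
One way to picture the handling of inherited members is a recursive walk over each class's ancestors. The second patch is not reproduced in this thread, so the following is only a sketch under assumptions: `ClassInfo` and `CollectWithInherited` are hypothetical names, and the data layout is far simpler than CC's token tree.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified class record; not the real CC Token.
struct ClassInfo
{
    std::string              name;
    std::vector<int>         ancestors;  // indices of direct base classes
    std::vector<std::string> members;    // member variable names
};

// Collects the member variables of the class at idx and of every class it
// inherits from, directly or indirectly. The visited set guards against
// diamond inheritance and malformed cyclic input.
void CollectWithInherited(const std::vector<ClassInfo>& classes, int idx,
                          std::set<int>& visited, std::set<std::string>& out)
{
    if (!visited.insert(idx).second)
        return;  // this class was already handled
    const ClassInfo& cls = classes[idx];
    out.insert(cls.members.begin(), cls.members.end());
    for (int base : cls.ancestors)
        CollectWithInherited(classes, base, visited, out);
}
```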

I added a lock on s_TokenTreeMutex (because that is what the rest of the code seems to do when walking through tokens); however, I do not exactly understand the concept of a mutex very well; is it needed here?

Offline MortenMacFly

Re: Pseudo semantic highlighting
« Reply #9 on: January 17, 2013, 04:19:37 pm »
This patch currently only hooks into the editor activated event, so you need to switch editors or close and reopen editors (after initial parsing has finished) for colors to show up.
There is also a drawback, btw: I noticed really massive slow-downs when opening an editor of a large file with many references to highlight. Do you experience the same?

Offline MortenMacFly

Re: Pseudo semantic highlighting
« Reply #10 on: January 17, 2013, 04:21:36 pm »
however, I do not exactly understand the concept of a mutex very well; is it needed here?
A mutex is used where the tree could be accessed in parallel (i.e. from another thread), to avoid freezes. Accessing the token tree normally requires a lock, unless the calling function has already acquired it. I'll have a look, but later...
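
As a generic illustration of the idea, here is a minimal sketch using the C++ standard library rather than the wxWidgets mutex that CC uses. `SharedTree`, `AddTokens`, `CountTokens`, and `RunTwoWriters` are hypothetical names invented for this example.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical shared structure standing in for the token tree.
struct SharedTree
{
    std::mutex               mutex;
    std::vector<std::string> tokens;
};

// Writer, e.g. a parser thread adding tokens.
void AddTokens(SharedTree& tree, const std::string& prefix, int count)
{
    for (int i = 0; i < count; ++i)
    {
        std::lock_guard<std::mutex> lock(tree.mutex);  // released at end of scope
        tree.tokens.push_back(prefix + std::to_string(i));
    }
}

// Reader, e.g. the UI thread inspecting the tree. It holds the lock while
// reading so that no writer can modify the vector at the same time.
std::size_t CountTokens(SharedTree& tree)
{
    std::lock_guard<std::mutex> lock(tree.mutex);
    return tree.tokens.size();
}

// Runs two writers in parallel and returns the final token count.
std::size_t RunTwoWriters(int perThread)
{
    SharedTree tree;
    std::thread a(AddTokens, std::ref(tree), std::string("a"), perThread);
    std::thread b(AddTokens, std::ref(tree), std::string("b"), perThread);
    a.join();
    b.join();
    return CountTokens(tree);
}
```

Without the lock, the two writers could modify the vector at the same moment, which is undefined behaviour. Note also that `std::mutex` is not recursive: a function must not lock it again if its caller already holds it, which is why it matters whether the caller has taken the lock.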

Offline Alpha

Re: Pseudo semantic highlighting
« Reply #11 on: January 17, 2013, 04:45:07 pm »
There is also a drawback, btw: I noticed really massive slow-downs when opening an editor of a large file with many references to highlight. Do you experience the same?
I have not tried opening anything extremely large yet... is the slow-down constant, or is it a pause when you switch to the tab?  (I could probably increase performance by switching to a hash instead of an array to ensure unique entries.)
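
The hash idea could look roughly like this. The sketch uses `std::unordered_set` and `std::vector` instead of wxWidgets containers, and `UniqueNames` is a hypothetical helper name.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Returns the names with duplicates removed, keeping first-seen order.
// The unordered_set makes each "seen before?" check O(1) on average,
// whereas a linear search through the output array is O(n) per check.
std::vector<std::string> UniqueNames(const std::vector<std::string>& names)
{
    std::unordered_set<std::string> seen;
    std::vector<std::string> unique;
    for (const std::string& name : names)
    {
        if (seen.insert(name).second)  // true only for a name not seen before
            unique.push_back(name);
    }
    return unique;
}
```

With n names, the array-based check costs on the order of n² comparisons in the worst case, so the difference should only become noticeable on files with many member variables.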

This patch currently only hooks into the editor activated event, so you need to switch editors or close and reopen editors (after initial parsing has finished) for colors to show up.
I forgot to mention, this second patch adds one other event: color all open editors when parsing completes.

Offline MortenMacFly

Re: Pseudo semantic highlighting
« Reply #12 on: January 17, 2013, 06:52:55 pm »
I have not tried opening anything extremely large yet... is the slow-down constant, or is it a pause when you switch to the tab?
It seems to happen as soon as I switch... I'll report back once I have played with the second one.

Offline Alpha

Re: Pseudo semantic highlighting
« Reply #13 on: January 18, 2013, 04:36:00 am »
Debug timing code attached.  Which algorithm yields better performance (especially on larger files where performance actually matters) on your machine?

Keep in mind that the first run on an editor will be skewed because:
Code
        if (token->m_Ancestors.empty())
            tree->RecalcInheritanceChain(token);
will change performance after the first run (so only pay attention to later runs on an editor).
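
To compare algorithms fairly under that caveat, one option is to time several runs and discard the first. The sketch below uses `std::chrono` rather than whatever stopwatch the attached timing code uses; `TimeRuns` and `AverageAfterWarmup` are hypothetical helper names.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Runs work() `runs` times and returns the duration of each run in microseconds.
std::vector<long long> TimeRuns(const std::function<void()>& work, int runs)
{
    std::vector<long long> micros;
    for (int i = 0; i < runs; ++i)
    {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        micros.push_back(
            std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count());
    }
    return micros;
}

// Average of all runs except the first, which pays for one-time setup
// such as building a cached inheritance chain.
double AverageAfterWarmup(const std::vector<long long>& micros)
{
    if (micros.size() < 2)
        return 0.0;  // nothing left once the warm-up run is dropped
    long long sum = 0;
    for (std::size_t i = 1; i < micros.size(); ++i)
        sum += micros[i];
    return static_cast<double>(sum) / static_cast<double>(micros.size() - 1);
}
```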

Offline ollydbg

Re: Pseudo semantic highlighting
« Reply #14 on: January 31, 2013, 04:06:30 pm »
You added four stopwatches, and four types of tokens (keywords) are colourised. Can you tell me which kind of token each stopwatch measures?
1. Variable?
2. Function?
3. Class?
4. ?
I'm totally confused.

When testing, I saw that scanning only the current file's tokens is NOT sufficient, e.g.

Code
class MyFrame: public wxFrame

You can see that "wxFrame" will not be colourised because its token belongs to another source file/header.
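
The effect can be reproduced with a toy model of a per-file lookup. Everything here is hypothetical: `FileToken` and `NamesInFile` are invented for illustration, and the file names in the usage below are made up.

```cpp
#include <string>
#include <vector>

// Hypothetical token record: a name plus the file that declares it.
struct FileToken
{
    std::string name;
    std::string file;
};

// Mimics a per-file lookup: returns only the names declared in `file`.
std::vector<std::string> NamesInFile(const std::vector<FileToken>& tokens,
                                     const std::string& file)
{
    std::vector<std::string> names;
    for (const FileToken& token : tokens)
        if (token.file == file)
            names.push_back(token.name);
    return names;
}
```

If the token list holds `MyFrame` declared in `main.h` and `wxFrame` declared in some other header, `NamesInFile(tokens, "main.h")` returns only `MyFrame`, so the base class never reaches the keyword list.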