Author Topic: Kermit attacks 7280! or hilighting rules are highlighting words it shouldn't...  (Read 14906 times)

Offline oBFusCATed

  • Developer
  • Lives here!
  • *****
  • Posts: 13413
Compile-time option is harsh, since the lexers aren't compiled, they're just xml files.
Can we have 2 lexers? One with STL and one without?
(most of the time I ignore long posts)
[strangers don't send me private messages, I'll ignore them; post a topic in the forum, but first read the rules!]

Offline Freem

  • Almost regular
  • **
  • Posts: 219
Maybe there is something that can be done.
How can the compiler know that we are using STL functions/methods when we call, for example, "cin"?
It checks whether there is a "std::" prefix, or a "using namespace std;" earlier in the file.
If the user prefixes calls with the std namespace, just use a standard lexer. If they wrote "using namespace std;" earlier, use another lexer, as oBFusCATed said?

I know that this does not cover the great majority of cases, where we just call a method on an object. But for a few cases, maybe a workaround is possible? (Something is better than nothing. BTW, we have a proverb that says what I mean, but I don't know how to translate it.)

I think I have read somewhere here that Scintilla's highlighting function cannot use context, so maybe I just said something really stupid...

Offline oBFusCATed

  • Developer
  • Lives here!
  • *****
  • Posts: 13413
Quote from: Freem
If the user prefixes calls with the std namespace, just use a standard lexer. If they wrote "using namespace std;" earlier, use another lexer, as oBFusCATed said?

This is not what I said. What you're describing is semantic highlighting, which is hard to do and not possible at the moment.
What I suggested is to have two lexers, and the user can choose one or the other in the options.

Offline Freem

  • Almost regular
  • **
  • Posts: 219
Ok, I understand. Sorry for the useless post.

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
Not useless at all. What you suggested is how it should be in the first place. Unluckily, it's not how the factory-supplied lexers in Scintilla work. They match words (not even patterns) and don't have any relation to semantic analysis. And, unluckily, you can't even add begin(), so int begin = 1; will drop through because ( and ) are not "word characters", they'll just be ignored/stripped off. Though I'm not sure whether it's the code that parses the xml files (i.e. the one we wrote) or the code inside Scintilla that does this. This is actually something that might be possible to fix rather easily.
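
To illustrate the word-matching limitation described above, here is a minimal sketch. It is not Scintilla code: the names classifyWords, Style and Token are made up for this example. It only shows why a bare word list cannot tell a variable called begin from a call to begin(): the parentheses merely end the word and never take part in the lookup.

```cpp
#include <cassert>
#include <cctype>
#include <set>
#include <string>
#include <vector>

// Hypothetical style ids, standing in for Scintilla's SCE_C_* constants.
enum class Style { Default, Keyword };

struct Token {
    std::string text;
    Style style;
};

// Split the input into identifier-like words and look each one up in a
// plain word list. Anything that is not a word character, such as '(' or
// ')', only ends the current word; it never takes part in the lookup.
std::vector<Token> classifyWords(const std::string& source,
                                 const std::set<std::string>& keywords)
{
    std::vector<Token> tokens;
    std::string word;
    auto flush = [&]() {
        if (word.empty())
            return;
        const Style style = keywords.count(word) ? Style::Keyword : Style::Default;
        tokens.push_back({word, style});
        word.clear();
    };
    for (unsigned char ch : source) {
        if (std::isalnum(ch) || ch == '_')
            word += static_cast<char>(ch);
        else
            flush();
    }
    flush();
    return tokens;
}

// Both uses of "begin" get the same style, because only the bare word is matched.
inline void demoWordMatching()
{
    const std::set<std::string> keywords = {"begin"};
    const auto decl = classifyWords("int begin = 1;", keywords);
    const auto call = classifyWords("v.begin();", keywords);
    assert(decl[1].text == "begin" && decl[1].style == Style::Keyword);
    assert(call[1].text == "begin" && call[1].style == Style::Keyword);
}
```

With "begin" in the word list, both "int begin = 1;" and "v.begin();" end up with the word styled as a keyword, which is exactly the unwanted highlighting this thread is about.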

But, all in all, we would really need to write a custom Scintilla lexer (the actual lexer program, not a definition file... harsh, but possible) that is intertwined with code completion for something that is truly good and nice for everyone. Then of course, disabling CC would make syntax coloring fail...
"We should forget about small efficiencies, say about 97% of the time: Premature quotation is the root of public humiliation."

Offline Jenna

  • Administrator
  • Lives here!
  • *****
  • Posts: 7255
Quote from: thomas
Then of course, disabling CC would make syntax coloring fail...

Or the default lexers are used as fallback.

Offline oBFusCATed

  • Developer
  • Lives here!
  • *****
  • Posts: 13413
Can you go back to the two lexer solution, as this is the easiest one, if possible?

Offline Alpha

  • Developer
  • Lives here!
  • *****
  • Posts: 1513
Quote from: thomas
But, all in all, we would really need to write a custom Scintilla lexer (the actual lexer program, not a definition file... harsh, but possible) that is intertwined with code completion for something that is truly good and nice for everyone.

If something like this is attempted, I have a feature suggestion: an option could be activated to cause Code::Blocks to collect the names of variables, functions, and/or classes the user declares, then highlight them (probably something like a very dark red so it does not get in the way, just makes it visible).

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5930
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Quote from: Alpha
Quote from: thomas
But, all in all, we would really need to write a custom Scintilla lexer (the actual lexer program, not a definition file... harsh, but possible) that is intertwined with code completion for something that is truly good and nice for everyone.

If something like this is attempted, I have a feature suggestion: an option could be activated to cause Code::Blocks to collect the names of variables, functions, and/or classes the user declares, then highlight them (probably something like a very dark red so it does not get in the way, just makes it visible).

Scintilla's built-in lexer only does some text matching, as thomas said. When it finds an identifier, it just checks whether it is in one of the keyword groups, and paints it with the specified color.

See the related code:
sdk\wxscintilla\src\scintilla\lexers\LexCPP.cxx
Code
// Determine if the current state should terminate.
switch (MaskActive(sc.state)) {
    case SCE_C_OPERATOR:
        sc.SetState(SCE_C_DEFAULT|activitySet);
        break;
    case SCE_C_NUMBER:
        // We accept almost anything because of hex. and number suffixes
        if (!(setWord.Contains(sc.ch) || ((sc.ch == '+' || sc.ch == '-') && (sc.chPrev == 'e' || sc.chPrev == 'E')))) {
            sc.SetState(SCE_C_DEFAULT|activitySet);
        }
        break;
    case SCE_C_IDENTIFIER:
        if (!setWord.Contains(sc.ch) || (sc.ch == '.')) {
            char s[1000];
            if (caseSensitive) {
                sc.GetCurrent(s, sizeof(s));
            } else {
                sc.GetCurrentLowered(s, sizeof(s));
            }
            if (keywords.InList(s)) {
                lastWordWasUUID = strcmp(s, "uuid") == 0;
                sc.ChangeState(SCE_C_WORD|activitySet);
            } else if (keywords2.InList(s)) {
                sc.ChangeState(SCE_C_WORD2|activitySet);
            } else if (keywords4.InList(s)) {
                sc.ChangeState(SCE_C_GLOBALCLASS|activitySet);
            }
            // ... (excerpt ends here)

And the keyword groups are loaded when the editor is initialized, see:
Code
void EditorColourSet::Apply(HighlightLanguage lang, cbStyledTextCtrl* control)
{
    if (!control)
        return;
    control->StyleClearAll();

    if (lang == HL_NONE)
        return;

    // first load the default colours to all styles used by the actual lexer (ignoring some built-in styles)
    OptionColour* defaults = GetOptionByName(lang, _T("Default"));
    OptionSet& mset = m_Sets[lang];
    control->SetLexer(mset.m_Lexers);
    control->SetStyleBits(control->GetStyleBitsNeeded());
    if (defaults)
    {
        int countStyles = 1 << control->GetStyleBits();
        // walk until countStyles, otherwise the background-colour is only set for characters,
        // not for empty background
        for (int i = 0; i <= countStyles; ++i)
        {
            if (i < 33 || (i > 39 && i < wxSCI_STYLE_MAX))
                DoApplyStyle(control, i, defaults);
        }
    }
    // for some strange reason, when switching styles, the line numbering changes colour
    // too, though we didn't ask it to...
    // this makes sure it stays the correct colour
    control->StyleSetForeground(wxSCI_STYLE_LINENUMBER, wxSystemSettings::GetColour(wxSYS_COLOUR_BTNTEXT));

    for (unsigned int i = 0; i < mset.m_Colours.GetCount(); ++i)
    {
        OptionColour* opt = mset.m_Colours.Item(i);

        if (opt->isStyle)
        {
            DoApplyStyle(control, opt->value, opt);
        }
        else
        {
            if (opt->value == cbHIGHLIGHT_LINE)
            {
                control->SetCaretLineBackground(opt->back);
                Manager::Get()->GetConfigManager(_T("editor"))->Write(_T("/highlight_caret_line_colour"), opt->back);
            }
            else if (opt->value == cbSELECTION)
            {
                if (opt->back != wxNullColour)
                {
                    control->SetSelBackground(true, opt->back);
//                    Manager::Get()->GetConfigManager(_T("editor"))->Write(_T("/selection_colour"), opt->back);
                }
                else
                    control->SetSelBackground(false, wxColour(0xC0, 0xC0, 0xC0));

                if (opt->fore != wxNullColour)
                {
                    control->SetSelForeground(true, opt->fore);
//                    Manager::Get()->GetConfigManager(_T("editor"))->Write(_T("/selection_fgcolour"), opt->fore);
                }
                else
                    control->SetSelForeground(false, *wxBLACK);
            }
//            else
//            {
//                control->MarkerDefine(-opt->value, 1);
//                control->MarkerSetBackground(-opt->value, opt->back);
//            }
        }
    }
    for (int i = 0; i <= wxSCI_KEYWORDSET_MAX; ++i)
    {
        control->SetKeyWords(i, mset.m_Keywords[i]);
    }
    control->Colourise(0, -1); // the *most* important part!
}

So, if we had an option to change mset.m_Keywords[i], we could partly solve the problem.
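
A rough sketch of what such an option could look like, tying in with the "two lexers" idea from earlier in the thread. Everything here is an assumption for illustration only: KeywordProfile, userKeywordsFor and the shortened word lists are made up and are not existing Code::Blocks settings or lexer contents.

```cpp
#include <cassert>
#include <string>

// Hypothetical user option; NOT an existing Code::Blocks setting.
enum class KeywordProfile { WithSTL, WithoutSTL };

// Build the space-separated word list for one keyword group, depending on
// the chosen profile. The word lists are illustrative and heavily shortened.
std::string userKeywordsFor(KeywordProfile profile)
{
    const std::string base = "size_t wchar_t FILE";
    const std::string stl  = "string vector map cin cout begin end";
    if (profile == KeywordProfile::WithSTL)
        return base + " " + stl;
    return base;
}

// Small self-check: without the STL profile, "begin" is no longer in the list.
inline void demoKeywordProfiles()
{
    assert(userKeywordsFor(KeywordProfile::WithSTL).find("begin") != std::string::npos);
    assert(userKeywordsFor(KeywordProfile::WithoutSTL).find("begin") == std::string::npos);
}
```

In Apply(), the loop that calls control->SetKeyWords(i, mset.m_Keywords[i]) is where a string chosen like this could presumably be passed instead, but how the option would be stored and exposed in the settings dialog is left open here.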
Scintilla's lexer does not look at syntax or semantics, so if you have a function named "xxx" and a local variable named "xxx", they will be painted in the same color.
We could also implement our own lexer and set different colors at different positions, see:
http://www.scintilla.org/Lexer.txt
Mostly, it needs to call one function: ColourTo(position, colorStyle);

Code
An alternative would be to use a "state-based" approach.  The outer loop
would iterate over states, like this:

  lengthDoc = startPos+length ;
  for ( unsigned int i = startPos ;; ) {
    char ch = styler.SafeGetCharAt(i);
    int new_state = 0 ;
    switch ( state ) {
      // scanners set new_state if they set the next state.
      case state_1: << scan to the end of state 1 >> break ;
      case state_2: << scan to the end of state 2 >> break ;
      case default_state:
        << scan to the next non-default state and set new_state >>
    }
    styler.ColourTo(i, state);
    if ( i >= lengthDoc ) break ;
    if ( ! new_state ) {
      ch = styler.SafeGetCharAt(i);
      << set state based on ch in the default state >>
    }
  }
  styler.ColourTo(lengthDoc - 1, state);

This approach might seem to be more natural.  State scanners are simpler
than character scanners because less needs to be done.  For example,
there is no need to test for the start of a C string inside the scanner
for a C comment.  Also this way makes it natural to define routines that
could be used by more than one scanner; for example, a scanToEndOfLine
routine.
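
For anyone who wants to play with the state-based idea outside of Scintilla, here is a small standalone sketch. It is only an approximation of the outline above: colourTo is a stand-in for styler.ColourTo(), the three states and the function styleText are made up for this example, and only C block comments and double-quoted strings are recognised.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical states for this sketch; a real lexer would use SCE_C_* values.
enum State { Default = 0, Comment = 1, String = 2 };

// Stand-in for styler.ColourTo(): styles every position from 'from'
// up to and including 'upTo' with the given state, then advances 'from'.
static void colourTo(std::vector<int>& styles, std::size_t& from,
                     std::size_t upTo, int state)
{
    for (; from <= upTo && from < styles.size(); ++from)
        styles[from] = state;
}

// Outer loop iterates over runs (states), not over single characters.
std::vector<int> styleText(const std::string& text)
{
    const std::size_t n = text.size();
    std::vector<int> styles(n, Default);
    std::size_t styled = 0;  // first position not yet coloured
    std::size_t i = 0;
    while (i < n) {
        State state = Default;
        std::size_t end = i + 1;  // one past the last character of this run
        if (text.compare(i, 2, "/*") == 0) {
            // Scan to the end of the comment (or of the text if unterminated).
            state = Comment;
            const std::size_t close = text.find("*/", i + 2);
            end = (close == std::string::npos) ? n : close + 2;
        } else if (text[i] == '"') {
            // Scan to the closing quote, skipping backslash escapes.
            state = String;
            while (end < n && text[end] != '"') {
                if (text[end] == '\\' && end + 1 < n)
                    ++end;
                ++end;
            }
            if (end < n)
                ++end;  // include the closing quote
        } else {
            // Default state: scan to the start of the next non-default state.
            while (end < n && text[end] != '"' && text.compare(end, 2, "/*") != 0)
                ++end;
        }
        colourTo(styles, styled, end - 1, state);
        i = end;
    }
    return styles;
}
```

Calling styleText on a short line returns one style value per character. As the quoted text says, the comment scanner never has to test for the start of a string, and the string scanner never has to test for the start of a comment.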

I don't know how CC (code completion) could help with this.
« Last Edit: September 04, 2011, 12:24:33 pm by ollydbg »
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Offline oBFusCATed

  • Developer
  • Lives here!
  • *****
  • Posts: 13413
Back to the two lexer solution pretty please...
Semantic highlight won't happen in the next month (I guess even more), so it is not a solution to the current problem.

Offline Jenna

  • Administrator
  • Lives here!
  • *****
  • Posts: 7255
Quote from: oBFusCATed
Back to the two lexer solution pretty please...
Semantic highlight won't happen in the next month (I guess even more), so it is not a solution to the current problem.

Or three?
One for C code; there was such a request in the last few days, if I remember correctly.