Author Topic: Kermit attacks 7280! or highlighting rules are highlighting words they shouldn't...  (Read 19367 times)

Offline ouch

  • Almost regular
  • **
  • Posts: 223
words like "size", "erase" and "function"

wxWidgets commands use some of these words, for example with wxString:

test.erase();

The word erase is green.

The word erase is a valid standard highlighting term, yes, but not in this context.

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
Yes, that is right. This member is one of the "STL compatibility" functions or whatever wxWidgets calls them.

As of revision 7263/7264, most (hopefully all) of STL, C++03, and C++0x has been added to the lexer.

Obviously the downside is that if something looks like STL, it is highlighted as such.
"We should forget about small efficiencies, say about 97% of the time: Premature quotation is the root of public humiliation."

Offline ouch

  • Almost regular
  • **
  • Posts: 223
Are there any plans to fix this?

Just checking whether there is a space between the keyword and the surrounding code would help greatly.

Offline GeO

  • Multiple posting newcomer
  • *
  • Posts: 51
are there any plans to fix this?

Just set "User keyword" to black and uncheck bold, and you're done.

Greets GeO


Offline ouch

  • Almost regular
  • **
  • Posts: 223
Well that would mean they (and other stuff) wouldn't get highlighted at all. STL commands should get highlighted but not if the words are actually members and variables in a specific context.

But I think just checking for spaces would stop most of the highlighting errors.

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
Syntax highlighting is not context sensitive (at least not in this sense), unluckily. That's not how it works. It matches some control structures, such as comments, and it can distinguish balanced and unbalanced braces, but other than that it simply matches substrings.

So unluckily, I see no way to fix this, other than rewriting Scintilla's syntax highlighter entirely and combining it with a parser that is superior to the one we currently use for code completion. In one word: no.

(of course removing all of the STL highlighting would be another option)
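To illustrate the word matching described above, here is a minimal sketch in C++ of a context-free keyword lookup. It is not Scintilla's code; the word list `kStlWords` and the function `highlightedWords` are hypothetical names made up for this example. Any identifier found in the list is reported, whatever surrounds it.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a lexer keyword list (the real lists live in the lexer XML files).
const std::set<std::string> kStlWords = {"begin", "end", "erase", "size", "function"};

// Returns every word in `line` that plain word matching would colour.
std::vector<std::string> highlightedWords(const std::string& line)
{
    std::vector<std::string> hits;
    std::string word;
    // Walk one position past the end so the last word is flushed too.
    for (std::size_t i = 0; i <= line.size(); ++i)
    {
        const unsigned char ch = i < line.size() ? static_cast<unsigned char>(line[i]) : ' ';
        if (std::isalnum(ch) || ch == '_')
        {
            word += static_cast<char>(ch);
        }
        else
        {
            // A non-word character ends the current word; no context is consulted.
            if (kStlWords.count(word))
                hits.push_back(word);
            word.clear();
        }
    }
    return hits;
}
```

With this approach `test.erase();` and `int begin = 10;` both produce a hit, which is the behaviour reported in this thread.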

Offline ouch

  • Almost regular
  • **
  • Posts: 223
How about just distributing 2 different lexers? One with the STL stuff and one without. Best of both worlds. ;)

Or put them in a different set. That would work too.
« Last Edit: July 07, 2011, 09:12:27 pm by ouch »

Online killerbot

  • Administrator
  • Lives here!
  • *****
  • Posts: 5529
@Thomas: I just noticed some are missing: cbegin, cend, crbegin, crend.

Could you add them ?

Offline oBFusCATed

  • Developer
  • Lives here!
  • *****
  • Posts: 13406
/off
killerbot: are you using C++0x in production or in personal test projects?
(most of the time I ignore long posts)
[strangers don't send me private messages, I'll ignore them; post a topic in the forum, but first read the rules!]

Online killerbot

  • Administrator
  • Lives here!
  • *****
  • Posts: 5529
in production :-)

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
@Thomas : I just noted some missing
rev 7282

Online killerbot

  • Administrator
  • Lives here!
  • *****
  • Posts: 5529
Thomas, could you also add the atomic types ? I noticed they are not colored yet.

Offline oBFusCATed

  • Developer
  • Lives here!
  • *****
  • Posts: 13406
Thomas:
After using newer C::B builds that have this feature for a while, I think it should be removed
or you should provide a way to disable it.
At the moment it makes the code really unreadable, because random words are made bold and highlighted.
It is just noise, no real added value.

This is my view of course :)

P.S. I'm OK with a compile-time-only disable mechanism :)

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
Feel free to revert, I find it useful... but tastes surely differ. I can always keep my local copy, no problem.

Compile-time option is harsh, since the lexers aren't compiled, they're just xml files.

Online killerbot

  • Administrator
  • Lives here!
  • *****
  • Posts: 5529
It is OK, but it would be really nice if it were smarter: begin() on a container is fine to have coloured, but in int begin = 10; the word begin should not be coloured.

I guess the best solution is a regular setting, so that everyone can choose.

Offline oBFusCATed

  • Developer
  • Lives here!
  • *****
  • Posts: 13406
Compile-time option is harsh, since the lexers aren't compiled, they're just xml files.
Can we have 2 lexers? One with STL and one without?

Offline Freem

  • Almost regular
  • **
  • Posts: 218
Maybe there is something that can be done.
How does the compiler know whether we are using STL functions/methods when we write, for example, "cin"?
It checks whether there is a "std::" before it, or a "using namespace std;" earlier in the file.
If the user prefixes calls with the std namespace, just use a standard lexer. If he wrote "using namespace std;" earlier, use another lexer, as oBFusCATed said?

I know that this does not cover the great majority of cases, where we just call a method on an object. But for a few cases, maybe a workaround is possible? (Something is better than nothing. BTW, we have a proverb that says what I mean, but I don't know how to translate it.)

I think I have read somewhere here that Scintilla's highlighting function cannot use the context, so maybe I just said something really stupid...

Offline oBFusCATed

  • Developer
  • Lives here!
  • *****
  • Posts: 13406
If the user prefixes calls with the std namespace, just use a standard lexer. If he wrote "using namespace std;" earlier, use another lexer, as oBFusCATed said?
This is not what I said. What you're talking about is semantic highlighting, which is hard to do and not possible at the moment.
What I said is to have two lexers, so the user can choose one or the other in the options.

Offline Freem

  • Almost regular
  • **
  • Posts: 218
Ok, I understand. Sorry for the useless post.

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
Not useless at all. What you suggested is how it should be in the first place. Unluckily, it's not how the factory-supplied lexers in Scintilla work. They match words (not even patterns) and don't have any relation to semantic analysis. And, unluckily, you can't even add begin() as a keyword so that int begin = 1; would fall through, because ( and ) are not "word characters"; they'll just be ignored/stripped off. Though I'm not sure whether it's the code that parses the xml files (i.e. the one we wrote) or the code inside Scintilla that does this. This is actually something that might be possible to fix rather easily.

But, all in all, we would really need to write a custom Scintilla lexer (the actual lexer program, not a definition file... harsh, but possible) that is intertwined with code completion for something that is truly good and nice for everyone. Then of course, disabling CC would make syntax coloring fail...
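One cheap heuristic for the begin() case, sketched here under stated assumptions: only colour a listed word when the next non-blank character is an opening parenthesis. The names `kCallWords` and `highlightedCalls` are hypothetical, and this is standalone C++, not a change to the XML loader or to Scintilla.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical list of words that should only be coloured when used as a call.
const std::set<std::string> kCallWords = {"begin", "end", "erase", "size"};

// Returns the listed words in `line` that are followed by '(' (blanks allowed in between).
std::vector<std::string> highlightedCalls(const std::string& line)
{
    auto isWordChar = [](char c) {
        return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
    };

    std::vector<std::string> hits;
    std::size_t i = 0;
    while (i < line.size())
    {
        if (!isWordChar(line[i]))
        {
            ++i;
            continue;
        }
        // Collect the whole run of word characters.
        const std::size_t start = i;
        while (i < line.size() && isWordChar(line[i]))
            ++i;
        const std::string word = line.substr(start, i - start);

        // Look ahead past blanks for an opening parenthesis.
        std::size_t j = i;
        while (j < line.size() && std::isspace(static_cast<unsigned char>(line[j])))
            ++j;
        if (j < line.size() && line[j] == '(' && kCallWords.count(word))
            hits.push_back(word);
    }
    return hits;
}
```

This would leave `int begin = 10;` alone, but `test.erase();` on a wxString would still be coloured, since telling the two apart needs type information.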

Offline Jenna

  • Administrator
  • Lives here!
  • *****
  • Posts: 7252
Then of course, disabling CC would make syntax coloring fail...

Or the default lexers are used as fallback.

Offline oBFusCATed

  • Developer
  • Lives here!
  • *****
  • Posts: 13406
Can you go back to the two-lexer solution, as this is the easiest one, if possible?

Offline Alpha

  • Developer
  • Lives here!
  • *****
  • Posts: 1513
But, all in all, we would really need to write a custom Scintilla lexer (the actual lexer program, not a definition file... harsh, but possible) that is intertwined with code completion for something that is truly good and nice for everyone.
If something like this is attempted, I have a feature suggestion: an option could be activated to cause Code::Blocks to collect the names of variables, functions, and/or classes the user declares, then highlight them (probably something like a very dark red so it does not get in the way, just makes it visible).

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 6079
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
But, all in all, we would really need to write a custom Scintilla lexer (the actual lexer program, not a definition file... harsh, but possible) that is intertwined with code completion for something that is truly good and nice for everyone.
If something like this is attempted, I have a feature suggestion: an option could be activated to cause Code::Blocks to collect the names of variables, functions, and/or classes the user declares, then highlight them (probably something like a very dark red so it does not get in the way, just makes it visible).

Scintilla's built-in lexer only does some text matching, as thomas said. When it finds an identifier, it just checks whether it is in one of the keyword groups and paints it with the specified colour.

See the related code:
sdk\wxscintilla\src\scintilla\lexers\LexCPP.cxx
Code
// Determine if the current state should terminate.
switch (MaskActive(sc.state)) {
    case SCE_C_OPERATOR:
        sc.SetState(SCE_C_DEFAULT|activitySet);
        break;
    case SCE_C_NUMBER:
        // We accept almost anything because of hex. and number suffixes
        if (!(setWord.Contains(sc.ch) || ((sc.ch == '+' || sc.ch == '-') && (sc.chPrev == 'e' || sc.chPrev == 'E')))) {
            sc.SetState(SCE_C_DEFAULT|activitySet);
        }
        break;
    case SCE_C_IDENTIFIER:
        if (!setWord.Contains(sc.ch) || (sc.ch == '.')) {
            char s[1000];
            if (caseSensitive) {
                sc.GetCurrent(s, sizeof(s));
            } else {
                sc.GetCurrentLowered(s, sizeof(s));
            }
            // The identifier is only looked up in the keyword lists; no context is checked.
            if (keywords.InList(s)) {
                lastWordWasUUID = strcmp(s, "uuid") == 0;
                sc.ChangeState(SCE_C_WORD|activitySet);
            } else if (keywords2.InList(s)) {
                sc.ChangeState(SCE_C_WORD2|activitySet);
            } else if (keywords4.InList(s)) {
                sc.ChangeState(SCE_C_GLOBALCLASS|activitySet);
            }
            // (excerpt ends here; the rest of the switch is not shown)
And the keyword groups are loaded when the editor is initialized, see:
Code
void EditorColourSet::Apply(HighlightLanguage lang, cbStyledTextCtrl* control)
{
    if (!control)
        return;
    control->StyleClearAll();

    if (lang == HL_NONE)
        return;

    // first load the default colours to all styles used by the actual lexer (ignoring some built-in styles)
    OptionColour* defaults = GetOptionByName(lang, _T("Default"));
    OptionSet& mset = m_Sets[lang];
    control->SetLexer(mset.m_Lexers);
    control->SetStyleBits(control->GetStyleBitsNeeded());
    if (defaults)
    {
        int countStyles = 1 << control->GetStyleBits();
        // walk until countStyles, otherwise the background-colour is only set for characters,
        // not for empty background
        for (int i = 0; i <= countStyles; ++i)
        {
            if (i < 33 || (i > 39 && i < wxSCI_STYLE_MAX))
                DoApplyStyle(control, i, defaults);
        }
    }
    // for some strange reason, when switching styles, the line numbering changes colour
    // too, though we didn't ask it to...
    // this makes sure it stays the correct colour
    control->StyleSetForeground(wxSCI_STYLE_LINENUMBER, wxSystemSettings::GetColour(wxSYS_COLOUR_BTNTEXT));

    for (unsigned int i = 0; i < mset.m_Colours.GetCount(); ++i)
    {
        OptionColour* opt = mset.m_Colours.Item(i);

        if (opt->isStyle)
        {
            DoApplyStyle(control, opt->value, opt);
        }
        else
        {
            if (opt->value == cbHIGHLIGHT_LINE)
            {
                control->SetCaretLineBackground(opt->back);
                Manager::Get()->GetConfigManager(_T("editor"))->Write(_T("/highlight_caret_line_colour"), opt->back);
            }
            else if (opt->value == cbSELECTION)
            {
                if (opt->back != wxNullColour)
                {
                    control->SetSelBackground(true, opt->back);
//                    Manager::Get()->GetConfigManager(_T("editor"))->Write(_T("/selection_colour"), opt->back);
                }
                else
                    control->SetSelBackground(false, wxColour(0xC0, 0xC0, 0xC0));

                if (opt->fore != wxNullColour)
                {
                    control->SetSelForeground(true, opt->fore);
//                    Manager::Get()->GetConfigManager(_T("editor"))->Write(_T("/selection_fgcolour"), opt->fore);
                }
                else
                    control->SetSelForeground(false, *wxBLACK);
            }
//            else
//            {
//                control->MarkerDefine(-opt->value, 1);
//                control->MarkerSetBackground(-opt->value, opt->back);
//            }
        }
    }
    for (int i = 0; i <= wxSCI_KEYWORDSET_MAX; ++i)
    {
        control->SetKeyWords(i, mset.m_Keywords[i]);
    }
    control->Colourise(0, -1); // the *most* important part!
}

So, if we had some option to change mset.m_Keywords[i], then we could partly solve the problem.
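A rough sketch of such an option, assuming the STL words live in a keyword set of their own: blank that set before the loop that calls SetKeyWords. The container type, the set count of 9, the index 1 and the flag name `highlightStl` are assumptions for illustration only, not the actual Code::Blocks data structures or settings.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical model of the per-language keyword sets (mset.m_Keywords in the code above).
// The count of 9 is an assumption made for this sketch.
using KeywordSets = std::array<std::string, 9>;

// Returns the sets that would be handed to SetKeyWords.
// When `highlightStl` is false, the set at `stlSetIndex` is emptied so none of its words match.
KeywordSets effectiveKeywords(KeywordSets sets, std::size_t stlSetIndex, bool highlightStl)
{
    if (!highlightStl && stlSetIndex < sets.size())
        sets[stlSetIndex].clear();
    return sets;
}
```

The lexer itself stays untouched in this approach; only the word lists it receives differ, which is close to the two-lexer idea without duplicating the XML file.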
Scintilla's lexer does not look at the syntax or the semantics, so if you have a function named "xxx" and a local variable named "xxx", they will be painted in the same colour.
We could also implement our own lexer and set different colours at different positions, see:
http://www.scintilla.org/Lexer.txt
Mostly, it needs to call one function: ColourTo(position, colorStyle);

Code
An alternative would be to use a "state-based" approach.  The outer loop
would iterate over states, like this:

  lengthDoc = startPos+length ;
  for ( unsigned int i = startPos ;; ) {
    char ch = styler.SafeGetCharAt(i);
    int new_state = 0 ;
    switch ( state ) {
      // scanners set new_state if they set the next state.
      case state_1: << scan to the end of state 1 >> break ;
      case state_2: << scan to the end of state 2 >> break ;
      case default_state:
        << scan to the next non-default state and set new_state >>
    }
    styler.ColourTo(i, state);
    if ( i >= lengthDoc ) break ;
    if ( ! new_state ) {
      ch = styler.SafeGetCharAt(i);
      << set state based on ch in the default state >>
    }
  }
  styler.ColourTo(lengthDoc - 1, state);

This approach might seem to be more natural.  State scanners are simpler
than character scanners because less needs to be done.  For example,
there is no need to test for the start of a C string inside the scanner
for a C comment.  Also this way makes it natural to define routines that
could be used by more than one scanner; for example, a scanToEndOfLine
routine.
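As a runnable illustration of the state-based idea quoted above, here is a small C++ sketch. It does not use the Scintilla API: instead of calling styler.ColourTo it writes one style character per input character into a string. The three states and the names `styleText`, `kDefault`, `kComment` and `kString` are made up for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical style codes, one character per style.
constexpr char kDefault = 'd';
constexpr char kComment = 'c';
constexpr char kString  = 's';

// Returns a string of the same length as `text` holding the style of each character.
std::string styleText(const std::string& text)
{
    std::string styles(text.size(), kDefault);
    std::size_t i = 0;
    while (i < text.size())
    {
        const std::size_t start = i;
        char state = kDefault;
        if (text.compare(i, 2, "/*") == 0)
        {
            // Comment scanner: run to the closing "*/" or to the end of the text.
            state = kComment;
            const std::size_t close = text.find("*/", i + 2);
            i = (close == std::string::npos) ? text.size() : close + 2;
        }
        else if (text[i] == '"')
        {
            // String scanner: skip escaped characters, stop after the closing quote.
            state = kString;
            ++i;
            while (i < text.size() && text[i] != '"')
                i += (text[i] == '\\' && i + 1 < text.size()) ? 2 : 1;
            if (i < text.size())
                ++i;
        }
        else
        {
            // Default scanner: run to the next character that starts another state.
            while (i < text.size() && text[i] != '"' && text.compare(i, 2, "/*") != 0)
                ++i;
        }
        // Stand-in for styler.ColourTo(i - 1, state).
        for (std::size_t k = start; k < i; ++k)
            styles[k] = state;
    }
    return styles;
}
```

Because each state has its own scanner, a quote character inside a comment is never tested as the start of a string.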

I don't know how CC can help with this.
« Last Edit: September 04, 2011, 12:24:33 pm by ollydbg »
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Offline oBFusCATed

  • Developer
  • Lives here!
  • *****
  • Posts: 13406
Back to the two-lexer solution pretty please...
Semantic highlighting won't happen within the next month (I guess even longer), so it is not a solution to the current problem.

Offline Jenna

  • Administrator
  • Lives here!
  • *****
  • Posts: 7252
Back to the two-lexer solution pretty please...
Semantic highlighting won't happen within the next month (I guess even longer), so it is not a solution to the current problem.
Or three?
One for C code; there was such a request in the last few days, if I remember right.