Topic: ² character bug

Offline pxnet

  • Single posting newcomer
  • *
  • Posts: 4
² character bug
« on: June 27, 2008, 11:30:39 pm »
Hi,

I didn't know where to post this, so I put it here.

When I type a ² in my code and then do anything else in Code::Blocks, it crashes completely.

I have version 8.02, build Feb 27 2008, 20:59:09.

Offline killerbot

  • Administrator
  • Lives here!
  • *****
  • Posts: 5490
Re: ² character bug
« Reply #1 on: June 28, 2008, 09:02:13 am »
Tried it on Linux with svn 5105: works for me.

Offline pxnet

« Reply #2 on: June 28, 2008, 12:28:40 pm »
I'm on Windows.
Just put ² in a source file (I tried .cpp and .hpp, though I don't know whether that matters), then save it. You'll then have to kill the C::B process, as it takes 100% CPU.

Offline Jenna

  • Administrator
  • Lives here!
  • *****
  • Posts: 7255
« Reply #3 on: June 28, 2008, 12:43:21 pm »
It's most likely a file-encoding problem.
What's your file encoding (you can see it in the status line)?
I just tried on W2K with "windows-1252", "iso-8859-1" and "utf-8" and had no problems with svn 5095.

Offline pxnet

« Reply #4 on: June 28, 2008, 02:02:37 pm »
I was using the "default" encoding. I just tested with ISO-8859-1, UTF-8 and the Unicode default, and the bug occurs with those as well.

OMG NOO I just lost my main cpp file with that f***ing encoding!!!

I'm not experimenting anymore!
« Last Edit: June 28, 2008, 02:30:19 pm by pxnet »

Offline XayC

  • Multiple posting newcomer
  • *
  • Posts: 94
« Reply #5 on: June 28, 2008, 04:16:06 pm »
I can confirm this and I'm going to add some details.
To reproduce this bug create a new file and insert (copy-paste is ok) the ² character and save it: Code::Blocks will start using 100% CPU and you have to kill it. Note that this happens only if the ² character is the first one in the file.

And I agree with Jens, this looks like a file-encoding problem.

XayC

Offline pxnet

« Reply #6 on: June 28, 2008, 05:18:18 pm »
Quote from: XayC
Note that this happens only if the ² character is the first one in the file.

In my case, putting it anywhere in the file is enough to make it crash.

Offline Jenna

« Reply #7 on: June 28, 2008, 05:48:59 pm »
Quote from: XayC
I can confirm this and I'm going to add some details.
To reproduce this bug create a new file and insert (copy-paste is ok) the ² character and save it: Code::Blocks will start using 100% CPU and you have to kill it. Note that this happens only if the ² character is the first one in the file.

I cannot confirm this, neither on W2K nor on Debian, even if "²" is the first character in the file.

Quote from: pxnet
I was using the "default" encoding. I just tested with ISO-8859-1, UTF-8 and the Unicode default, and the bug occurs with those as well.

What's your default encoding? It depends on your Windows and/or your C::B settings.
Are there any other special (non-ASCII) characters in the file (Cyrillic, Chinese ...)?

Offline XayC

« Reply #8 on: June 28, 2008, 06:27:39 pm »
I did some more tests and I have this problem only if the ² is the first character in the file, and I have it no matter what encoding I use (default WINDOWS-1252). If I put any other usual character before the ² it works.

Quote from: Jenna
Are there any other special (non-ASCII) characters in the file (Cyrillic, Chinese ...)?

No, depending on the test I did, either only the ² character or the ² character and some 'a' characters.

Since I can reproduce it I'm going to debug CB and try to find out what is causing this problem.

Regards, XayC

Offline Biplab

  • Developer
  • Lives here!
  • *****
  • Posts: 1874
    • Biplab's Blog
« Reply #9 on: June 28, 2008, 06:43:11 pm »
I can confirm this bug. But the encoding-handling code is not the culprit here, as the file is saved properly.

Disable the Code Completion plugin. It's causing this lock-up.
Be a part of the solution, not a part of the problem.

Offline Jenna

« Reply #10 on: June 28, 2008, 10:09:47 pm »
Quote from: Biplab
I can confirm this bug. But the encoding-handling code is not the culprit here, as the file is saved properly. Disable the Code Completion plugin. It's causing this lock-up.

I just did a complete install of C::B 8.02 with all plugins and tested it.
File saving works as normal, regardless of whether I have a "²" in the file or not.
Code Completion is enabled, and I also tested with the C::B sources because that is quite a large project, but saving behaves the same with or without the "²".
I tested with "UTF-8" and "WINDOWS-1252".

Offline Biplab

« Reply #11 on: June 29, 2008, 08:59:31 am »
Quote from: Jenna
I just did a complete install of C::B 8.02 with all plugins and tested it.
File saving works as normal, regardless of whether I have a "²" in the file or not.
Code Completion is enabled, and I also tested with the C::B sources because that is quite a large project, but saving behaves the same with or without the "²".
I tested with "UTF-8" and "WINDOWS-1252".

I tested on Windows XP, and this bug can be reproduced even with the latest revisions. On Linux, however, it is not reproducible, which makes finding the buggy code more difficult.

Offline Jenna

« Reply #12 on: June 29, 2008, 10:55:30 am »
I can now confirm this for Win XP.
It's funny: it does not happen on Linux (at least on the versions I use) or on W2K.

If I have some time, I'll try to debug it.

The file gets saved, but the notebook state does not change (the star remains), which gives a hint about where (or, maybe better, when) it happens.

Offline Jenna

« Reply #13 on: June 29, 2008, 11:03:40 pm »
It happens because wxIsdigit() treats "²" and "³" as digits on XP.
This leads to an endless loop in "tokenizer.cpp" (in "DoGetToken()"): the number branch is entered, but neither character is in the list of characters that the branch consumes, so the tokenizer never advances.
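
The stall can be modelled in isolation. This is only a minimal sketch, not the actual Code::Blocks source: isDigitLikeXp(), inNumberCharList(), consumeNumber() and stallsAt() are hypothetical names, and isDigitLikeXp() merely assumes the XP behaviour of wxIsdigit() reported in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for wxIsdigit() as reported on Windows XP in this
// thread: besides '0'..'9' it also accepts code points 178 and 179.
bool isDigitLikeXp(wchar_t c)
{
    return (c >= L'0' && c <= L'9') || c == 178 || c == 179;
}

// Same character list that the number branch passes to CharInString().
bool inNumberCharList(wchar_t c)
{
    static const std::wstring allowed = L"0123456789.abcdefABCDEFXxLl";
    return allowed.find(c) != std::wstring::npos;
}

// Simplified number branch: returns how many characters it consumed
// starting at pos, or 0 if the branch is not entered or cannot advance.
std::size_t consumeNumber(const std::wstring& buffer, std::size_t pos)
{
    assert(pos <= buffer.size());
    if (pos == buffer.size() || !isDigitLikeXp(buffer[pos]))
        return 0; // the number branch would not be taken at all
    const std::size_t start = pos;
    while (pos < buffer.size() && inNumberCharList(buffer[pos]))
        ++pos;
    return pos - start;
}

// True when the number branch is entered but consumes nothing, i.e. a
// caller that keeps asking for the next token would never finish.
bool stallsAt(const std::wstring& buffer, std::size_t pos)
{
    return pos < buffer.size()
        && isDigitLikeXp(buffer[pos])
        && consumeNumber(buffer, pos) == 0;
}
```

In this model, stallsAt() is true for a buffer that starts with character 178 and false for one that starts with an ordinary digit such as "42;".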

I can provide two patches:

The first one adds both characters to the list used by CharInString in "DoGetToken()", but that would make the behaviour differ between systems, and I think that's not good. In fact, "²" and "³" are not really digits and therefore should not be used in code.

The second patch filters out both characters and treats them as normal characters. That leads to the same behaviour on Linux, W2K and WinXP.
The characters can't be used directly in source code; instead, their character codes (178 and 179) have to be used.

First patch (not really good, but works):
Code
Index: src/plugins/codecompletion/parser/tokenizer.cpp
===================================================================
--- src/plugins/codecompletion/parser/tokenizer.cpp (revision 5106)
+++ src/plugins/codecompletion/parser/tokenizer.cpp (working copy)
@@ -560,7 +560,10 @@
     else if (wxIsdigit(CurrentChar()))
     {
         // numbers
-        while (NotEOF() && CharInString(CurrentChar(), _T("0123456789.abcdefABCDEFXxLl")))
+        wxString tmp=_T("0123456789.abcdefABCDEFXxLl");
+        tmp << wxChar(178);
+        tmp << wxChar(179);
+        while (NotEOF() && CharInString(CurrentChar(), tmp))
             MoveToNextChar();
         if (IsEOF())
             return wxEmptyString;

Second patch (better):
Code
Index: src/plugins/codecompletion/parser/tokenizer.cpp
===================================================================
--- src/plugins/codecompletion/parser/tokenizer.cpp (revision 5106)
+++ src/plugins/codecompletion/parser/tokenizer.cpp (working copy)
@@ -557,6 +557,11 @@
         m_Str = m_Buffer.Mid(start, m_TokenIndex - start);
         m_IsOperator = m_Str.IsSameAs(TokenizerConsts::operator_str);
     }
+    else if (c == 178  || c == 179)
+    {
+        m_Str = c;
+        MoveToNextChar();
+    }
     else if (wxIsdigit(CurrentChar()))
     {
         // numbers

Both patches work on Linux and Windows.
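
To illustrate the idea behind the second patch outside of C::B, here is a rough sketch of a toy tokenizer that checks for 178 and 179 before the number branch. It is not the real DoGetToken(): tokenize(), digitLike() and inNumberCharList() are hypothetical names, and digitLike() only assumes the XP behaviour described above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumption: models a wxIsdigit() that also accepts 178 and 179.
bool digitLike(wchar_t c)
{
    return (c >= L'0' && c <= L'9') || c == 178 || c == 179;
}

// Same character list that the number branch passes to CharInString().
bool inNumberCharList(wchar_t c)
{
    static const std::wstring allowed = L"0123456789.abcdefABCDEFXxLl";
    return allowed.find(c) != std::wstring::npos;
}

// Toy tokenizer: numbers become one token, everything else is split into
// single-character tokens. 178 and 179 are handled before the number
// branch, in the spirit of the second patch.
std::vector<std::wstring> tokenize(const std::wstring& buffer)
{
    std::vector<std::wstring> tokens;
    std::size_t pos = 0;
    while (pos < buffer.size())
    {
        const std::size_t start = pos;
        const wchar_t c = buffer[pos];
        if (c == 178 || c == 179)
        {
            ++pos; // treated as a normal single character
        }
        else if (digitLike(c))
        {
            while (pos < buffer.size() && inNumberCharList(buffer[pos]))
                ++pos;
        }
        else
        {
            ++pos; // any other character: one token per character here
        }
        assert(pos > start); // every branch has to make progress
        tokens.push_back(buffer.substr(start, pos - start));
    }
    return tokens;
}
```

Because every branch advances the position, the loop terminates for any input, including one whose first character is 178.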

Offline Biplab

« Reply #14 on: June 30, 2008, 05:16:52 pm »
Quote from: Jenna
It happens because wxIsdigit() treats "²" and "³" as digits on XP.

This is most likely a bug in the MS runtime. I compiled the following code with GCC, Borland C++, MSVC 8, Digital Mars 8.5 and OpenWatcom 1.8.

Code
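#include <cwctype>   // std::iswdigit
#include <iostream>

int main()
{
    // Scan code points 128..254 and report every one that the
    // C runtime classifies as a digit.
    for (int i = 128; i < 255; ++i)
    {
        if (std::iswdigit(static_cast<std::wint_t>(i)))
            std::cout << i << " is a digit" << std::endl;
    }
    return 0;
}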

Only GCC and MSVC 8 produce the following output.
Quote
178 is a digit
179 is a digit
185 is a digit

Process returned 0 (0x0)   execution time : 0.125 s
Press any key to continue.

So you can see that there is another character, 185 ("¹"), which can cause this bug. :)
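
One possible way to sidestep the runtime differences, which is neither of the patches posted above, would be to accept only the ten ASCII digits. A minimal sketch, assuming hypothetical helpers isAsciiDigit() and nonAsciiDigits() (neither exists in C::B or wxWidgets):

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: accepts only '0'..'9', independent of locale and
// of the C runtime in use.
bool isAsciiDigit(wchar_t c)
{
    return c >= L'0' && c <= L'9';
}

// Collects every code point in [first, last] that `classify` reports as a
// digit although it is not an ASCII digit.
template <typename Pred>
std::vector<int> nonAsciiDigits(int first, int last, Pred classify)
{
    assert(first <= last);
    std::vector<int> hits;
    for (int i = first; i <= last; ++i)
    {
        const wchar_t c = static_cast<wchar_t>(i);
        if (classify(c) && !isAsciiDigit(c))
            hits.push_back(i);
    }
    return hits;
}
```

Passing a small wrapper around std::iswdigit (from <cwctype>) as `classify` would list the extra "digits" of a given runtime; what that list contains depends on the compiler and C library, as the results above show.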