Author Topic: Wrong file encode detected.  (Read 10050 times)

Offline edison

  • Multiple posting newcomer
  • *
  • Posts: 53
Wrong file encode detected.
« on: October 15, 2014, 04:58:56 am »
This bug has existed for a long time.
Test code:
#include <stdio.h>

int main(void)
{
    printf("Hello World! 测试");

    return 0;
}

[attachment deleted by admin]

Offline stahta01

  • Lives here!
  • ****
  • Posts: 6677
    • My Best Post
Re: Wrong file encode detected.
« Reply #1 on: October 15, 2014, 05:36:42 am »
I suggest posting a link to the file or attaching the file.

Also state the correct encoding and the wrong encoding value detected.

NOTE: If this is a program run-time issue search for the solution because it is NOT a CB issue.

It is posted somewhere on this board.

Tim S.
C Programmer working to learn more about C++ and Git.
On Windows 7 64 bit and Windows 10 32 bit.
On Debian Stretch, compiling CB Trunk against wxWidgets 3.0.
--
When in doubt, read the CB WiKi FAQ. http://wiki.codeblocks.org

Offline edison

  • Multiple posting newcomer
  • *
  • Posts: 53
Re: Wrong file encode detected.
« Reply #2 on: October 15, 2014, 05:48:50 am »
Quote from: stahta01
I suggest posting a link to the file or attaching the file.
Also state the correct encoding and the wrong encoding value detected.
NOTE: If this is a program run-time issue search for the solution because it is NOT a CB issue.
It is posted somewhere on this board.
Tim S.

?
I have uploaded a screenshot that shows Notepad++ and CB with the same file open. Notepad++ displays it correctly.
Bypassing the encoding detection is not a good solution.
« Last Edit: October 15, 2014, 08:33:19 am by edison »

Offline MortenMacFly

  • Administrator
  • Lives here!
  • *****
  • Posts: 9508
Re: Wrong file encode detected.
« Reply #3 on: October 15, 2014, 08:52:39 am »
Quote from: edison
This bug has existed for a long time.

Sorry, but I can't reproduce this. I created a new file "main.c", copied and pasted your code snippet into it, and it looks exactly as it does in the forum and in notepad...?!
My settings are:
- Encoding: Windows 1252
- Use this encoding "as fallback"
- Try to detect...: OFF
- If conversion fails... : ON

However, are you sure you've saved your file in a proper file format like UTF-8?
Compiler logging: Settings->Compiler & Debugger->tab "Other"->Compiler logging="Full command line"
C::B Manual: http://www.codeblocks.org/docs/main_codeblocks_en.html
C::B FAQ: http://wiki.codeblocks.org/index.php?title=FAQ

Offline edison

  • Multiple posting newcomer
  • *
  • Posts: 53
Re: Wrong file encode detected.
« Reply #4 on: October 15, 2014, 11:13:02 am »
I have created a video to demonstrate this issue:


CB was run with default settings.

You can reproduce this problem by adding a language in the Windows Control Panel; here it is Simplified Chinese (the code page should be Windows-936, GBK, or cp936).
« Last Edit: October 15, 2014, 11:15:34 am by edison »

Offline MortenMacFly

  • Administrator
  • Lives here!
  • *****
  • Posts: 9508
Re: Wrong file encode detected.
« Reply #5 on: October 16, 2014, 08:33:49 am »
Quote from: edison
I have created a video to demonstrate this issue:

I've seen this video. I am asking again:
Quote from: MortenMacFly
However, are you sure you've saved your file in a proper file format like UTF-8?

From your video it seems not. It is also strange that you are not being warned about that issue. Usually C::B does so.

Offline edison

  • Multiple posting newcomer
  • *
  • Posts: 53
Re: Wrong file encode detected.
« Reply #6 on: October 17, 2014, 06:27:23 am »
Quote from: MortenMacFly
From your video it seems not. It is also strange that you are not being warned about that issue. Usually C::B does so.

I have uploaded another video, which shows that CB cannot correctly detect a UTF-8 file that it saved itself:

Offline MortenMacFly

  • Administrator
  • Lives here!
  • *****
  • Posts: 9508
Re: Wrong file encode detected.
« Reply #7 on: October 17, 2014, 07:45:26 am »
Quote from: edison
I have uploaded another video, which shows that CB cannot correctly detect a UTF-8 file that it saved itself:

Well, what happens is perfectly OK. Since you created a UTF-8 file without a BOM and have set up windows-936 as the default encoding, that encoding is used when the file is opened. There is no way to distinguish between UTF-8 and windows-936 when the file contains only ANSI characters (that is, plain 7-bit ASCII).

So either use UTF-8 with a BOM or just start typing your Korean (or whatever) text into the file. :)
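
To illustrate the point about ambiguity: bytes in the range 0x00-0x7F mean the same characters in UTF-8 and in windows-936, so a file made up only of those bytes is valid in both and carries no hint about which one was intended. The following is a minimal sketch of that check. The function name is_ascii_only is made up for this example and is not part of C::B.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not C::B code.
 * Returns 1 if every byte is 7-bit ASCII (0x00-0x7F), 0 otherwise.
 * When this returns 1, the bytes decode to the same text in UTF-8
 * and in windows-936, so no detector can tell the two apart. */
int is_ascii_only(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; ++i) {
        if (buf[i] >= 0x80)
            return 0; /* a high byte: the encodings may now differ */
    }
    return 1;
}
```

Calling it on the bytes of "Hello World!" gives 1 (ambiguous), while the same text followed by any non-ASCII character gives 0.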

Offline edison

  • Multiple posting newcomer
  • *
  • Posts: 53
Re: Wrong file encode detected.
« Reply #8 on: October 17, 2014, 08:28:17 am »
Quote from: MortenMacFly
Quote from: edison
I have uploaded another video, which shows that CB cannot correctly detect a UTF-8 file that it saved itself:

Well, what happens is perfectly OK. Since you created a UTF-8 file without a BOM and have set up windows-936 as the default encoding, that encoding is used when the file is opened. There is no way to distinguish between UTF-8 and windows-936 when the file contains only ANSI characters.

So either use UTF-8 with a BOM or just start typing your Korean (or whatever) text into the file. :)

But why, if I save the file with the default encoding (windows-936), does CB detect it as a different encoding? Is that normal? Why don't other editors (for example Notepad++) have this problem?

Offline MortenMacFly

  • Administrator
  • Lives here!
  • *****
  • Posts: 9508
Re:
« Reply #9 on: October 21, 2014, 10:48:19 pm »
Because with the content you have in the file, there are multiple options for a valid encoding. There is no single solution. Editors handle that differently. That's why I said to enter some characters that make it easier for the detection engine to identify your language. We are using the same mechanism Mozilla uses, btw...
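
As a rough illustration of why non-ASCII characters help a detector, here is a hand-written UTF-8 well-formedness check. It is a sketch only: it is not the Mozilla-derived detector mentioned above, and the name looks_like_utf8 is invented for this example. The byte values used in the usage note are the UTF-8 and GBK encodings of the two Chinese characters from the test code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not C::B or Mozilla code.
 * Returns 1 if buf[0..len) is well-formed UTF-8, 0 otherwise.
 * Rejects stray continuation bytes, truncated sequences, overlong
 * forms, surrogates, and code points above U+10FFFF. */
int looks_like_utf8(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    size_t i = 0;
    while (i < len) {
        unsigned char b = buf[i];
        size_t extra;          /* number of continuation bytes expected */
        unsigned long cp, min; /* decoded code point and smallest legal value */

        if (b < 0x80) { i++; continue; }
        else if ((b & 0xE0) == 0xC0) { extra = 1; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; min = 0x10000; }
        else return 0;         /* continuation byte or invalid lead byte */

        if (len - i <= extra)
            return 0;          /* sequence runs past the end of the buffer */
        for (size_t k = 1; k <= extra; ++k) {
            unsigned char c = buf[i + k];
            if ((c & 0xC0) != 0x80)
                return 0;      /* not a continuation byte */
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;
        i += extra + 1;
    }
    return 1;
}
```

The UTF-8 bytes E6 B5 8B E8 AF 95 pass this check, while the GBK bytes B2 E2 CA D4 for the same two characters fail it, which gives a detector something to work with. Plain ASCII passes as well, which is exactly the ambiguous case. Note that some GBK byte sequences can happen to be valid UTF-8 too, so real detectors remain heuristic.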

Offline MortenMacFly

  • Administrator
  • Lives here!
  • *****
  • Posts: 9508
Re:
« Reply #10 on: October 21, 2014, 10:50:15 pm »
...not to forget that another perfect solution is to use a file with a BOM, if the target compiler supports this.

Offline edison

  • Multiple posting newcomer
  • *
  • Posts: 53
Re:
« Reply #11 on: October 22, 2014, 04:58:56 am »
Quote from: MortenMacFly
...not to forget that another perfect solution is to use a file with a BOM, if the target compiler supports this.

But I encountered a problem when using UTF-8 with a BOM:
There are some unreadable characters at the start of the first line (for example, the first line should be #include xxxx, but with UTF-8 with BOM it shows up as ("??")#include xxxx in the CB editor).
« Last Edit: October 22, 2014, 05:10:49 am by edison »
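
For reference, the UTF-8 BOM is the three bytes EF BB BF at the very start of the file. Stray characters in front of #include are what one would see if those bytes were shown as text instead of being consumed, although the thread does not confirm that this is the cause here. A minimal sketch of detecting and skipping the BOM follows; the name utf8_bom_length is invented for this example and is not a C::B function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not C::B code.
 * Returns 3 if the buffer starts with the UTF-8 BOM (EF BB BF),
 * otherwise 0. The result is the offset of the first real character. */
size_t utf8_bom_length(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return 3;
    return 0;
}
```

A reader that honours the BOM would start decoding at buf + utf8_bom_length(buf, len) and treat the rest as UTF-8.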

Offline MortenMacFly

  • Administrator
  • Lives here!
  • *****
  • Posts: 9508
Re: Wrong file encode detected.
« Reply #12 on: October 22, 2014, 07:34:52 am »
I don't know what exactly you are doing wrong, but it works perfectly here:

Steps:
- Create a new file
- Enable the use of a BOM
- Save as UTF-8
- Close the file
- Re-open the file
-> Result: UTF-8, no matter whether I added ANSI or Unicode characters from your example.