Author Topic: question about UTF-8 in CB  (Read 3850 times)

Offline zealkane

  • Multiple posting newcomer
  • Posts: 29
question about UTF-8 in CB
« on: June 04, 2006, 02:11:24 pm »
I edited my source files as UTF-8 in Notepad++, but compiling them in CB gives the following errors:
Code:
main.cpp:1: error: stray '\239' in program
main.cpp:1: error: stray '\187' in program
main.cpp:1: error: stray '\191' in program
.\TagCode.h:1: error: stray '\239' in program
.\TagCode.h:1: error: stray '\187' in program
.\TagCode.h:1: error: stray '\191' in program
:: === Build finished: 6 errors, 0 warnings ===
I set the default encoding to UTF-8 in CB, and I noticed that what CB calls UTF-8 shows up as "ANSI as UTF-8" in Notepad++.
My question is: which encoding should I use in CB if I want to edit source files in the same UTF-8 format that Notepad++ uses?
Zeal without knowledge is fire without light.

Offline zealkane

  • Multiple posting newcomer
  • Posts: 29
Re: question about UTF-8 in CB
« Reply #1 on: June 04, 2006, 05:29:33 pm »
I want to know what the difference is between UTF-8 and UTF-8 (without BOM), and which kind of UTF-8 encoding CB uses.
Can somebody tell me?

Offline kagerato

  • Multiple posting newcomer
  • Posts: 56
    • kagerato.net
Re: question about UTF-8 in CB
« Reply #2 on: June 04, 2006, 08:00:12 pm »
UTF-8 text that contains no byte-order mark (BOM) and no non-ASCII characters is byte-for-byte identical to ASCII.  (ASCII, the American Standard Code for Information Interchange, is a 7-bit encoding that was later extended with additional characters to form the various ANSI encodings.)  There is no way to distinguish the two, and UTF-8 was intentionally designed that way.
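
For illustration, here is a minimal Python sketch of that point. The UTF-8 BOM is the three bytes EF BB BF, which in decimal are 239, 187 and 191, the same values reported as stray characters in the first post; that suggests the compiler is tripping over a BOM at the start of each file. The `source` string and the `strip_utf8_bom` helper are hypothetical names made up for this example, not part of CB or Notepad++.

```python
import codecs

# The UTF-8 byte-order mark is the three bytes EF BB BF.
bom = codecs.BOM_UTF8
print(list(bom))  # [239, 187, 191]

# Hypothetical ASCII-only file contents, used only for this example.
source = "int main() { return 0; }\n"

# Without a BOM, ASCII-only text is byte-for-byte the same in UTF-8 and ASCII.
assert source.encode("utf-8") == source.encode("ascii")

# Python's "utf-8-sig" codec writes the BOM first, then the same bytes.
with_bom = source.encode("utf-8-sig")
assert with_bom == bom + source.encode("ascii")


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Stripping the BOM recovers the plain ASCII bytes.
assert strip_utf8_bom(with_bom) == source.encode("ascii")
```

Under that reading, "UTF-8 (without BOM)" is simply the same text saved without those three leading bytes.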

You may want to consider simply using UTF-16 as your encoding if you want the text to be recognized the same way by all programs that support both ANSI and Unicode encodings.  The larger file size tends not to cost much on filesystems with 4 KiB blocks, because very small files occupied a whole block anyway and very large files fill their blocks more efficiently.  The only significant concern with UTF-16 is that old programs (especially some that ran only on Win9x) cannot read it.
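
As a rough sketch of the file-size point, the Python snippet below compares the same ASCII-only text in UTF-8 and UTF-16 and counts how many 4 KiB blocks each would need. The `source` string and the `blocks_needed` helper are hypothetical, and the snippet only covers the small-file case; it says nothing about whether a given compiler accepts UTF-16 input.

```python
import codecs

# Hypothetical ASCII-only file contents, used only for this example.
source = "int main() { return 0; }\n"

utf8 = source.encode("utf-8")
# Python's "utf-16" codec prepends a 2-byte BOM in the machine's byte order.
utf16 = source.encode("utf-16")

# ASCII characters take 1 byte each in UTF-8 and 2 bytes each in UTF-16.
assert len(utf8) == len(source)
assert len(utf16) == 2 + 2 * len(source)
assert utf16[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)


def blocks_needed(size: int, block_size: int = 4096) -> int:
    """Number of whole blocks needed to hold `size` bytes (ceiling division)."""
    return -(-size // block_size)


# A small file occupies a single 4 KiB block in either encoding.
assert blocks_needed(len(utf8)) == 1
assert blocks_needed(len(utf16)) == 1
```

Because UTF-16 files normally start with a BOM, programs that understand Unicode can tell them apart from ANSI text right away.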