Author Topic: Cyrillic identifiers  (Read 11787 times)

Offline petko10

  • Single posting newcomer
  • *
  • Posts: 6
Cyrillic identifiers
« on: October 07, 2011, 05:55:22 pm »
Hey, I'm a Bulgarian programmer and I'd like to use my own language here and there when programming, but that often doesn't look good in the Latin alphabet. Does anyone know how to configure C::B and GNU GCC to work with Unicode (or anything else that would solve my problem)?

Offline oBFusCATed

  • Developer
  • Lives here!
  • *****
  • Posts: 13413
Re: Cyrillic identifiers
« Reply #1 on: October 07, 2011, 06:08:17 pm »
It is impossible. C/C++ is meant to work only with ASCII.
If you want to write code in Cyrillic, you'll have to switch to Java or something modern.
Keep in mind that non-ASCII characters are not recommended even in comments,
because different OSes often use different default encodings.
(most of the time I ignore long posts)
[strangers don't send me private messages, I'll ignore them; post a topic in the forum, but first read the rules!]

Offline thomas

  • Administrator
  • Lives here!
  • *****
  • Posts: 3979
Re: Cyrillic identifiers
« Reply #2 on: October 07, 2011, 07:08:56 pm »
Well, both Code::Blocks and GCC do support UTF-8 as it happens. For GCC, that is even the default encoding. So, all you really need to do is to make UTF-8 the default encoding in the Editor settings ("Other settings" tab).

Which, at the risk of repeating myself, should be the default setting for everyone: if you use any other encoding, it's no surprise when GCC chokes on the file (Code::Blocks does not pass -finput-charset to the compiler).
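To make the UTF-8 point concrete, here is a minimal sketch. It assumes the file is saved as UTF-8 and that GCC's default input and execution character sets are in effect; the names `count_code_points` and `greeting` are made up for illustration. Cyrillic in comments and string literals passes through untouched, and each Cyrillic letter simply occupies two bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a UTF-8 string by
// skipping continuation bytes (those of the form 10xxxxxx).
std::size_t count_code_points(const std::string& utf8) {
    std::size_t n = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}

// Cyrillic in a comment (Здравей) and in a string literal is fine
// as long as the source file really is UTF-8.
const std::string greeting = "Здравей";
```

Under those assumptions, `greeting.size()` is 14 (bytes) while `count_code_points(greeting)` is 7 (letters).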
"We should forget about small efficiencies, say about 97% of the time: Premature quotation is the root of public humiliation."

Offline petko10

  • Single posting newcomer
  • *
  • Posts: 6
Re: Cyrillic identifiers
« Reply #3 on: October 08, 2011, 05:37:48 pm »
Well, yes, the encoding is set to UTF-8 in the editor settings and there is no problem with comments in Cyrillic, but the compiler outputs a biiig sequence of errors and is apparently not happy when I use non-ASCII letters for identifiers. I also chose "use as default (bypass C::B auto-detection)", but no luck.

Other ideas ?

Offline MortenMacFly

  • Administrator
  • Lives here!
  • *****
  • Posts: 9694
Re: Cyrillic identifiers
« Reply #4 on: October 08, 2011, 06:26:56 pm »
Quote from: petko10
I use non-ASCII letters for identifiers.

What do you mean? Something like:
Code
äöü = 42;
is rejected by the C / C++ compiler, not by the IDE. That's what oBFusCATed meant when saying:
Quote from: oBFusCATed
It is impossible. C/C++ is meant to work only with ASCII.
Compiler logging: Settings->Compiler & Debugger->tab "Other"->Compiler logging="Full command line"
C::B Manual: https://www.codeblocks.org/docs/main_codeblocks_en.html
C::B FAQ: https://wiki.codeblocks.org/index.php?title=FAQ

Offline Jenna

  • Administrator
  • Lives here!
  • *****
  • Posts: 7255
Re: Cyrillic identifiers
« Reply #5 on: October 08, 2011, 07:09:44 pm »
According to the documentation it should work:
Quote from: http://gcc.gnu.org/onlinedocs/cpp/Implementation_002ddefined-behavior.html#Identifier%20characters
Identifier characters.

The C and C++ standards allow identifiers to be composed of `_' and the alphanumeric characters. C++ and C99 also allow universal character names, and C99 further permits implementation-defined characters. GCC currently only permits universal character names if -fextended-identifiers is used, because the implementation of universal character names in identifiers is experimental.

The UCS and Unicode character sets are almost the same (according to Wikipedia).
German umlauts do not work for me either, so either the documentation is incorrect or the overlap between UCS and UTF is not as large as I thought.
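On UCS versus UTF: UCS/Unicode assigns a number (a code point) to each character, while UTF-8 is one way of writing those numbers as bytes, so the two are not character sets that merely overlap. A rough sketch of the mapping follows; `encode_utf8` is a hypothetical helper written for illustration, not something GCC or Code::Blocks provides.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encodes one Unicode (UCS) code point as UTF-8 bytes.
// Surrogate code points (U+D800..U+DFFF) are not rejected in this sketch.
std::string encode_utf8(char32_t cp) {
    assert(cp <= 0x10FFFF);  // anything larger is not a valid code point
    std::string out;
    if (cp < 0x80) {                       // 1 byte: plain ASCII
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: umlauts, Cyrillic, ...
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: rest of the BMP
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes: beyond the BMP
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For example, the umlaut ä is code point U+00E4 and comes out as the two bytes C3 A4. A universal character name such as \u00E4 refers to the code point, not to those bytes.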

Offline oBFusCATed

  • Developer
  • Lives here!
  • *****
  • Posts: 13413
Re: Cyrillic identifiers
« Reply #6 on: October 08, 2011, 07:20:38 pm »
I don't understand why they even bother to implement it...

Offline petko10

  • Single posting newcomer
  • *
  • Posts: 6
Re: Cyrillic identifiers
« Reply #7 on: October 09, 2011, 06:42:40 pm »
Quote from: Jenna
According to the documentation it should work:
[...]
GCC currently only permits universal character names if -fextended-identifiers is used, because the implementation of universal character names in identifiers is experimental.

So, as I understand it, you have to add "-fextended-identifiers" to the compilation line. Have you tried it?

Quote from: oBFusCATed
I don't understand why they even bother to implement it...

Well, as you may know, there are some people who don't speak English natively, so it would be handy to have the option to write in your own alphabet  :roll:

 So here's what I understood:
 --C::B has no problem with UTF-8
 --GCC has no problem with UTF-8 in comments, but
 --by default GCC accepts only ASCII in C++ identifiers, so it won't compile with non-ASCII in the code
 --Still, there is an experimental GCC option to allow universal character names in identifiers

Now, where do I try this option out? I don't have experience with configuring the compiler.

Offline oBFusCATed

  • Developer
  • Lives here!
  • *****
  • Posts: 13413
Re: Cyrillic identifiers
« Reply #8 on: October 09, 2011, 08:37:24 pm »
Quote from: petko10
So as I understand you have to add "-fextended-identifiers" to the compilation line, have you tried it?
...
Now where do I try this option out, because I don't have experience with configuring the compiler?

Project -> Build options -> Compiler -> Other options

You can put it in the global compiler options too, but most of the time that is not good practice.

Quote from: petko10
Well as you may know there are some people who don't speak English natively, so it would be handy to have the option to write with your own alphabet  :roll:

Me too, but I don't see the benefit of writing programs in anything other than English.

Offline petko10

  • Single posting newcomer
  • *
  • Posts: 6
Re: Cyrillic identifiers
« Reply #9 on: October 10, 2011, 01:25:39 pm »
Quote from: oBFusCATed
Me too, but I don't see the benefit of writing programs in anything different than English.

I tried to give a hint, but you kind of missed it  :) Say you're learning a second language with a different alphabet (e.g. Russian), and it so happens that C++ and all the other good programming languages are in Russian. Now, you CAN write everything in Russian, but you'd rather name all your variables/functions in English  :arrow:

Tried -fextended-identifiers, but again no luck, no difference whatsoever. I'll dig through some Russian forums to see if they've solved it   :idea:

Offline oBFusCATed

  • Developer
  • Lives here!
  • *****
  • Posts: 13413
Re: Cyrillic identifiers
« Reply #10 on: October 10, 2011, 01:42:11 pm »
Hm, I don't really understand your example.
 
My opinion is based on:
1. All documentation is written first in English (maybe some is first written in Chinese  :lol:), then translated into something else, if ever.
2. 99% of the source code in the wild is written in English.
3. 99% of the APIs are written in English.

Using non-English is useful if you teach C/C++ to the very young, who don't know English yet. But I guess no one is teaching C/C++ to 6-10 year old kids :)

Have you read this: http://gcc.gnu.org/wiki/FAQ#utf8_identifiers ?
It looks like you have to use \uNNNN syntax, very pretty  8)
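For reference, the \uNNNN form would look roughly like the sketch below. It assumes a compiler that accepts universal character names in identifiers (for GCC at the time of this thread, that means passing -fextended-identifiers). The variable and function names are made up for illustration.

```cpp
#include <cassert>

// \u043F\u0440\u043E\u043C2 spells the identifier "пром2" one code point
// at a time: п = U+043F, р = U+0440, о = U+043E, м = U+043C, then "2".
float \u043F\u0440\u043E\u043C2 = 2.5f;

// Hypothetical function named "удвои" ("double it"), also spelled with
// universal character names: у, д, в, о, и.
float \u0443\u0434\u0432\u043E\u0438(float x) {
    return 2.0f * x;
}
```

It compiles, but every Cyrillic letter costs six characters of source, so the readability gain is lost.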

Offline petko10

  • Single posting newcomer
  • *
  • Posts: 6
Re: Cyrillic identifiers
« Reply #11 on: October 10, 2011, 02:19:57 pm »
On the side, continuing with the example   :lol: (btw, have you learned a language other than English?):

Code
float променлива1,пром2,ПрозорецШирочина,ПрозорецВисочина; //variable1,var2,WindowWidth,WindowHeight

feels sooooo much better than

Quote
float promenliva1,prom2,ProzorecShirochina,ProzorecVisochina;

So yes, all the documentation is in English, but my programs generally aren't (at least a lot of the identifiers).

About the link: yes, I found that earlier, but that method seems just... "lovely".

I didn't find any Russian solutions either, so that's that. I guess I'll stick with the Latin alphabet until GCC finishes the option or until I write some freak of a preprocessor program (but that seems like too much effort...).
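The "preprocessor program" idea could be sketched roughly as follows: read the UTF-8 source and rewrite every non-ASCII character as a universal character name before handing the file to the compiler. The function name `to_ucn` is hypothetical, and the sketch assumes well-formed UTF-8 input. It does not distinguish identifiers from comments or string literals, so raw string literals in particular would need special handling.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: rewrites each non-ASCII character of UTF-8 text as
// \uXXXX (or \UXXXXXXXX above U+FFFF). ASCII passes through unchanged.
// Malformed or truncated UTF-8 is not diagnosed in this sketch.
std::string to_ucn(const std::string& utf8) {
    std::string out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        if (lead < 0x80) {          // ASCII: copy as-is
            out += utf8[i++];
            continue;
        }
        // The lead byte gives the number of continuation bytes that follow.
        const int extra = (lead >= 0xF0) ? 3 : (lead >= 0xE0) ? 2 : 1;
        std::uint32_t cp = lead & (0x3F >> extra);  // payload bits of lead
        ++i;
        for (int k = 0; k < extra && i < utf8.size(); ++k, ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i]) & 0x3F);
        }
        char buf[11];               // "\U" + 8 hex digits + terminator
        if (cp <= 0xFFFF) {
            std::snprintf(buf, sizeof buf, "\\u%04X", static_cast<unsigned>(cp));
        } else {
            std::snprintf(buf, sizeof buf, "\\U%08X", static_cast<unsigned>(cp));
        }
        out += buf;
    }
    return out;
}
```

Applied to a line such as `float пром2;`, this yields `float \u043F\u0440\u043E\u043C2;`, which is the form the GCC FAQ describes.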

Offline oBFusCATed

  • Developer
  • Lives here!
  • *****
  • Posts: 13413
Re: Cyrillic identifiers
« Reply #12 on: October 10, 2011, 02:30:49 pm »
Code
Всъщност изглежда ужастно, не съм сляп-патриот :) 
Translation: In fact it looks awful; I'm not a blind patriot.

p.s. Admins, sorry for the non-English content.

Offline petko10

  • Single posting newcomer
  • *
  • Posts: 6
Re: Cyrillic identifiers
« Reply #13 on: October 10, 2011, 03:03:32 pm »
Quote from: oBFusCATed
Всъщност изглежда ужастно, не съм сляп-патриот :)
Translation: In fact it looks awful; I'm not a blind patriot.

So are you a programmer, or do you just write here for kicks? Not that you need to be; it's just that I thought you'd be able to follow my train of thought there for a sec.