Author Topic: Unicode conversion (attention all devs)  (Read 85924 times)

Offline rickg22

  • Lives here!
  • ****
  • Posts: 2283
OK GUYS, SDK FINISHED!!
« Reply #60 on: August 07, 2005, 12:22:13 am »
We've finished converting the SDK and SRC to Unicode!
And I converted the "small" plugins to unicode, too!

Now it's time for the big ones: Compiler, Debugger, CodeCompletion.

(See http://forums.codeblocks.org/index.php?topic=687.msg4442#msg4442 for more details).
« Last Edit: August 07, 2005, 07:02:11 am by rickg22 »

takeshimiya

  • Guest
Re: UNICODE: ATTENTION ALL DEVS
« Reply #61 on: August 07, 2005, 02:04:22 pm »
Just a note about the outdated wxString documentation in wxWidgets; at the beginning it warns:
"Also please note that in this manual char is sometimes used instead of wxChar because it hasn't been fully updated yet. Please substitute as necessary and refer to the sources in case of a doubt."

The latest wxWidgets documentation updated from CVS is always here (tip: bookmark it  :wink:):
http://www.lpthe.jussieu.fr/~zeitlin/wxWindows/docs/

Another fact: the documentation is sometimes outdated because it ISN'T autogenerated by Doxygen or any other autodocumentation system; it's written in LaTeX.
There was some discussion on the wx mailing list about porting it to Doxygen, though. So that is why the documentation is always outdated in some places.


takeshimiya

  • Guest
Re: UNICODE: ATTENTION ALL DEVS
« Reply #62 on: August 07, 2005, 02:08:40 pm »
A question about CVS (I know little to nothing of CVS):

All of these changes will go into a new VERSION_1 CVS branch. But when the whole _T() conversion is finished,
will it be easy to merge into the latest CVS HEAD branch? Or will all of this work have to be redone??
I'm hoping that CVS merge is smart :D

Offline tiwag

  • Developer
  • Lives here!
  • *****
  • Posts: 1196
  • sailing away ...
    • tiwag.cb
Re: UNICODE: ATTENTION ALL DEVS
« Reply #63 on: August 07, 2005, 02:49:38 pm »
it will be easily merged in the latest CVS-HEAD branch? Or all of this work will have to be rewritten??
I'm hoping that CVS-merge is smart :D

CVS is smart, at least as long as the same lines of code haven't also been changed in HEAD; if they have, it asks for user intervention to resolve the conflict.

Offline rickg22

Re: UNICODE: ATTENTION ALL DEVS
« Reply #64 on: August 07, 2005, 04:32:34 pm »
CVS is smart - at least when the same lines of code haven't already been changed in HEAD too - then it asks for user intervention to solve the conflict.

Yeah, like making you edit manually all the lines in question :lol:

My proposed solution is this:

 Branch: HEAD
--------------------------------------------------------- HEAD
           |
           |
           | Branch: VERSION_1_0
           | ------ SNAPSHOT ------------------- UNICODE


I added a SNAPSHOT_DDDDDD tag (and CVS had better not have messed with it! :shock: ) for all the changes before the UNICODE conversion. All those files have a specific version.

The CVS HEAD branch hasn't seen many changes recently anyway, so its files probably have exactly the same version numbers as the branched ones (with one or two exceptions maybe).

If they have the same version, copy the respective file from the latest VERSION_1_0 to HEAD.
Ta-da! :D

Or does CVS do this already? :?

Offline tiwag

Re: UNICODE: ATTENTION ALL DEVS
« Reply #65 on: August 07, 2005, 04:56:48 pm »
with cvs you can do it smarter, but in any case it's not that important for now.
just now i am compiling and linking with wx-dll-unicode libs
i'll report ...

Offline tiwag

Re: UNICODE: ATTENTION ALL DEVS
« Reply #66 on: August 07, 2005, 05:24:39 pm »
ok guys  8)
built successfully with wx MSW 2.4.2 unicode
    CodeBlocks dll & exe & tools
      tinyxml
      sdk
      src
      consolerunner

    plugins
      Astyle
      DebuggerGDB
      DefMimeHandler
      PluginsWizard

the most important ones
still missing are:
    CompilerGCC
    CodeCompletion

is somebody working on them ?  :shock:
if they're finished, please drop me an email with them
thanks

takeshimiya

  • Guest
Re: UNICODE: ATTENTION ALL DEVS
« Reply #67 on: August 07, 2005, 06:00:22 pm »
I guess there isn't anyone assigned to the codecompletion, but I guess it's too late for me (need to sleep):

http://forums.codeblocks.org/index.php?topic=687.msg4442#msg4442

PS: Today can be the day that Code::Blocks will see the (Unicode) light! :D
« Last Edit: August 07, 2005, 06:10:37 pm by takeshimiya »

Offline tiwag

Just say "I could compile code::blocks in Unicode!"
« Reply #68 on: August 07, 2005, 09:27:34 pm »
... Today can be the day that Code::Blocks will see the (Unicode) light! :D
you are a visionary !

a !(small) messenger log between rick_g22 and me
Quote
Rick says:
i send you the zip
tiwag says:
its OK, thanks
  You have successfully received "H:\Dokumente und Einstellungen\Tiwag\Eigene Dateien\MSN_Messenger\Inbox\compilergcc_unicode.zip"
  from Rick.
tiwag says:
now i'll try to compile
- just a few moments ...
tiwag says:
Project   : CodeBlocks VERSION_1_0 unicode, wx242
Compiler  : GNU GCC Compiler (called directly)
Directory : D:\cpp\_projects\Codeblocks\_VERSION_1_0_UNICODE_COMPILED\src\
--------------------------------------------------------------------------------
Switching to target: plugin_CompilerGCC
mingw32-g++.exe   -Wall -g -ggdb -pipe -mthreads -fno-pcc-struct-
...-o devel\share\CodeBlocks\plugins\compilergcc.dll  -Wl,--enable-auto-image-base -Wl,--add-stdcall-alias   -lcodeblocks -lstc -lwxxrc -lwxmsw242u

Process terminated with status 0 (0 minutes, 51 seconds)
0 errors, 1 warnings
 
well done my friend !!   
Rick says:
 
Rick says:
oh no!
Rick says:
There's 1 warning!
Rick says:
 
Rick says:
(lol )
Rick says:
 
tiwag says:
nobody is perfect, not even YOU
Rick says:
hmmmm
Rick says:
"In soviet Russia, the system compiles YOU!"
Rick says:
nevermind
tiwag says:
 
tiwag sends:
 
  Transfer of "first_text_from_CBu.txt" is complete.
 
Rick says:
d with CodeB
Rick says:
was it supposed to end in "CodeB" ?
Rick says:

hello rickg !

this text i've just now edited with CodeB
tiwag says:
congratulations - you found the first bug !
Rick says:
 
tiwag says:
hey
tiwag says:
thats a big step for mankind !!
Rick says:
oh well
Rick says:
anyway
Rick says:
i think i know *WHO* is at fault with this 
tiwag says:
not even a crash when fired the first time !
Rick says:
cbeditor saves with non-unicode strings, right?
tiwag says:
that's really good news i would say, the rest will be doable too, i'm sure (not peanuts, but doable)
Rick says:
anyway
tiwag says:
NONONONONONO !!
Rick says:
try to load the file you've just saved
Rick says:
uh?
tiwag says:
it saved in UNICODE, the file i've sent you is in UNICODE and was saved by CBu (== CodeBlocksunicode)
Rick says:
oh - in UNICODE
Rick says:
I opened it in notepad
tiwag says:
look with a hexeditor - i did that, because i trust NOBODY
Rick says:
wait a minute
Rick says:
there should be an option to save files as UTF-8 or something, right?
Rick says:
anyway
Rick says:
if you also open it with codeblocks, does it display well?
tiwag says:
YES YES YES YES
tiwag says:
i cant believe it ! its OKOKOK
Rick says:
....h.e.l.l.o. .
r.i.c.k.g. .!...
......t.h.i.s. .
t.e.x.t. .i.'.v.
e. .j.u.s.t. .n.
o.w. .e.d.i.t.e.
d. .w.i.t.h. .C.
o.d.e.B.........
Rick says:
does the line end in "CodeB" ?
tiwag says:
that is bug no 0000000001
Rick says:
wait
Rick says:
  i got it!
tiwag says:
what
Rick says:
If the file is unicode when opened, save as unicode
Rick says:
otherwise, save as normal
tiwag says:
ok agree
tiwag says:
  but these are features !!! leave them for CVS HEAD,
  our job is to build a pure UNICODE version now
Rick says:
OH
Rick says:
ok i still have to modify the codecompletion
tiwag says:
i go to announce the message in the forums -
what's your opinion ?
tiwag says:
on
"CodeBlocks_VERSION_1_0_UNICODE has seen the light! "
Rick says:
no
Rick says:
not until we have codecompletion codecompleted 
tiwag says:
codecompletion will not work anyway - believe me
Rick says:
yeah but...
Rick says:
it's better not having to have another stage of modifications
tiwag says:
i dont understand you - i only wanted to post in the forum, that we have compiled and run CBu without crash the first time !
thats all
Rick says:
ok ok...
Rick says:
Just say "I could compile code::blocks in Unicode!"

"I could compile code::blocks in Unicode!"

what i've done now !

thanks to ALL who helped the last days !!!!!!!!!!
« Last Edit: August 07, 2005, 09:37:41 pm by rickg22 »

takeshimiya

  • Guest
Re: UNICODE: ATTENTION ALL DEVS
« Reply #69 on: August 07, 2005, 09:56:49 pm »
Almost CONGRATS!  :o
(almost until codecompletion is finished)

About the file Unicode thing: C::B must save files in ASCII mode by default (with options to save in UTF-8 or UTF-16LE/BE, as Notepad has).
Because most if not all compilers don't support Unicode, only plain old ASCII.

And when saving text to a file in Unicode, don't forget to write the BOM (Byte Order Mark) so that text editors know the file is encoded in Unicode.
http://www.websina.com/bugzero/kb/unicode-bom.html
Of course when reading a file also take care of the BOM.

That's it =D
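
A minimal sketch of the "write the BOM when saving" idea in plain C++ (no wx), in case it helps. The names SaveEncoding, bomFor and withBom are made up for illustration and are not part of C::B or wxWidgets, and the text is assumed to be already encoded in the target encoding; only the BOM byte values themselves are standard.

```cpp
#include <cassert>
#include <string>

// Hypothetical set of save options: ANSI/ASCII by default, Unicode on request.
enum class SaveEncoding { Ansi, Utf8, Utf16BE, Utf16LE };

// BOM bytes for each encoding; ASCII/ANSI files get none.
std::string bomFor(SaveEncoding enc)
{
    switch (enc) {
        case SaveEncoding::Utf8:    return std::string("\xEF\xBB\xBF", 3);
        case SaveEncoding::Utf16BE: return std::string("\xFE\xFF", 2);
        case SaveEncoding::Utf16LE: return std::string("\xFF\xFE", 2);
        case SaveEncoding::Ansi:    break;
    }
    return std::string();
}

// Prepend the BOM to text that is ALREADY encoded in `enc`.
// The result is what would be written to disk in binary mode.
std::string withBom(SaveEncoding enc, const std::string& encodedText)
{
    if (enc == SaveEncoding::Utf16BE || enc == SaveEncoding::Utf16LE) {
        // UTF-16 uses 2-byte code units, so the byte count must be even.
        assert(encodedText.size() % 2 == 0);
    }
    return bomFor(enc) + encodedText;
}
```

With this, withBom(SaveEncoding::Ansi, text) returns the text unchanged, so the default ASCII save stays byte-for-byte what the compiler expects.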

Offline rickg22

Re: UNICODE: ATTENTION ALL DEVS
« Reply #70 on: August 07, 2005, 11:12:33 pm »
Almost CONGRATS!  :o
(almost until codecompletion is finished)

Remove that "almost", I just committed codecompletion to CVS :-)

IT'S FINISHED, YAY! :D

Edit: Emm... oops, I forgot to do the todo plugin (ironically) :oops:

Edit: Thanks to elvstone for converting the todo plugin. WE'RE FINISHED!!!!
« Last Edit: August 08, 2005, 12:41:30 am by rickg22 »

Offline rickg22

Re: UNICODE: ATTENTION ALL DEVS
« Reply #71 on: August 07, 2005, 11:18:48 pm »

About the file Unicode thing: C::B must save files in ASCII mode by default (with options to save in UTF-8 or UTF-16LE/BE, as Notepad has).
Because most if not all compilers don't support Unicode, only plain old ASCII.

And when saving text to a file in Unicode, don't forget to write the BOM (Byte Order Mark) so that text editors know the file is encoded in Unicode.
http://www.websina.com/bugzero/kb/unicode-bom.html
Of course when reading a file also take care of the BOM.


Please enlighten us about it! I don't know how to do that :-(

Also, some plugins read the files too, and I don't think they're ready to handle the difference between ANSI and Unicode yet. But at least STAGE 1 has been completed. We'll keep developing and fixing bugs.

takeshimiya

  • Guest
Re: UNICODE: ATTENTION ALL DEVS
« Reply #72 on: August 07, 2005, 11:58:11 pm »
It's official, time to Celebrate!!
Congratulations to all for all the hard work. I'm sure that your dreams now contain some _() and _T() between the sheep :lol:

Here is more info:
http://www.microsoft.com/globaldev/getwr/steps/wrg_unicode.mspx

I've never tried it with wx, but I think checking for the BOM would be something like:

Code
When reading any text file:
// Look for the BOM in the raw bytes, before building a wxString from them.
// Use unsigned char: a plain (signed) char never compares equal to 0xEF,
// and no single char can hold 0xEFBB or 0xFEFF.
const unsigned char* buf; // the file's raw bytes, loaded entirely in memory
size_t len;               // number of bytes in buf

if(len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
    // it's in UTF-8, convert it with a wxMBConv (skip the 3 BOM bytes)

else if(len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
    // it's in UTF-16BE, convert it with a wxMBConv (skip the 2 BOM bytes)

else if(len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
    // it's in UTF-16LE, convert it with a wxMBConv (skip the 2 BOM bytes)

else
    // no BOM, assume ASCII:
    // in a Unicode build convert it with a wxMBConv, otherwise do nothing

I don't know if this works, or if it's the best way. There was some discussion on the wx mailing list about putting this functionality into the wxMBConv classes or the wxTextFile class, etc.
There were some patches for this; I don't know what their current state is.
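
For anyone who wants to try the BOM check outside of wx, here is a small self-contained sketch in standard C++. It only looks at the raw bytes and does no conversion; the names Bom, BomInfo, detectBom and stripBom are invented for this example and are not wx or C::B APIs. One known limitation: a UTF-32LE BOM (FF FE 00 00) starts with the same two bytes as UTF-16LE, so this sketch would report it as UTF-16LE.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Encodings the BOM check can tell apart. None means no BOM was found,
// which usually (not always) indicates plain ASCII/ANSI text.
enum class Bom { None, Utf8, Utf16BE, Utf16LE };

struct BomInfo {
    Bom kind;
    std::size_t length; // number of BOM bytes to skip before the text
};

// Inspect the first bytes of a file's raw contents.
BomInfo detectBom(const std::string& bytes)
{
    // Compare as unsigned char: plain char may be signed, in which case
    // bytes[0] == 0xEF would never be true.
    auto at = [&bytes](std::size_t i) {
        return static_cast<unsigned char>(bytes[i]);
    };

    if (bytes.size() >= 3 && at(0) == 0xEF && at(1) == 0xBB && at(2) == 0xBF)
        return {Bom::Utf8, 3};
    if (bytes.size() >= 2 && at(0) == 0xFE && at(1) == 0xFF)
        return {Bom::Utf16BE, 2};
    if (bytes.size() >= 2 && at(0) == 0xFF && at(1) == 0xFE)
        return {Bom::Utf16LE, 2};
    return {Bom::None, 0};
}

// Return the contents with the BOM (if any) removed.
std::string stripBom(const std::string& bytes)
{
    const BomInfo info = detectBom(bytes);
    assert(info.length <= bytes.size());
    return bytes.substr(info.length);
}
```

The idea would be to call detectBom on the bytes read from disk, pick the matching converter, and feed it stripBom's result; a file without a BOM comes back untouched.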

takeshimiya

  • Guest
Re: UNICODE: ATTENTION ALL DEVS
« Reply #73 on: August 08, 2005, 12:05:11 am »
But well, I hope that C::B compiles in wx2.6+UNICODE+LINUX without lots of segfaults :)

So now a lot of testing on Linux is required: testing on different distros with their own versions of wxGTK2 Unicode (especially Ubuntu, Debian, Mandriva, and Fedora, which seem to be the most widespread right now). Any type of testing on Linux is appreciated!

Please post the results in http://forums.codeblocks.org/index.php?topic=317

Offline rickg22

Re: UNICODE: ATTENTION ALL DEVS
« Reply #74 on: August 08, 2005, 12:43:38 am »
OK then! :)

Time to move on. Let's squash some bugs! :P