UTF8 Encoding conversion speedup (Linux)

Jenna:
I'm certified sick this week, but lying in bed without doing anything is too boring, so I played a little bit with encoding-detection and code-conversion.

I have adapted Mozilla's encoding detection for C::B.
The recognition seems to be much better.

After some other tweaking (among other things, using the idea behind dmoore's suggestion of not using wxCSConv where possible), I was able to speed up the loading of xmltest.cpp (blown up to 3.5 MB with multiple copies of its content) from about 31 seconds to about 2.5 seconds.
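To illustrate the general idea of bypassing a generic converter for the common UTF-8 case, here is a minimal sketch of a hand-rolled single-pass decoder. It is not the patch discussed in this thread and does not use any wxWidgets API; the function name `DecodeUtf8` and the choice of `std::u32string` as the output type are assumptions made for the example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from Code::Blocks or wxWidgets): decode a UTF-8
// byte buffer into UTF-32 code points in a single pass. Returns false on
// malformed input so that a caller could fall back to a generic converter.
bool DecodeUtf8(const std::string& in, std::u32string& out)
{
    out.clear();
    out.reserve(in.size()); // upper bound: at most one code point per byte

    const std::size_t n = in.size();
    std::size_t i = 0;
    while (i < n)
    {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        if (b0 < 0x80) // ASCII fast path
        {
            out.push_back(b0);
            ++i;
            continue;
        }

        std::size_t len;  // total bytes in this sequence
        char32_t cp;      // code point being assembled
        char32_t minCp;   // smallest value allowed for this length
        if ((b0 & 0xE0) == 0xC0)      { len = 2; cp = b0 & 0x1F; minCp = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; minCp = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; minCp = 0x10000; }
        else return false; // stray continuation byte or invalid lead byte

        if (n - i < len)
            return false; // sequence truncated at end of buffer

        for (std::size_t k = 1; k < len; ++k)
        {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80)
                return false; // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }

        // Reject overlong encodings, surrogates and out-of-range values.
        if (cp < minCp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;

        out.push_back(cp);
        i += len;
    }
    return true;
}
```

A caller would try this first and only hand the buffer to a general-purpose converter when it returns false, for example for a Latin-1 file containing a lone 0xE9 byte.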

I'm currently working on a patch that can be uploaded for others to test.

It needs some (much) code cleanup, but once it's ready I will put it on my server (it's too large for an attachment, I think, because of the encoding-detection code).

EDIT:
I just tested another very large file (this time UTF-8):

load time decreased from 82 seconds to less than 3!

Biplab:

--- Quote from: jens on February 26, 2009, 02:42:38 pm ---I'm certified sick this week, but lying in bed without doing anything is too boring, so I played a little bit with encoding-detection and code-conversion.

I have adapted Mozilla's encoding detection for C::B.
The recognition seems to be much better.

--- End quote ---

We (Morten and I) had previously proposed including this, but it was not accepted because encoding detection of all files in a large project may take a significant amount of time. This code is proven and is still one of the best encoding-detection routines available.
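One cheap way to keep detection cost down on large projects is to check for a byte-order mark before running any heavier detector. The sketch below shows only that pre-check; it is not Mozilla's detector and not code from C::B, and the names `BomKind` and `DetectBom` are made up for the example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical types and names, for illustration only.
enum class BomKind { None, Utf8, Utf16LE, Utf16BE, Utf32LE, Utf32BE };

// Inspect the first bytes of a buffer for a byte-order mark.
// On return, bomLen holds the number of BOM bytes to skip (0 if none).
BomKind DetectBom(const std::string& buf, std::size_t& bomLen)
{
    const std::size_t n = buf.size();
    auto at = [&buf](std::size_t i) {
        return static_cast<unsigned char>(buf[i]);
    };

    bomLen = 0;
    if (n >= 3 && at(0) == 0xEF && at(1) == 0xBB && at(2) == 0xBF)
    {
        bomLen = 3;
        return BomKind::Utf8;
    }
    // UTF-32 LE must be tested before UTF-16 LE, because its BOM
    // (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
    if (n >= 4 && at(0) == 0xFF && at(1) == 0xFE && at(2) == 0x00 && at(3) == 0x00)
    {
        bomLen = 4;
        return BomKind::Utf32LE;
    }
    if (n >= 4 && at(0) == 0x00 && at(1) == 0x00 && at(2) == 0xFE && at(3) == 0xFF)
    {
        bomLen = 4;
        return BomKind::Utf32BE;
    }
    if (n >= 2 && at(0) == 0xFF && at(1) == 0xFE)
    {
        bomLen = 2;
        return BomKind::Utf16LE;
    }
    if (n >= 2 && at(0) == 0xFE && at(1) == 0xFF)
    {
        bomLen = 2;
        return BomKind::Utf16BE;
    }
    return BomKind::None; // no BOM: a real detector would have to run here
}
```

Files without a BOM still need a proper detector, and a UTF-16 LE file whose first character is U+0000 would be misread as UTF-32 LE, so this only shortens the common case.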

dmoore:

--- Quote from: Biplab on February 26, 2009, 03:07:12 pm ---We (Morten and I) had previously proposed including this, but it was not accepted because encoding detection of all files in a large project may take a significant amount of time.

--- End quote ---

In my testing, it was the conversion, not the detection, that was taking a long time (maybe the balance shifts a bit on Windows platforms, where wxCSConv seems to do the right thing). Opening a project with ~10 large UTF-8 files could take a minute (albeit on a moderately specced PC). Forget about loading big log files...


--- Quote --- This code is proven and is still one of the best encoding-detection routines available.

--- End quote ---

Do you mean mozilla's or the one in our trunk?

MortenMacFly:

--- Quote from: dmoore on February 26, 2009, 03:16:02 pm ---Do you mean mozilla's or the one in our trunk?

--- End quote ---
Mozilla's (probably in Mozilla's trunk... ;-)).

MortenMacFly:

--- Quote from: jens on February 26, 2009, 02:42:38 pm ---I'm certified sick this week

--- End quote ---
I overlooked this one... I hope you get well soon. I was sick last week, but the doc gave me just 2 days for recovery. I'll go to another one next time.
