Author Topic: ThreadSearch can't open the utf8 files!  (Read 8876 times)

Offline nanyu

  • Almost regular
  • **
  • Posts: 188
  • nanyu
ThreadSearch can't open the utf8 files!
« on: August 28, 2008, 04:49:49 am »
1. Windows XP.
2. Code::Blocks SVN 5195 (maybe all versions).
3. Save a .cpp file as two copies: one encoded as ANSI, named a.cpp;
     the other encoded as UTF-8 (the file begins with the BOM bytes 0xEF 0xBB 0xBF), named a_utf8.cpp.
4. Check the ThreadSearch option "show error message if file cannot be opened."
5. Press the "search" button...
6. A message box pops up: "Fail to open c:\a_utf8.cpp"

Can anyone help me? Thanks!
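
To confirm which of the two copies actually carries a UTF-8 byte order mark, a small standalone check can help. This is only a sketch in plain C++, independent of Code::Blocks and wxWidgets; the function names `startsWithUtf8Bom` and `fileStartsWithUtf8Bom` are hypothetical and chosen for illustration.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Returns true if the byte buffer starts with the UTF-8 byte order mark EF BB BF.
bool startsWithUtf8Bom(const std::string& bytes)
{
    return bytes.size() >= 3 &&
           static_cast<unsigned char>(bytes[0]) == 0xEF &&
           static_cast<unsigned char>(bytes[1]) == 0xBB &&
           static_cast<unsigned char>(bytes[2]) == 0xBF;
}

// Reads at most the first three bytes of the file and checks them for the BOM.
// Returns false if the file cannot be opened.
bool fileStartsWithUtf8Bom(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in)
        return false;
    char head[3] = {0, 0, 0};
    in.read(head, 3);
    // gcount() is the number of bytes actually read (fewer than 3 for tiny files).
    return startsWithUtf8Bom(std::string(head, static_cast<std::size_t>(in.gcount())));
}
```

Calling `fileStartsWithUtf8Bom("c:\\a_utf8.cpp")` on the file from step 3 would tell whether the BOM is really present. A UTF-8 file without a BOM is still valid UTF-8, so a `false` result does not mean the file is ANSI.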

Offline dje

  • Lives here!
  • ****
  • Posts: 683
Re: ThreadSearch can't open the utf8 files!
« Reply #1 on: August 31, 2008, 10:21:22 am »
Hi!

The implementation used to open a file for searching is
Code
TextFile.Open(filePath, wxConvFile)
The solution was provided by Tiwag in this thread and is, for now, the best I've found.
The problem is that I don't know of any way to detect a file's encoding.

Did you try setting the "default encoding when opening file" to UTF-8 in the editor settings?
With this, I have no problem searching ASCII and UTF-8 files.

Dje
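
On the question of finding a file's encoding: when there is no BOM, one common heuristic is to check whether the raw bytes form well-formed UTF-8 and to fall back to the local (ANSI) encoding if they do not. The sketch below is plain C++ and is not taken from ThreadSearch or Code::Blocks; `looksLikeUtf8` is a hypothetical name used only for illustration.

```cpp
#include <cassert>
#include <string>

// Returns true if every byte sequence in the buffer is well-formed UTF-8.
// Rejects stray continuation bytes, truncated sequences, overlong encodings,
// UTF-16 surrogates and code points above U+10FFFF.
bool looksLikeUtf8(const std::string& bytes)
{
    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n)
    {
        const unsigned char b0 = static_cast<unsigned char>(bytes[i]);
        if (b0 < 0x80) { ++i; continue; }            // plain ASCII byte

        std::size_t len = 0;
        unsigned long cp = 0;
        if ((b0 & 0xE0) == 0xC0)      { len = 2; cp = b0 & 0x1F; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
        else return false;                           // invalid lead byte

        if (i + len > n) return false;               // sequence cut off at end of buffer
        for (std::size_t k = 1; k < len; ++k)
        {
            const unsigned char b = static_cast<unsigned char>(bytes[i + k]);
            if ((b & 0xC0) != 0x80) return false;    // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }

        if ((len == 2 && cp < 0x80) || (len == 3 && cp < 0x800) || (len == 4 && cp < 0x10000))
            return false;                            // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate range
        if (cp > 0x10FFFF) return false;             // beyond Unicode
        i += len;
    }
    return true;
}
```

Pure ASCII content passes this check too, which is harmless because ASCII decodes identically under UTF-8. The heuristic can still misjudge short non-UTF-8 inputs that happen to form valid sequences, so it is a guess rather than a guarantee.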

Offline Biplab

  • Developer
  • Lives here!
  • *****
  • Posts: 1874
    • Biplab's Blog
Re: ThreadSearch can't open the utf8 files!
« Reply #2 on: August 31, 2008, 11:10:16 am »
Quote from: dje
The implementation used to open a file for searching is
Code
TextFile.Open(filePath, wxConvFile)
The solution was provided by Tiwag in this thread and is, for now, the best I've found.
The problem is that I don't know of any way to detect a file's encoding.

You should use C::B's built-in mechanism to read Unicode files. Please follow these steps:

Code
#include "encodingdetector.h"

....


wxString fooFileContent;
EncodingDetector fooFile(_T("foo.cpp"));
if (fooFile.IsOK())
{
    fooFileContent = fooFile.GetWxStr();
}

Encoding detection and conversion will be done by C::B automatically. :)

Offline nanyu

Re: ThreadSearch can't open the utf8 files!
« Reply #3 on: September 08, 2008, 11:41:04 am »
To dje: thank you!
I had set the "default encoding when opening file" to UTF-8 in the editor settings.
I am puzzled now. The files are all encoded as UTF-8, yet ThreadSearch can open some of them but not others.

Offline dje

Re: ThreadSearch can't open the utf8 files!
« Reply #4 on: September 08, 2008, 11:55:50 am »
Hi!

Biplab, I haven't forgotten your advice, but I'm quite busy right now and it requires a little work, as just replacing the file opening with your snippet is not sufficient.

nanyu, do you know when it works and when it doesn't?
Does it depend on the encoding? I have already had problems when files used different end-of-line characters (mixing editors gave such wonderful results).
Could you send one file that works and one that doesn't?

Dje
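
For the mixed end-of-line case mentioned above, a line splitter that accepts "\r\n", "\n" and "\r" in the same buffer avoids depending on one convention. This is a minimal plain C++ sketch, not ThreadSearch's actual code; `splitLines` is a hypothetical helper name.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits text into lines, accepting "\r\n", "\n" and "\r" as line breaks,
// even when they are mixed within the same buffer.
std::vector<std::string> splitLines(const std::string& text)
{
    std::vector<std::string> lines;
    std::string current;
    for (std::size_t i = 0; i < text.size(); ++i)
    {
        const char c = text[i];
        if (c == '\r' || c == '\n')
        {
            // Treat "\r\n" as a single line break, not two.
            if (c == '\r' && i + 1 < text.size() && text[i + 1] == '\n')
                ++i;
            lines.push_back(current);
            current.clear();
        }
        else
        {
            current += c;
        }
    }
    // Keep a trailing line that has no terminating line break.
    if (!current.empty())
        lines.push_back(current);
    return lines;
}
```

With this, `splitLines("a\r\nb\nc\rd")` yields four lines regardless of which editor wrote each line break.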

Offline nanyu

Re: ThreadSearch can't open the utf8 files!
« Reply #5 on: September 22, 2008, 07:36:28 am »
Please unzip the attached file.

ThreadSearch works well with "ok.txt", but fails to open "fail.txt".
(Please open the txt files for more information.)

In my Code::Blocks, I set the "utf-8" encoding.

---
Windows XP (Chinese).



[attachment deleted by admin]

Offline dje

Re: ThreadSearch can't open the utf8 files!
« Reply #6 on: September 22, 2008, 09:10:29 am »
Hi nanyu!

Thanks for the files, I'll check tonight (French time ;))

Dje

Offline dje

Re: ThreadSearch can't open the utf8 files!
« Reply #7 on: September 22, 2008, 11:03:54 pm »
Hi nanyu,

I tried on my French XP Home Edition SP3 and it works as expected.
I'll try to follow Biplab's advice this weekend, but I won't be able to run any tests other than non-regression ones.

I have attached a working snapshot and my default.conf if you want to try it.

Dje

[attachment deleted by admin]