Developer forums (C::B DEVELOPMENT STRICTLY!) > Development
Wrong spell checker on Russian (and probably other languages)
BlueHazzard:
The problem is that UTF-8 does not work on Windows, and we use UTF-8 everywhere (which is the only right thing to do). wxIspunct uses the system function std::iswpunct, and this function is not specified very precisely by the standard... So the word splitting won't work on Windows unless we convert to UTF-16 whenever we want to use Unicode-aware functionality. And as far as I can tell, the locale is also crucial for wxIspunct (aka std::iswpunct), because, as I noted above, with my locale Russian characters are not detected correctly...
oBFusCATed:
Then why don't you just write a cbIspuncUtf8 and be done with it?
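A minimal sketch of what such a locale-independent helper could look like, assuming the UTF-8 input has already been decoded to a code point. The names IsSpaceCodePoint and IsPunctCodePoint are hypothetical, and the punctuation ranges are a small illustrative subset, not the full Unicode punctuation categories:

```cpp
#include <cstdint>

// Hypothetical helper: true for code points with the Unicode White_Space
// property. No locale is consulted, so the result is the same on every OS.
bool IsSpaceCodePoint(char32_t cp)
{
    return (cp >= 0x0009 && cp <= 0x000D) || cp == 0x0020 || cp == 0x0085 ||
           cp == 0x00A0 || cp == 0x1680 || (cp >= 0x2000 && cp <= 0x200A) ||
           cp == 0x2028 || cp == 0x2029 || cp == 0x202F || cp == 0x205F ||
           cp == 0x3000;
}

// Hypothetical helper: true for ASCII punctuation and for a small,
// illustrative part of the General Punctuation block. A real word splitter
// would need a fuller table.
bool IsPunctCodePoint(char32_t cp)
{
    // ASCII punctuation (the same set ispunct() accepts in the "C" locale).
    if ((cp >= 0x21 && cp <= 0x2F) || (cp >= 0x3A && cp <= 0x40) ||
        (cp >= 0x5B && cp <= 0x60) || (cp >= 0x7B && cp <= 0x7E))
        return true;
    // U+2010..U+2027: dashes, quotation marks, daggers, bullets, ellipsis.
    return cp >= 0x2010 && cp <= 0x2027;
}
```

Because the tables are fixed, Cyrillic letters such as U+0447 are never misclassified, whatever locale the process happens to run in.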
BlueHazzard:
Ok, here is a patch that should work on Windows for all code points that fit in a single UTF-16 code unit.
On Linux it works, but I do not know if there is a better way to make the iswspace() function work. On my test system (English Linux Mint, default locale tmp="C") it does not work without switching the locale. Right now I have to set and reset the locale...
Example to test:
--- Code: ---// Hänsel und Gretel <- German dic
//числовых числовых <- Russian dic
--- End code ---
In both examples the spell checker should not underline the two words separated by the dash. The dash is a Unicode character used to test the isSpace function. (There is a dash; in my Firefox it is barely visible.)
BlueHazzard:
I should probably use
--- Code: ---wxIsspace_l(wxChar, wxXLocale)
--- End code ---
and so on, on Linux...
oBFusCATed:
I don't know what the exact problem is, but patches with calls to setlocale(LC_ALL, "en_US.utf8"); are really unacceptable.
You have no guarantee that the user has the files for this locale installed. It is highly unlikely that they would be missing, but still.
Also, setlocale modifies the locale of the whole thread, and it is slow...
Putting bad words in the comments is also unacceptable.
Why is UTF8toUTF32 returning int32_t and not uint32_t? Why is UTF32toUTF16 using plain types and not sized types?
Have you considered using these things: https://www.scintilla.org/ScintillaDoc.html#SCI_WORDENDPOSITION ?
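To illustrate the point about sized, unsigned types, here is a rough sketch of the two conversions. It is not the code from the patch: DecodeUtf8, EncodeUtf16 and kInvalidCodePoint are hypothetical names, and the decoder is deliberately simple (it does not reject overlong encodings or encoded surrogates).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Sentinel for malformed input; it lies outside the Unicode range, so an
// unsigned return type can still signal errors without going negative.
constexpr std::uint32_t kInvalidCodePoint = 0xFFFFFFFFu;

// Hypothetical decoder: reads one code point from s at pos and advances pos.
std::uint32_t DecodeUtf8(const std::string& s, std::size_t& pos)
{
    if (pos >= s.size())
        return kInvalidCodePoint;
    const std::uint32_t b0 = static_cast<unsigned char>(s[pos]);
    std::size_t extra = 0;  // number of continuation bytes expected
    std::uint32_t cp = 0;
    if (b0 < 0x80)                { extra = 0; cp = b0; }
    else if ((b0 & 0xE0) == 0xC0) { extra = 1; cp = b0 & 0x1F; }
    else if ((b0 & 0xF0) == 0xE0) { extra = 2; cp = b0 & 0x0F; }
    else if ((b0 & 0xF8) == 0xF0) { extra = 3; cp = b0 & 0x07; }
    else { ++pos; return kInvalidCodePoint; }  // stray continuation byte
    if (pos + extra >= s.size()) { pos = s.size(); return kInvalidCodePoint; }
    for (std::size_t i = 1; i <= extra; ++i) {
        const std::uint32_t b = static_cast<unsigned char>(s[pos + i]);
        if ((b & 0xC0) != 0x80) { ++pos; return kInvalidCodePoint; }
        cp = (cp << 6) | (b & 0x3F);
    }
    pos += extra + 1;
    return cp;
}

// Hypothetical encoder: writes 1 or 2 UTF-16 code units to out and returns
// how many were written (0 if cp is not a valid Unicode scalar value).
std::size_t EncodeUtf16(std::uint32_t cp, std::uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x10000) {
        out[0] = static_cast<std::uint16_t>(cp);
        return 1;
    }
    cp -= 0x10000;
    out[0] = static_cast<std::uint16_t>(0xD800 | (cp >> 10));    // high surrogate
    out[1] = static_cast<std::uint16_t>(0xDC00 | (cp & 0x3FF));  // low surrogate
    return 2;
}
```

With fixed-width unsigned types the bit operations behave the same on every platform, and code points above U+FFFF can be handled through surrogate pairs instead of being limited to a single UTF-16 code unit.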