Author Topic: Code completion using LSP and clangd  (Read 163517 times)

Offline ollydbg

  • Developer
  • Lives here!
  • *****
  • Posts: 5910
  • OpenCV and Robotics
    • Chinese OpenCV forum moderator
Re: Code completion using LSP and clangd
« Reply #195 on: September 30, 2022, 08:25:07 am »
I don't quite understand this code. While reading the clangd_client source, I see this:

Code
        size_t resultCount = pJson->at("result").size();
        if (not resultCount) return;

        // Nothing for ShowCalltip is ever in the signature array //(ph 2021/11/1)
        // Show Tootip vs ShowCalltip is so damn confusing !!!
        // **debugging**std::string dumpit = pJson->dump();

        size_t signatureCount = pJson->at("result").at("signatures").size();
        if (not signatureCount) return;

        json signatures = pJson->at("result").at("signatures");
        for (size_t labelndx=0; labelndx<signatureCount && labelndx<10; ++labelndx)
        {
                wxString labelValue = signatures[labelndx].at("label").get<std::string>();
                v_SignatureTokens.push_back(cbCodeCompletionPlugin::CCCallTip(labelValue));
        }

I'm not sure, but it looks like:

Code
wxString labelValue = signatures[labelndx].at("label").get<std::string>();

get<std::string>() should return a std::string.

Do we need to convert it to wxString?
If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

Offline ollydbg

Re: Code completion using LSP and clangd
« Reply #196 on: September 30, 2022, 09:14:17 am »
OK, I think I have fixed this issue with the following patch:

Code
From 8523dd2bd9a58d1780c3d2efe9459f7e5fccfb41 Mon Sep 17 00:00:00 2001
From: hide<hide@hide.hide>
Date: Fri, 30 Sep 2022 15:11:28 +0800
Subject: fix the wrong tip code when Chinese comment is used


diff --git a/clangd_client/src/codecompletion/parser/parser.cpp b/clangd_client/src/codecompletion/parser/parser.cpp
index 2b9c5ea..3ddde25 100644
--- a/clangd_client/src/codecompletion/parser/parser.cpp
+++ b/clangd_client/src/codecompletion/parser/parser.cpp
@@ -2546,7 +2546,8 @@ void Parser::OnLSP_HoverResponse(wxCommandEvent& event, std::vector<ClgdCCToken>
         if (not valueItemsCount) return;
 
         json contents = pJson->at("result").at("contents");
-        wxString contentsValue = contents.at("value").get<std::string>();
+        std::string contentsValueStdString = contents.at("value").get<std::string>();
+        wxString contentsValue(contentsValueStdString.c_str(), wxConvUTF8);
 
         // Example Hover contents: L"instance-method HelloWxWorldFrame::OnAbout\n\nType: void\nParameters:\n- wxCommandEvent & event\n\n// In HelloWxWorldFrame\nprivate: void HelloWxWorldFrame::OnAbout(wxCommandEvent &event)"
         // get string array of hover info separated at /n chars.
@@ -2670,7 +2671,8 @@ void Parser::OnLSP_SignatureHelpResponse(wxCommandEvent& event, std::vector<cbCo
         json signatures = pJson->at("result").at("signatures");
         for (size_t labelndx=0; labelndx<signatureCount && labelndx<10; ++labelndx)
         {
-                wxString labelValue = signatures[labelndx].at("label").get<std::string>();
+                std::string labelValueStdString = signatures[labelndx].at("label").get<std::string>();
+                wxString labelValue(labelValueStdString.c_str(), wxConvUTF8);
                 v_SignatureTokens.push_back(cbCodeCompletionPlugin::CCCallTip(labelValue));
         }
 


I'm not sure the second hunk is needed, but the first hunk is the true fix.

Offline sodev

  • Regular
  • ***
  • Posts: 497
Re: Code completion using LSP and clangd
« Reply #197 on: September 30, 2022, 08:22:14 pm »
I have seen quite a few string-encoding issues in this thread, and many trial-and-error attempts to solve them, so I want to add my two cents.

Never do this:
Code
wxString contentsValue = contents.at("value").get<std::string>();

This converts the std::string into a wxString using the currently set C++ locale, that is, a locale set in CodeBlocks. But this std::string does not come from CodeBlocks. Also, std::string has, at least on Windows, no built-in support for UTF-8. That does not stop anyone from putting UTF-8 into such a string, and as long as you don't use methods that depend on the locale, this is fine. The code snippet above does depend on the locale.

Now these two lines:
Code
std::string contentsValueStdString = contents.at("value").get<std::string>();
wxString contentsValue(contentsValueStdString.c_str(), wxConvUTF8);

These lines convert the std::string to a wxString explicitly, telling the wxString object that the std::string contains UTF-8. Since the post above says this fixes the issue, the std::string apparently does contain UTF-8.

I suggest you find out which encoding Clang uses and then check whether your code relies on such automatic conversions anywhere else. wxWidgets also offers the build option wxNO_UNSAFE_WXSTRING_CONV to disable such implicit conversions, but I am not sure whether it also covers std::string; the documentation mentions only C strings.
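
To illustrate the point about locale-independent handling, here is a minimal sketch in plain standard C++ (no wxWidgets). It decodes the bytes of a std::string as UTF-8 by hand, so the result never depends on the process locale. The function name DecodeUtf8 is made up for this example and is not part of clangd_client or wxWidgets; the decoder is simplified and does not reject overlong encodings or surrogate values.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not from clangd_client): decode UTF-8 bytes into
// Unicode code points without consulting any locale.
// Malformed bytes are replaced by U+FFFD. Simplified: overlong forms and
// surrogate values are not rejected.
std::u32string DecodeUtf8(const std::string& bytes)
{
    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size())
    {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        char32_t cp = 0;
        std::size_t extra = 0;   // number of continuation bytes expected
        if      (lead < 0x80)           { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { out.push_back(U'\uFFFD'); ++i; continue; }  // invalid lead byte

        if (i + extra >= bytes.size())   // sequence is cut off at the end
        {
            out.push_back(U'\uFFFD');
            break;
        }

        bool ok = true;
        for (std::size_t k = 1; k <= extra; ++k)
        {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok) { out.push_back(U'\uFFFD'); ++i; continue; }

        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

In wxWidgets code the same idea is expressed by naming the encoding explicitly, as the two-line version above does with wxConvUTF8; wxString::FromUTF8() is another way to do it.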

Offline ollydbg

Re: Code completion using LSP and clangd
« Reply #198 on: October 01, 2022, 01:47:51 am »
Hi, sodev, thanks for the advice.

If I remember correctly, clangd_client uses UTF-8 for its input source. I normally use UTF-8 for my source code, but my system (Windows) locale is not UTF-8.

I looked at clangd's documentation: Protocol extensions, UTF-8 offsets.

It says clangd can support UTF-8.

Offline Pecan

  • Plugin developer
  • Lives here!
  • ****
  • Posts: 2750
Re: Code completion using LSP and clangd
« Reply #199 on: October 01, 2022, 09:35:41 pm »
@sodev
@ollydbg

Thanks for this. I'll get to work checking every std::string to wxString conversion in the source.

Clangd uses UTF-16 by default, but it allows UTF-8 as an option.
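
A small sketch of what the UTF-16 versus UTF-8 distinction means in practice, assuming the remark refers to how character offsets in LSP positions are counted (UTF-16 code units unless another encoding is negotiated). The function names are made up for this example, and the code assumes well-formed UTF-8 input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: count the UTF-16 code units needed for a UTF-8 string.
// Assumes the input is well-formed UTF-8.
std::size_t Utf16UnitsOf(const std::string& utf8)
{
    std::size_t units = 0;
    for (unsigned char c : utf8)
    {
        if ((c & 0xC0) == 0x80) continue;   // continuation byte, already counted
        units += (c >= 0xF0) ? 2 : 1;       // 4-byte sequences need a surrogate pair
    }
    return units;
}

// Hypothetical helper: convert a byte offset inside a UTF-8 line into a
// column counted in UTF-16 code units. byteOffset is assumed to fall on a
// character boundary.
std::size_t Utf16ColumnFromByteOffset(const std::string& line, std::size_t byteOffset)
{
    return Utf16UnitsOf(line.substr(0, byteOffset));
}
```

For plain ASCII lines both counts are identical, which is why a mismatch only shows up once a line contains characters such as the Chinese comment in the test case.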

Offline Pecan

Re: Code completion using LSP and clangd
« Reply #200 on: October 03, 2022, 07:43:47 pm »
@ollydbg

Would you test Clangd_Client rev 82 or the current nightly clangd_client to see whether it solves the tooltip problem?

I've applied your fix to every std::string taken from the clangd JSON.
Example:
Code
 
idValue = GetwxUTF8Str(pJson->at("id").get<std::string>());

GetwxUTF8Str is defined as:
Code
        wxString GetwxUTF8Str(const std::string stdString)
        {
            return wxString(stdString.c_str(), wxConvUTF8);
        }


Let me know if it works and thanks for testing.

Offline ollydbg

Re: Code completion using LSP and clangd
« Reply #201 on: October 04, 2022, 02:43:39 am »
Hi, Pecan, thanks for the fix.

rev 82 works OK.

BTW, is it possible to show the Doxygen documentation in the tip window? Maybe clangd already sends it to us? Thanks.

EDIT:

This clangd issue on GitHub looks related: Doxygen parsing missing, Issue #529, clangd/clangd
« Last Edit: October 04, 2022, 08:32:48 am by ollydbg »

Offline ollydbg

Re: Code completion using LSP and clangd
« Reply #202 on: October 04, 2022, 09:11:14 am »
I did some extra tests on how to show the comments.

Here is the log from CBclangd_client-xxxxx.log:

Code
{"id":"textDocument/hover","jsonrpc":"2.0","method":"textDocument/hover","params":{"position":{"character":4,"line":3},"textDocument":{"uri":"file:///F:/code/test_clangd_client_tipwin/main.cpp"}}}

15:06:48.772 >>> readJson() len:220:
{"id":"textDocument/hover","jsonrpc":"2.0","result":{"contents":{"kind":"plaintext","value":"variable m_AAA\n\nType: int\nABCDEFG\n\nint m_AAA"},"range":{"end":{"character":9,"line":3},"start":{"character":4,"line":3}}}}

15:07:41.294 <<< Hover:
file:///F:/code/test_clangd_client_tipwin/main.cpp,line[1], char[4]

15:07:41.294 <<< Content-Length: 196



{"id":"textDocument/hover","jsonrpc":"2.0","method":"textDocument/hover","params":{"position":{"character":4,"line":1},"textDocument":{"uri":"file:///F:/code/test_clangd_client_tipwin/main.cpp"}}}

15:07:41.524 >>> readJson() len:240:
{"id":"textDocument/hover","jsonrpc":"2.0","result":{"contents":{"kind":"plaintext","value":"variable m_TcpFile\n\nType: int\nTCP鎺ユ敹鐨勬暟鎹甛n\nint m_TcpFile"},"range":{"end":{"character":13,"line":1},"start":{"character":4,"line":1}}}}

and here is the test code:
Code

int m_TcpFile;  ///< TCP接收的数据

int m_AAA; ///< ABCDEFG

int main()
{
    return 0;
}




As you can see, when I hover over the variable "m_TcpFile", the received JSON contains garbled content; I'm not sure why it is not shown in Chinese.
For the variable "m_AAA", however, it shows "ABCDEFG" correctly.


EDIT

It looks like we simply drop the text returned by clangd after the second \n.

Here is a screenshot (in the attachment).

The relevant source code is around line 2553 of parser.cpp:

Code
        // Example Hover contents: L"instance-method HelloWxWorldFrame::OnAbout\n\nType: void\nParameters:\n- wxCommandEvent & event\n\n// In HelloWxWorldFrame\nprivate: void HelloWxWorldFrame::OnAbout(wxCommandEvent &event)"
        // get string array of hover info separated at /n chars.
        wxString hoverString = contentsValue;
        hoverString.Replace("\n\n", "\n"); //remove double newline chars
        wxArrayString vHoverInfo = GetArrayFromString(hoverString, "\n");

        // **Debugging**
        // LogManager* pLogMgr = Manager::Get()->GetLogManager();
        //    for (size_t ii=0; ii<vHoverInfo.size(); ++ii)
        //        pLogMgr->DebugLog(wxString::Format("vHoverInfo[%d]:%s", int(ii), vHoverInfo[ii]));

        // Find items from hover data and cut the chaff
        wxString hoverText;
        for (size_t ii=0, foundIn=false; ii<vHoverInfo.size(); ++ii)
        {
            if (ii < 2) hoverText += vHoverInfo[ii] + "\n"; //type and return value
            if (vHoverInfo[ii].StartsWith("// In "))    //parent
            {
                hoverText += vHoverInfo[ii] += "\n";
                foundIn = true; continue;
            }
            if (foundIn) hoverText += vHoverInfo[ii] + "\n";;
        }//endfor vHoverInfo

        v_HoverTokens.push_back(ClgdCCToken(0, hoverText, hoverText));
« Last Edit: October 04, 2022, 09:31:47 am by ollydbg »

Offline MaxGaspa

  • Multiple posting newcomer
  • *
  • Posts: 29
Re: Code completion using LSP and clangd
« Reply #203 on: October 04, 2022, 11:27:44 am »
@pecan,

I observed unexpected behavior (I would say a bug) using the latest nightly and the latest plugin. I see the issue on both Win7 SP1 and Win10 Enterprise.

After I opened a project, the plugin started to parse the files, starting from "62 more". In the meantime I launched the project's executable (compiled earlier), and the parser stopped. While stopping, the parser still parsed some files without updating the remaining-file counter (repeating "59 more").

After I closed the executable, the parser did not restart; it stayed asleep. I used the option "Reparse the project", but no new parse started. I opened one of the project's files (Main.cpp) and the parser parsed that file (restarting the remaining-file counter from 66, not from 62, the previous starting number), but then it stopped.

I closed Main.cpp and used "Reparse the project" several times without success. I closed the project and reopened it, and the parser still did not start.

If I then close CB and run a new CB instance loading the same projects as before, the parser correctly parses all files from 66 down to 0.

I'm attaching the log from CB.

I hope this helps to reproduce the issue (it seems reproducible). I'm using 0 threads while compiling (and running?) and 6 threads for concurrent parsing.

It seems there is some "memory" of the past parsing even if I delete the parser.

Max
« Last Edit: October 04, 2022, 03:34:33 pm by MaxGaspa »

Offline ollydbg

Re: Code completion using LSP and clangd
« Reply #204 on: October 04, 2022, 01:21:01 pm »
I created a patch that shows the Doxygen comments.

Code
 clangd_client/src/codecompletion/parser/parser.cpp | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/clangd_client/src/codecompletion/parser/parser.cpp b/clangd_client/src/codecompletion/parser/parser.cpp
index 6852ee2..4706432 100644
--- a/clangd_client/src/codecompletion/parser/parser.cpp
+++ b/clangd_client/src/codecompletion/parser/parser.cpp
@@ -2563,13 +2563,14 @@ void Parser::OnLSP_HoverResponse(wxCommandEvent& event, std::vector<ClgdCCToken>
         wxString hoverText;
         for (size_t ii=0, foundIn=false; ii<vHoverInfo.size(); ++ii)
         {
-            if (ii < 2) hoverText += vHoverInfo[ii] + "\n"; //type and return value
             if (vHoverInfo[ii].StartsWith("// In "))    //parent
             {
                 hoverText += vHoverInfo[ii] += "\n";
                 foundIn = true; continue;
             }
-            if (foundIn) hoverText += vHoverInfo[ii] + "\n";;
+            if (foundIn) hoverText += vHoverInfo[ii] + "\n";
+
+            if (ii < 3) hoverText += vHoverInfo[ii] + "\n"; //type and return value  [0]: kind, [1]: type, [2]: comments
         }//endfor vHoverInfo
 
         v_HoverTokens.push_back(ClgdCCToken(0, hoverText, hoverText));


Normally, the hover text I see looks like this:

Code
variable m_AAA\n\nType: int\nABCDEFG\n\nint m_AAA

There are 3 sections separated by "\n\n": first "variable m_AAA", second "Type: int\nABCDEFG", and third "int m_AAA". Basically, only the first two sections are needed.

There are two ";" in the original statement; this should be fixed.
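
The section structure described above can be sketched in plain standard C++ as follows. This is only an illustration of splitting at "\n\n" and keeping the first two sections; it is not the code in parser.cpp, and the names SplitBy and HoverSummary are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split text at every occurrence of sep.
// An empty separator yields the whole text as a single part.
std::vector<std::string> SplitBy(const std::string& text, const std::string& sep)
{
    std::vector<std::string> parts;
    if (sep.empty()) { parts.push_back(text); return parts; }

    std::size_t start = 0;
    for (;;)
    {
        const std::size_t pos = text.find(sep, start);
        if (pos == std::string::npos)
        {
            parts.push_back(text.substr(start));   // last (or only) part
            break;
        }
        parts.push_back(text.substr(start, pos - start));
        start = pos + sep.size();
    }
    return parts;
}

// Hypothetical helper: keep the first two sections of a hover text
// (kind, then type plus comment) and drop the trailing declaration.
std::string HoverSummary(const std::string& hover)
{
    const std::vector<std::string> sections = SplitBy(hover, "\n\n");
    std::string summary;
    for (std::size_t i = 0; i < sections.size() && i < 2; ++i)
    {
        if (!summary.empty()) summary += '\n';
        summary += sections[i];
    }
    return summary;
}
```

Splitting at the blank line rather than at every "\n" keeps a multi-line comment together with the "Type:" line, so the result does not depend on how many lines the comment has.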

Offline MaxGaspa

Re: Code completion using LSP and clangd
« Reply #205 on: October 05, 2022, 02:46:09 pm »
@pecan

I'm observing another issue using the clangd plugin.

For a std::vector, the member functions are listed in alphabetical order, but there seems to be a maximum number of list entries. Look at the attached images: none of the functions after max_size() are listed. For example, push_back() is not listed, but if I type "push" after the dot, the plugin correctly suggests push_back.

So it seems that the plugin knows that push_back() exists but does not show it in the full list.

Is there a way to increase the number of items in the list? Is that limitation intentional?

« Last Edit: October 05, 2022, 04:46:21 pm by MaxGaspa »

Offline ollydbg

Re: Code completion using LSP and clangd
« Reply #206 on: October 06, 2022, 03:36:18 am »
I did some extra tests on how to show the comments.

Here is the log from CBclangd_client-xxxxx.log:

Code
...

15:07:41.524 >>> readJson() len:240:
{"id":"textDocument/hover","jsonrpc":"2.0","result":{"contents":{"kind":"plaintext","value":"variable m_TcpFile\n\nType: int\nTCP鎺ユ敹鐨勬暟鎹甛n\nint m_TcpFile"},"range":{"end":{"character":13,"line":1},"start":{"character":4,"line":1}}}}


The log file shows the Chinese text garbled.

Code
 int\nTCP鎺ユ敹鐨勬暟鎹甛n\n

The following patch solves this issue:

Code
 clangd_client/src/LSPclient/client.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clangd_client/src/LSPclient/client.cpp b/clangd_client/src/LSPclient/client.cpp
index c4a7729..76d19ca 100644
--- a/clangd_client/src/LSPclient/client.cpp
+++ b/clangd_client/src/LSPclient/client.cpp
@@ -1020,7 +1020,7 @@ bool ProcessLanguageClient::readJson(json &json)
     m_MutexInputBufGuard.Unlock();
 
     if (stdStrInputbuf.size())
-        writeClientLog(wxString::Format(">>> readJson() len:%d:\n%s", length, stdStrInputbuf.c_str()) );
+        writeClientLog(wxString::Format(">>> readJson() len:%d:\n%s", length, GetwxUTF8Str(stdStrInputbuf.c_str()).wx_str()) );
 
     // remove any invalid utf8 chars
     bool validData = DoValidateUTF8data(stdStrInputbuf);

Offline ollydbg

Re: Code completion using LSP and clangd
« Reply #207 on: October 06, 2022, 05:48:35 am »
I created a patch which can show the "doxygen comments".
...

I thought about it again, and I think using the original text from the hover message is good enough. I don't think we need to "cut the chaff".

wxString hoverString = contentsValue;

Just show this. I tested this approach, and it works fine.

Offline MaxGaspa

Re: Code completion using LSP and clangd
« Reply #208 on: October 09, 2022, 11:40:42 pm »
@pecan

About my reply #203:

I read the message in which you said you were unable to replicate the issue. I was about to create a test project, but it seems that your message was deleted. Did you manage to replicate the issue?
« Last Edit: October 09, 2022, 11:45:41 pm by MaxGaspa »

Offline Pecan

Re: Code completion using LSP and clangd
« Reply #209 on: October 10, 2022, 06:41:29 am »
Yes, I was finally able to replicate the issue and fix it in the new Nightly 221008.
https://forums.codeblocks.org/index.php/topic,25130.msg171351/topicseen.html#msg171351

To change the number of matched completions displayed, use Settings/Editor/clangd_client/Maximum allowed code-completion matches.
I suggest you be conservative, since the matches are cached, i.e., they take up memory until new completions are requested.

Thanks for catching this and for testing.
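
For readers wondering why push_back() was missing from the full list but appeared after typing "push", here is a toy model in plain standard C++. It assumes the cap is applied to the alphabetically sorted matches for the current prefix; that is an assumption for illustration, not a description of how clangd or the plugin actually applies the limit, and VisibleCompletions is a made-up name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model: return at most maxMatches candidates that start with
// prefix, in alphabetical order.
std::vector<std::string> VisibleCompletions(std::vector<std::string> candidates,
                                            const std::string& prefix,
                                            std::size_t maxMatches)
{
    std::sort(candidates.begin(), candidates.end());

    std::vector<std::string> shown;
    for (const std::string& name : candidates)
    {
        if (shown.size() == maxMatches) break;              // cap reached
        if (name.compare(0, prefix.size(), prefix) == 0)    // name starts with prefix
            shown.push_back(name);
    }
    return shown;
}
```

With an empty prefix and a cap of 10, a list of 13 members sorted alphabetically ends at max_size, while the prefix "push" leaves a single match that fits easily under the cap.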

« Last Edit: October 10, 2022, 06:44:24 am by Pecan »