What about using debugging symbols for code completion?
Calmarius:
I don't know whether this has already been discussed here, but I'll give it a try:
Parsing C++ is extremely difficult. Also, the compiler, the makefile, or the build system may define extra preprocessor symbols that the parser doesn't know about, which easily derails it.
But the information we can use for code completion is already there inside or next to every executable: the debug information.
This debug information contains basically everything one may need for code completion: a complete symbol graph, with line and scope information.
So far I only have experience with PDB files and the DIA SDK, but I believe the same graph is available in DWARF files as well.
Knowing the source file and the line number, we can look up the scope that line is in and walk the symbol graph, so we don't need to bother with parsing. And it's always possible to get the right list of symbols.
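A minimal sketch of that lookup, assuming the debug information has already been loaded into a tree of scopes with line ranges. The `Scope` struct and `collectVisible` function are hypothetical names used for illustration only; they are not part of the DIA SDK or of any DWARF library.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical, simplified model of one scope read from debug info.
struct Scope {
    int firstLine;                     // first source line covered by the scope
    int lastLine;                      // last source line covered (inclusive)
    std::vector<std::string> symbols;  // names declared directly in this scope
    std::vector<Scope> children;       // nested scopes
};

// Appends the names visible at `line` to `out`, innermost scope first.
// A name already collected from an inner scope shadows the outer one.
void collectVisible(const Scope& scope, int line, std::vector<std::string>& out) {
    if (line < scope.firstLine || line > scope.lastLine) return;
    // Descend first so that inner declarations come before outer ones.
    for (const Scope& child : scope.children)
        collectVisible(child, line, out);
    for (const std::string& name : scope.symbols) {
        if (std::find(out.begin(), out.end(), name) == out.end())
            out.push_back(name);
    }
}
```

Calling `collectVisible` on the file-level scope with the cursor line would give a candidate list ordered from the innermost scope outwards, which is roughly the order a completion popup would want.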
But it has some drawbacks:
- You need to compile your program to get the information.
- The editor needs to keep track of changed lines to be able to match line numbers correctly to the built binary.
- Macros and documentation comments won't be available.
- Symbols that are optimized out may not be available.
- You will see macro-expanded symbol names like "MessageBoxW" instead of the preferred "MessageBox".
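The line-tracking drawback could be handled along these lines. This is only a sketch under the assumption that the editor reports whole-line insertions and deletions; `LineTracker` and its methods are made-up names, not an existing Code::Blocks API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: maps lines of the edited buffer back to the line
// numbers the file had when the binary was last built.
class LineTracker {
public:
    // Right after a build, every buffer line maps to itself.
    explicit LineTracker(int lineCount) {
        for (int i = 1; i <= lineCount; ++i) builtLine_.push_back(i);
    }

    // A new line was inserted and is now buffer line `line` (1-based).
    // It has no counterpart in the built binary, so it is stored as 0.
    void insertLine(int line) {
        builtLine_.insert(builtLine_.begin() + (line - 1), 0);
    }

    // Buffer line `line` (1-based) was deleted.
    void removeLine(int line) {
        builtLine_.erase(builtLine_.begin() + (line - 1));
    }

    // Built line for a buffer line. For lines added since the build, falls
    // back to the nearest preceding line that existed at build time.
    // Returns 0 if there is none.
    int toBuiltLine(int line) const {
        if (line > static_cast<int>(builtLine_.size()))
            line = static_cast<int>(builtLine_.size());
        for (int i = line - 1; i >= 0; --i) {
            int built = builtLine_[static_cast<std::size_t>(i)];
            if (built != 0) return built;
        }
        return 0;
    }

private:
    std::vector<int> builtLine_;  // one entry per current buffer line
};
```

The fallback to the preceding line is a design choice: a freshly typed line most likely sits in the same scope as the line above it, although that guess is wrong right after a closing brace.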
What's your opinion about this idea?
ollydbg:
I think it is hard to get code completion from analysing the debug info. Did you try it? As far as I can see, even the gdb debugger has very limited code completion support. Have you seen plugins like clangcc, which give really semantic code completion?
Calmarius:
I wrote this program to test with:
--- Code: ---int a;
int b;
typedef struct
{
    struct
    {
        int a;
        union
        {
            struct
            {
                int x, y, z;
            } v;
            float k;
        } u;
    } x;
} NestedStruct;
namespace whatever
{
    template <class T> class Tmpl
    {
        T a;
        T b;
    };
}
using namespace whatever;
int main()
{
    int c = 0;
    int d = c;
    volatile int a;
    volatile int b;
    Tmpl<int> templated;
    if (a < b)
    {
        int itsSmaller = 666;
    }
    else
    {
        int itsLarger = 777;
    }
    return 0;
}
--- End code ---
Then I downloaded dwarfdump, dumped the debug symbols, and got this:
--- Code: ---.debug_info
COMPILE_UNIT<header overall offset = 0x00000000>:
< 0><0x0000000b> DW_TAG_compile_unit
DW_AT_producer "GNU C++ 4.8.1 -mtune=generic -march=x86-64 -g -fstack-protector"
DW_AT_language DW_LANG_C_plus_plus
DW_AT_name "main.cpp"
DW_AT_comp_dir "/home/calmarius/stuff/source/crucible"
DW_AT_low_pc 0x004004d0
DW_AT_high_pc <offset-from-lowpc>55
DW_AT_stmt_list 0x00000000
LOCAL_SYMBOLS:
< 1><0x0000002d> DW_TAG_base_type
DW_AT_byte_size 0x00000004
DW_AT_encoding DW_ATE_signed
DW_AT_name "int"
< 1><0x00000034> DW_TAG_base_type
DW_AT_byte_size 0x00000004
DW_AT_encoding DW_ATE_float
DW_AT_name "float"
< 1><0x0000003b> DW_TAG_namespace
DW_AT_name "whatever"
DW_AT_decl_file 0x00000001 /home/calmarius/stuff/source/crucible/main.cpp
DW_AT_decl_line 0x00000015
DW_AT_sibling <0x0000006b>
< 2><0x00000046> DW_TAG_class_type
DW_AT_name "Tmpl<int>"
DW_AT_byte_size 0x00000008
DW_AT_decl_file 0x00000001 /home/calmarius/stuff/source/crucible/main.cpp
DW_AT_decl_line 0x00000016
< 3><0x0000004e> DW_TAG_member
DW_AT_name "a"
DW_AT_decl_file 0x00000001 /home/calmarius/stuff/source/crucible/main.cpp
DW_AT_decl_line 0x00000018
DW_AT_type <0x0000002d>
DW_AT_data_member_location 0
< 3><0x00000058> DW_TAG_member
DW_AT_name "b"
DW_AT_decl_file 0x00000001 /home/calmarius/stuff/source/crucible/main.cpp
DW_AT_decl_line 0x00000019
DW_AT_type <0x0000002d>
DW_AT_data_member_location 4
< 3><0x00000062> DW_TAG_template_type_parameter
DW_AT_name "T"
DW_AT_type <0x0000002d>
< 1><0x0000006b> DW_TAG_imported_module
DW_AT_decl_file 0x00000001 /home/calmarius/stuff/source/crucible/main.cpp
DW_AT_decl_line 0x0000001d
DW_AT_import <0x0000003b>
< 1><0x00000072> DW_TAG_subprogram
DW_AT_external yes(1)
DW_AT_name "main"
DW_AT_decl_file 0x00000001 /home/calmarius/stuff/source/crucible/main.cpp
DW_AT_decl_line 0x0000001f
DW_AT_type <0x0000002d>
DW_AT_low_pc 0x004004d0
DW_AT_high_pc <offset-from-lowpc>55
DW_AT_frame_base len 0x0001: 9c: DW_OP_call_frame_cfa
DW_AT_GNU_all_call_sites yes(1)
DW_AT_sibling <0x00000128>
< 2><0x00000093> DW_TAG_lexical_block
DW_AT_low_pc 0x004004d4
DW_AT_high_pc <offset-from-lowpc>49
< 3><0x000000a4> DW_TAG_variable
DW_AT_name "c"
DW_AT_decl_file 0x00000001 /home/calmarius/stuff/source/crucible/main.cpp
DW_AT_decl_line 0x00000021
DW_AT_type <0x0000002d>
DW_AT_location len 0x0002: 9150: DW_OP_fbreg -48
< 3><0x000000b0> DW_TAG_variable
DW_AT_name "d"
DW_AT_decl_file 0x00000001 /home/calmarius/stuff/source/crucible/main.cpp
DW_AT_decl_line 0x00000022
DW_AT_type <0x0000002d>
DW_AT_location len 0x0002: 9154: DW_OP_fbreg -44
< 3><0x000000bc> DW_TAG_variable
DW_AT_name "a"
DW_AT_decl_file 0x00000001 /home/calmarius/stuff/source/crucible/main.cpp
DW_AT_decl_line 0x00000023
DW_AT_type <0x00000128>
DW_AT_location len 0x0002: 914c: DW_OP_fbreg -52
< 3><0x000000c8> DW_TAG_variable
DW_AT_name "b"
DW_AT_decl_file 0x00000001 /home/calmarius/stuff/source/crucible/main.cpp
DW_AT_decl_line 0x00000024
DW_AT_type <0x00000128>
DW_AT_location len 0x0002: 9160: DW_OP_fbreg -32
< 3><0x000000d4> DW_TAG_variable
DW_AT_name "templated"
DW_AT_decl_file 0x00000001 /home/calmarius/stuff/source/crucible/main.cpp
DW_AT_decl_line 0x00000025
DW_AT_type <0x00000046>
DW_AT_location len 0x0002: 9160: DW_OP_fbreg -32
< 3><0x000000e2> DW_TAG_lexical_block
DW_AT_low_pc 0x004004f0
DW_AT_high_pc <offset-from-lowpc>7
DW_AT_sibling <0x00000106>
< 4><0x000000f7> DW_TAG_variable
DW_AT_name "itsSmaller"
DW_AT_decl_file 0x00000001 /home/calmarius/stuff/source/crucible/main.cpp
DW_AT_decl_line 0x00000029
DW_AT_type <0x0000002d>
DW_AT_location len 0x0002: 9158: DW_OP_fbreg -40
< 3><0x00000106> DW_TAG_lexical_block
DW_AT_low_pc 0x004004f9
DW_AT_high_pc <offset-from-lowpc>7
< 4><0x00000117> DW_TAG_variable
DW_AT_name "itsLarger"
DW_AT_decl_file 0x00000001 /home/calmarius/stuff/source/crucible/main.cpp
DW_AT_decl_line 0x0000002d
DW_AT_type <0x0000002d>
DW_AT_location len 0x0002: 915c: DW_OP_fbreg -36
< 1><0x00000128> DW_TAG_volatile_type
DW_AT_type <0x0000002d>
< 1><0x0000012d> DW_TAG_variable
DW_AT_name "a"
DW_AT_decl_file 0x00000001 /home/calmarius/stuff/source/crucible/main.cpp
DW_AT_decl_line 0x00000001
DW_AT_type <0x0000002d>
DW_AT_external yes(1)
DW_AT_location len 0x0009: 031c10600000000000: DW_OP_addr 0x0060101c
< 1><0x00000140> DW_TAG_variable
DW_AT_name "b"
DW_AT_decl_file 0x00000001 /home/calmarius/stuff/source/crucible/main.cpp
DW_AT_decl_line 0x00000002
DW_AT_type <0x0000002d>
DW_AT_external yes(1)
DW_AT_location len 0x0009: 032010600000000000: DW_OP_addr 0x00601020
.debug_line: line number info for a single cu
Source lines (from CU-DIE at .debug_info offset 0x0000000b):
<pc> [row,col] NS BB ET PE EB IS= DI= uri: "filepath"
NS new statement, BB new basic block, ET end of text sequence
PE prologue end, EB epilogue begin
IA=val ISA number, DI=val discriminator value
0x004004d0 [ 32, 0] NS uri: "/home/calmarius/stuff/source/crucible/main.cpp"
0x004004d4 [ 33, 0] NS
0x004004db [ 34, 0] NS
0x004004e1 [ 39, 0] NS
0x004004f0 [ 41, 0] NS
0x004004f9 [ 45, 0] NS
0x00400500 [ 48, 0] NS
0x00400505 [ 49, 0] NS
0x00400507 [ 49, 0] NS ET
.debug_pubnames
.debug_macinfo
.debug_string
name at offset 0x00000000, length 37 is '/home/calmarius/stuff/source/crucible'
name at offset 0x00000026, length 10 is 'itsSmaller'
name at offset 0x00000031, length 63 is 'GNU C++ 4.8.1 -mtune=generic -march=x86-64 -g -fstack-protector'
name at offset 0x00000071, length 9 is 'templated'
name at offset 0x0000007b, length 8 is 'main.cpp'
name at offset 0x00000084, length 4 is 'main'
name at offset 0x00000089, length 9 is 'Tmpl<int>'
name at offset 0x00000093, length 9 is 'itsLarger'
name at offset 0x0000009d, length 5 is 'float'
name at offset 0x000000a3, length 8 is 'whatever'
.debug_aranges
COMPILE_UNIT<header overall offset = 0x00000000>:
< 0><0x0000000b> DW_TAG_compile_unit
DW_AT_producer "GNU C++ 4.8.1 -mtune=generic -march=x86-64 -g -fstack-protector"
DW_AT_language DW_LANG_C_plus_plus
DW_AT_name "main.cpp"
DW_AT_comp_dir "/home/calmarius/stuff/source/crucible"
DW_AT_low_pc 0x004004d0
DW_AT_high_pc <offset-from-lowpc>55
DW_AT_stmt_list 0x00000000
arange starts at 0x004004d0, length of 0x00000037, cu_die_offset = 0x0000000b
arange end
.debug_frame
.debug_static_func
.debug_static_vars
.debug_weaknames
--- End code ---
You can see that function and variable names are recorded quite well. External variables too. But unused things are stripped (you won't find NestedStruct).
You can also see the location of each declaration, plus extra info like the position in the struct or the address relative to the stack frame.
At the end you can see an address-to-line map.
Of course, we need some work to build a reverse map to get an address from a line, and a lookup structure that turns addresses into symbols to find the current scope. But that's easier than doing the parsing ourselves.
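A rough sketch of those two lookup structures, assuming the (address, line) rows and the low_pc/size pairs of the blocks have already been extracted from the dump. `Block`, `buildLineToAddress`, `addressForLine` and `innermostBlock` are hypothetical names for illustration, not libdwarf functions.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record for a DW_TAG_subprogram or DW_TAG_lexical_block.
struct Block {
    std::uint64_t lowPc;  // DW_AT_low_pc
    std::uint64_t size;   // DW_AT_high_pc given as an offset from low_pc
    std::string label;    // illustrative label, not taken from the dump
};

// Reverse map: source line -> first address recorded for that line.
std::map<int, std::uint64_t> buildLineToAddress(
        const std::vector<std::pair<std::uint64_t, int>>& rows) {
    std::map<int, std::uint64_t> lineToAddress;
    for (const auto& row : rows)
        lineToAddress.emplace(row.second, row.first);  // keeps the first address per line
    return lineToAddress;
}

// Address of `line`, or of the nearest earlier line that has code.
// Returns false when no such line exists.
bool addressForLine(const std::map<int, std::uint64_t>& lineToAddress,
                    int line, std::uint64_t& address) {
    auto it = lineToAddress.upper_bound(line);
    if (it == lineToAddress.begin()) return false;
    --it;
    address = it->second;
    return true;
}

// Smallest block whose address range contains `address`, or nullptr.
const Block* innermostBlock(const std::vector<Block>& blocks, std::uint64_t address) {
    const Block* best = nullptr;
    for (const Block& block : blocks) {
        bool contains = address >= block.lowPc && address < block.lowPc + block.size;
        if (contains && (best == nullptr || block.size < best->size))
            best = &block;
    }
    return best;
}
```

With the rows from the dump above, line 41 maps to 0x004004f0, which falls inside the 7-byte lexical block that declares itsSmaller. Lines without code (braces, `else`) fall back to the previous line with code, which is only a heuristic and can pick the wrong block.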
So it seems DWARF debug info has the same capabilities as PDB debug info. So far it looks like a perfect candidate to base code completion on.
l_inc:
Calmarius
--- Quote ---So far it looks like a perfect candidate to base code completion on.
--- End quote ---
Considering all the disadvantages you mentioned, it's far from perfect. Because of how laggy it would be, I'd argue it would be even less usable than the standard Code::Blocks CC plugin. This one however is almost perfect. And it already works well.
Calmarius:
Also, given the problem of libraries built without debug symbols...
I'll probably abandon the idea, then...