On 1/14/19 6:27 AM, Arnaldo Carvalho de Melo wrote: > Em Sat, Jan 12, 2019 at 12:00:42AM +0000, Yonghong Song escreveu: >> On 1/10/19 3:03 PM, Andrii Nakryiko wrote: >>> There appears to be problems with bitfields in both DWARF data and >>> pahole's conversion from DWARF to BTF (see test file source code and >>> pahole output at the bottom). > >>> 1. pahole believes unpacked.y1 has offset 7, which is incorrect >>> (should be 32?). >>> unpacked.y2 has offset 32, which also doesn't make sense. >>> DWARF info for unpacked.y1 and unpacked.y2 also seem wrong (how >>> do we get offset 18 for y1?). > >>> <2><63>: Abbrev Number: 3 (DW_TAG_member) >>> <64> DW_AT_name : y1 >>> <67> DW_AT_decl_file : 1 >>> <68> DW_AT_decl_line : 5 >>> <69> DW_AT_type : <0x87> >>> <6d> DW_AT_byte_size : 4 >>> <6e> DW_AT_bit_size : 7 >>> <6f> DW_AT_bit_offset : 18 >>> <70> DW_AT_data_member_location: 0 >>> <2><71>: Abbrev Number: 3 (DW_TAG_member) >>> <72> DW_AT_name : y2 >>> <75> DW_AT_decl_file : 1 >>> <76> DW_AT_decl_line : 6 >>> <77> DW_AT_type : <0x87> >>> <7b> DW_AT_byte_size : 4 >>> <7c> DW_AT_bit_size : 20 >>> <7d> DW_AT_bit_offset : 12 >>> <7e> DW_AT_data_member_location: 4 >>> >>> >>> 2. Similar problems with packed.y1 and packed.y2. packed.y2 offset >>> is negative value even in DWARF: >>> >>> <2><d2>: Abbrev Number: 6 (DW_TAG_member) >>> <d3> DW_AT_name : y2 >>> <d6> DW_AT_decl_file : 1 >>> <d7> DW_AT_decl_line : 14 >>> <d8> DW_AT_type : <0x87> >>> <dc> DW_AT_byte_size : 4 >>> <dd> DW_AT_bit_size : 20 >>> <de> DW_AT_bit_offset : -2 >>> <df> DW_AT_data_member_location: 0 >> >> Did some research on why we got -2 as bit_offset with the above packed >> data structure. >> >> First, the compiler generates correct thing. the negative >> DW_AT_bit_offset is allowed for dwarf2, but not dwarf4. > >> The "-2" here means (1). the member is crossing two >> adjacent "4" bytes boundary (DW_AT_byte_size = 4), and > > Ok, but I would expect that the DW_AT_data_member_location would be 4, > meaning, it is mostly on this one but starts two bits before. So when Unfortunately, this is not dwarf interpretation for little endian. (I wished it be as easy as big endian...). -bash-4.4$ cat t.c struct t { int a:3; int b:29; } g; -bash-4.4$ gcc -O2 -g -c t.c -bash-4.4$ llvm-dwarfdump t.o ... 0x00000027: DW_TAG_member DW_AT_name ("a") DW_AT_decl_file ("/home/yhs/tmp2/t.c") DW_AT_decl_line (2) DW_AT_type (0x00000042 "int") DW_AT_byte_size (0x04) DW_AT_bit_size (0x03) DW_AT_bit_offset (0x1d) DW_AT_data_member_location (0x00) 0x00000034: DW_TAG_member DW_AT_name ("b") DW_AT_decl_file ("/home/yhs/tmp2/t.c") DW_AT_decl_line (3) DW_AT_type (0x00000042 "int") DW_AT_byte_size (0x04) DW_AT_bit_size (0x1d) DW_AT_bit_offset (0x00) DW_AT_data_member_location (0x00) The DW_AT_data_member_location is 0 for both members. The first member offset 0x1d is from the byte_size boundary (4 bytes). The second member offset 0x0 is also from the byte_size boundary (4 bytes). For the above packed case, it merely mean that the offset -2 from the byte_size_boundary (4 bytes), but instead go into the byte, rather going opposite directions with 2 bits. An alternative way to represent with data_member_location 4, but bit_offset will have to change. old way: >>> <dc> DW_AT_byte_size : 4 >>> <dd> DW_AT_bit_size : 20 >>> <de> DW_AT_bit_offset : -2 >>> <df> DW_AT_data_member_location: 0 new way: <dc> DW_AT_byte_size : 4 <dd> DW_AT_bit_size : 20 <de> DW_AT_bit_offset : 30 <df> DW_AT_data_member_location: 4 The compiler did not generate the new way. > seeing negative DW_AT_bit_offsets we should think it starts in this > location (0 in this case), but straddles to the next, so I should say > that it is at bit offset (DW_AT_byte_size * 8 - DW_AT_bit_offset), the > user, looking at this new pahole output will realize that it has 2 bits > at the end of the current DW_AT_data_member_location (0) but has 18 bits > in the next. > > >> (2). the -2 means the the starting bit is at the "4" byte boundary - 2 >> bits. So basically it covers: >> >> | | >> V V >> [0 1 .....30 31][0 .. 17 18 ... 31] >> >> totally 20 bits. >> >> The following patch fixed the issue: >> -bash-4.4$ git diff >> diff --git a/dwarves.h b/dwarves.h >> index e6bffe8..e5f8347 100644 >> --- a/dwarves.h >> +++ b/dwarves.h >> @@ -831,7 +831,7 @@ struct class_member { >> uint32_t bit_size; >> uint32_t byte_offset; >> size_t byte_size; >> - uint8_t bitfield_offset; >> + int8_t bitfield_offset; >> uint8_t bitfield_size; >> uint8_t bit_hole; >> uint8_t bitfield_end:1; >> -bash-4.4$ >> >> With the above change: >> -bash-4.4$ ~/work/github/pahole/build/pahole -JV bitfields.o >> File bitfields.o: >> [1] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED >> [2] STRUCT packed kind_flag=1 size=5 vlen=5 >> x1 type_id=3 bitfield_size=1 bits_offset=0 >> x2 type_id=3 bitfield_size=3 bits_offset=1 >> x3 type_id=3 bitfield_size=3 bits_offset=4 >> y1 type_id=1 bitfield_size=7 bits_offset=7 >> y2 type_id=1 bitfield_size=20 bits_offset=14 >> [3] INT char size=1 bit_offset=0 nr_bits=8 encoding=(none) >> [4] STRUCT unpacked kind_flag=1 size=8 vlen=5 >> x1 type_id=3 bitfield_size=1 bits_offset=0 >> x2 type_id=3 bitfield_size=3 bits_offset=1 >> x3 type_id=3 bitfield_size=3 bits_offset=4 >> y1 type_id=1 bitfield_size=7 bits_offset=7 >> y2 type_id=1 bitfield_size=20 bits_offset=32 >> [5] UNION packed_union kind_flag=0 size=8 vlen=4 >> x1 type_id=3 bits_offset=0 >> y2 type_id=1 bits_offset=0 >> s1 type_id=2 bits_offset=0 >> s2 type_id=4 bits_offset=0 >> [6] UNION unpacked_union kind_flag=0 size=8 vlen=3 >> x1 type_id=3 bits_offset=0 >> y2 type_id=1 bits_offset=0 >> s type_id=4 bits_offset=0 >> -bash-4.4$ >> >> Will send out a patch shortly. >> >>> >>> andriin@devvm$ cat bitfields.c >>> struct unpacked{ >>> char x1: 1; >>> char x2: 3; >>> char x3: 3; >>> int y1: 7; >>> int y2: 20; >>> }; >>> >>> struct packed{ >>> char x1: 1; >>> char x2: 3; >>> char x3: 3; >>> int y1: 7; >>> int y2: 20; >>> } __attribute__((packed)); >>> >>> union packed_union { >>> char x1; >>> int y2; >>> struct packed s1; >>> struct unpacked s2; >>> }; >>> >>> union unpacked_union { >>> char x1; >>> int y2; >>> struct unpacked s; >>> } __attribute__((packed)); >>> >>> int main() { >>> struct packed s1; >>> struct unpacked s2; >>> union packed_union u1; >>> union unpacked_union u2; >>> return 0; >>> } >>> andriin@devvm$ cc -g bitfields.c -o bitfields >>> andriin@devvm$ LLVM_OBJCOPY=objcopy ~/local/pahole/build/pahole -JV bitfields >>> File bitfields: >>> [1] STRUCT unpacked kind_flag=1 size=8 vlen=5 >>> x1 type_id=2 bitfield_size=1 bits_offset=0 >>> x2 type_id=2 bitfield_size=3 bits_offset=1 >>> x3 type_id=2 bitfield_size=3 bits_offset=4 >>> y1 type_id=3 bitfield_size=7 bits_offset=7 >>> y2 type_id=3 bitfield_size=20 bits_offset=32 >>> [2] INT char size=1 bit_offset=0 nr_bits=8 encoding=(none) >>> [3] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED >>> [4] STRUCT packed kind_flag=1 size=5 vlen=5 >>> x1 type_id=2 bitfield_size=1 bits_offset=0 >>> x2 type_id=2 bitfield_size=3 bits_offset=1 >>> x3 type_id=2 bitfield_size=3 bits_offset=4 >>> y1 type_id=3 bitfield_size=7 bits_offset=7 >>> y2 type_id=3 bitfield_size=255 bits_offset=16776974 >>> [5] UNION packed_union kind_flag=0 size=8 vlen=4 >>> x1 type_id=2 bits_offset=0 >>> y2 type_id=3 bits_offset=0 >>> s1 type_id=4 bits_offset=0 >>> s2 type_id=1 bits_offset=0 >>> [6] UNION unpacked_union kind_flag=0 size=8 vlen=3 >>> x1 type_id=2 bits_offset=0 >>> y2 type_id=3 bits_offset=0 >>> s type_id=1 bits_offset=0 >>> andriin@devvm$ readelf -wi bitfields >>> Contents of the .debug_info section: >>> >>> ... >>> >>> <1><2d>: Abbrev Number: 2 (DW_TAG_structure_type) >>> <2e> DW_AT_name : (indirect string, offset: 0x4e): unpacked >>> <32> DW_AT_byte_size : 8 >>> <33> DW_AT_decl_file : 1 >>> <34> DW_AT_decl_line : 1 >>> <35> DW_AT_sibling : <0x80> >>> <2><39>: Abbrev Number: 3 (DW_TAG_member) >>> <3a> DW_AT_name : x1 >>> <3d> DW_AT_decl_file : 1 >>> <3e> DW_AT_decl_line : 2 >>> <3f> DW_AT_type : <0x80> >>> <43> DW_AT_byte_size : 1 >>> <44> DW_AT_bit_size : 1 >>> <45> DW_AT_bit_offset : 7 >>> <46> DW_AT_data_member_location: 0 >>> <2><47>: Abbrev Number: 3 (DW_TAG_member) >>> <48> DW_AT_name : x2 >>> <4b> DW_AT_decl_file : 1 >>> <4c> DW_AT_decl_line : 3 >>> <4d> DW_AT_type : <0x80> >>> <51> DW_AT_byte_size : 1 >>> <52> DW_AT_bit_size : 3 >>> <53> DW_AT_bit_offset : 4 >>> <54> DW_AT_data_member_location: 0 >>> <2><55>: Abbrev Number: 3 (DW_TAG_member) >>> <56> DW_AT_name : x3 >>> <59> DW_AT_decl_file : 1 >>> <5a> DW_AT_decl_line : 4 >>> <5b> DW_AT_type : <0x80> >>> <5f> DW_AT_byte_size : 1 >>> <60> DW_AT_bit_size : 3 >>> <61> DW_AT_bit_offset : 1 >>> <62> DW_AT_data_member_location: 0 >>> <2><63>: Abbrev Number: 3 (DW_TAG_member) >>> <64> DW_AT_name : y1 >>> <67> DW_AT_decl_file : 1 >>> <68> DW_AT_decl_line : 5 >>> <69> DW_AT_type : <0x87> >>> <6d> DW_AT_byte_size : 4 >>> <6e> DW_AT_bit_size : 7 >>> <6f> DW_AT_bit_offset : 18 >>> <70> DW_AT_data_member_location: 0 >>> <2><71>: Abbrev Number: 3 (DW_TAG_member) >>> <72> DW_AT_name : y2 >>> <75> DW_AT_decl_file : 1 >>> <76> DW_AT_decl_line : 6 >>> <77> DW_AT_type : <0x87> >>> <7b> DW_AT_byte_size : 4 >>> <7c> DW_AT_bit_size : 20 >>> <7d> DW_AT_bit_offset : 12 >>> <7e> DW_AT_data_member_location: 4 >>> <2><7f>: Abbrev Number: 0 >>> <1><80>: Abbrev Number: 4 (DW_TAG_base_type) >>> <81> DW_AT_byte_size : 1 >>> <82> DW_AT_encoding : 6 (signed char) >>> <83> DW_AT_name : (indirect string, offset: 0xb3): char >>> <1><87>: Abbrev Number: 5 (DW_TAG_base_type) >>> <88> DW_AT_byte_size : 4 >>> <89> DW_AT_encoding : 5 (signed) >>> <8a> DW_AT_name : int >>> <1><8e>: Abbrev Number: 2 (DW_TAG_structure_type) >>> <8f> DW_AT_name : (indirect string, offset: 0x50): packed >>> <93> DW_AT_byte_size : 5 >>> <94> DW_AT_decl_file : 1 >>> <95> DW_AT_decl_line : 9 >>> <96> DW_AT_sibling : <0xe1> >>> <2><9a>: Abbrev Number: 3 (DW_TAG_member) >>> <9b> DW_AT_name : x1 >>> <9e> DW_AT_decl_file : 1 >>> <9f> DW_AT_decl_line : 10 >>> <a0> DW_AT_type : <0x80> >>> <a4> DW_AT_byte_size : 1 >>> <a5> DW_AT_bit_size : 1 >>> <a6> DW_AT_bit_offset : 7 >>> <a7> DW_AT_data_member_location: 0 >>> <2><a8>: Abbrev Number: 3 (DW_TAG_member) >>> <a9> DW_AT_name : x2 >>> <ac> DW_AT_decl_file : 1 >>> <ad> DW_AT_decl_line : 11 >>> <ae> DW_AT_type : <0x80> >>> <b2> DW_AT_byte_size : 1 >>> <b3> DW_AT_bit_size : 3 >>> <b4> DW_AT_bit_offset : 4 >>> <b5> DW_AT_data_member_location: 0 >>> <2><b6>: Abbrev Number: 3 (DW_TAG_member) >>> <b7> DW_AT_name : x3 >>> <ba> DW_AT_decl_file : 1 >>> <bb> DW_AT_decl_line : 12 >>> <bc> DW_AT_type : <0x80> >>> <c0> DW_AT_byte_size : 1 >>> <c1> DW_AT_bit_size : 3 >>> <c2> DW_AT_bit_offset : 1 >>> <c3> DW_AT_data_member_location: 0 >>> <2><c4>: Abbrev Number: 3 (DW_TAG_member) >>> <c5> DW_AT_name : y1 >>> <c8> DW_AT_decl_file : 1 >>> <c9> DW_AT_decl_line : 13 >>> <ca> DW_AT_type : <0x87> >>> <ce> DW_AT_byte_size : 4 >>> <cf> DW_AT_bit_size : 7 >>> <d0> DW_AT_bit_offset : 18 >>> <d1> DW_AT_data_member_location: 0 >>> <2><d2>: Abbrev Number: 6 (DW_TAG_member) >>> <d3> DW_AT_name : y2 >>> <d6> DW_AT_decl_file : 1 >>> <d7> DW_AT_decl_line : 14 >>> <d8> DW_AT_type : <0x87> >>> <dc> DW_AT_byte_size : 4 >>> <dd> DW_AT_bit_size : 20 >>> <de> DW_AT_bit_offset : -2 >>> <df> DW_AT_data_member_location: 0 >>> <2><e0>: Abbrev Number: 0 >>> >>> ... >>> >