On Thu, Mar 14, 2019 at 2:01 PM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
>
> On Thu, Mar 14, 2019 at 1:22 PM Mark Wielaard <mark@xxxxxxxxx> wrote:
> >
> > On Thu, 2019-03-14 at 12:56 -0700, Andrii Nakryiko wrote:
> > > On Thu, Mar 14, 2019 at 12:44 PM Arnaldo Carvalho de Melo
> > > <arnaldo.melo@xxxxxxxxx> wrote:
> > > >
> > > > But, in http://dwarfstd.org/doc/Dwarf3.pdf, page 75, we have:
> > > >
> > > > <quote>
> > > >
> > > > If the data member entry describes a bit field, then that entry has the
> > > > following attributes:
> > > >
> > > > - A DW_AT_byte_size attribute whose value (see Section 2.19) is the
> > > > number of bytes that contain an instance of the bit field and any
> > > > padding bits. The byte size attribute may be omitted if the size of
> > > > the object containing the bit field can be inferred from the type
> > > > attribute of the data member containing the bit field.
> > > >
> > > > - A DW_AT_bit_offset attribute whose value (see Section 2.19) is the
> > > > number of bits to the left of the leftmost (most significant) bit of
> > > > the bit field value.
> > > >
> > > > - A DW_AT_bit_size attribute whose value (see Section 2.19) is the
> > > > number of bits occupied by the bit field value. The location
> > > > description for a bit field calculates the address of an anonymous
> > > > object containing the bit field. The address is relative to the
> > > > structure, union, or class that most closely encloses the bit field
> > > > declaration. The number of bytes in this anonymous object is the value
> > > > of the byte size attribute of the bit field. The offset (in bits)
> > > > from the most significant bit of the anonymous object to the most
> > > > significant bit of the bit field is the value of the bit offset
> > > > attribute.
> > > >
> > > > And following it there is an example with some tables, I'll read this
> > > > more thoroughly later.
> > > >
> > >
> > > Thanks!
> > > I'll meditate on that as well later today :)
> >
> > I haven't meditated on it yet, but would warn about using the now
> > ancient DWARF3 spec for this. See in particular the following DWARF
> > issue "Packed unaligned bit fields" resolved for DWARF4:
> > http://dwarfstd.org/ShowIssue.php?issue=081130.1
> >
> > You might even just want to see what DWARF5 says about it:
> > http://dwarfstd.org/doc/DWARF5.pdf
>
> Thanks, Mark! Newer standard is indeed a bit clearer:
>
> <quote>
>
> This Standard uses the following bit numbering and direction
> conventions in examples. These conventions are for illustrative
> purposes and other conventions may apply on particular architectures.
>
> - For big-endian architectures, bit offsets are counted from
>   high-order to low-order bits within a byte (or larger storage unit);
>   in this case, the bit offset identifies the high-order bit of the
>   object.
> - For little-endian architectures, bit offsets are counted from
>   low-order to high-order bits within a byte (or larger storage unit);
>   in this case, the bit offset identifies the low-order bit of the
>   object.
>
> In either case, the bit so identified is defined as the beginning of
> the object.
>
> </quote>
>
> Will go over all those calculations again today-tomorrow, while I have
> all the context from yesterday's debugging session still fresh in my
> head. There is a lot to meditate about :)

The DWARF 4/5 standard is pretty clear about this example:

struct S {
    int j:5;
    int k:6;
    int m:5;
    int n:8;
};

According to the DWARF standard (p90 for DWARF4), both little-endian and
big-endian archs should have the following bit offsets:

j:0
k:5
m:11
n:16

In practice, for a big-endian aarch64 binary emitted by gcc, the offsets
are as expected (j:0, k:5, m:11, n:16).

For little-endian x86_64, both clang and gcc emit the following bit
offsets:

j:27
k:21
m:16
n:8

The same is emitted by gcc for a little-endian aarch64 target.

So for little-endian the emitted value is
8 * sizeof(base type) - <real bit offset> - <bit size>,
e.g. 32 - 0 - 5 = 27 for j and 32 - 16 - 8 = 8 for n.
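To make the little-endian arithmetic concrete, here is a minimal sketch of the conversion, with the base type size taken in bits (8 * byte size). The helper name le_real_bit_offset and its parameters are made up for illustration; this is not pahole or libdw code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a pahole or libdw API): turn the bit offset
 * emitted for a little-endian target (counted from the most significant
 * bit of the storage unit) into an offset counted from the least
 * significant bit.
 *
 * byte_size  - size of the base type in bytes
 * bit_offset - emitted bit offset of the field
 * bit_size   - width of the field in bits
 */
static unsigned int le_real_bit_offset(size_t byte_size,
                                       unsigned int bit_offset,
                                       unsigned int bit_size)
{
    unsigned int storage_bits = (unsigned int)(byte_size * 8);

    /* The field has to fit inside its storage unit. */
    assert(bit_offset + bit_size <= storage_bits);

    return storage_bits - bit_offset - bit_size;
}
```

For struct S with a 4-byte int, le_real_bit_offset(4, 27, 5) gives 0 for j and le_real_bit_offset(4, 8, 8) gives 16 for n. The formula is symmetric, so the same function also maps a real offset back to the emitted one.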
This means that pahole has to care about the endianness of DWARF and
correct the offsets accordingly.

I also compiled and disassembled this test program to check that j
actually takes the 5 lowest bits. And it does:

$ cat dwarf_test.c
struct S {
    int j : 5;
    int k : 6;
    int m : 5;
    int n : 8;
};

int main() {
    struct S s;
    s.j = 1;
    s.k = 2;
    s.m = 3;
    s.n = 4;
    return 0;
}

$ gcc -g dwarf_test.c -o dwarf_test.gcc
$ objdump -S dwarf_test.gcc

<snip>

int main() {
  4004b2:  55                    push   %rbp
  4004b3:  48 89 e5              mov    %rsp,%rbp
    struct S s;
    s.j = 1;
  4004b6:  0f b6 45 fc           movzbl -0x4(%rbp),%eax
  4004ba:  83 e0 e0              and    $0xffffffe0,%eax   <------ clear out 5 lowest bits
  4004bd:  83 c8 01              or     $0x1,%eax
  4004c0:  88 45 fc              mov    %al,-0x4(%rbp)
    s.k = 2;
  4004c3:  0f b7 45 fc           movzwl -0x4(%rbp),%eax
  4004c7:  66 25 1f f8           and    $0xf81f,%ax        <------ clear bits 5-10 (0xf81f = 1111 1000 0001 1111)
  4004cb:  83 c8 40              or     $0x40,%eax         <------ 0x40 = 2 << 5
  4004ce:  66 89 45 fc           mov    %ax,-0x4(%rbp)
    s.m = 3;
  4004d2:  0f b6 45 fd           movzbl -0x3(%rbp),%eax
  4004d6:  83 e0 07              and    $0x7,%eax
  4004d9:  83 c8 18              or     $0x18,%eax
  4004dc:  88 45 fd              mov    %al,-0x3(%rbp)
    s.n = 4;
  4004df:  c6 45 fe 04           movb   $0x4,-0x2(%rbp)
    return 0;
  4004e3:  b8 00 00 00 00        mov    $0x0,%eax
}
  4004e8:  5d                    pop    %rbp
  4004e9:  c3                    retq
  4004ea:  66 0f 1f 44 00 00     nopw   0x0(%rax,%rax,1)
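The same layout check can probably be done at run time instead of by reading the disassembly. A rough sketch, assuming a little-endian target and the gcc/clang bit-field layout seen above; raw_bits and extract are names invented here, and bit-field layout is implementation-defined, so this is not portable:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct S {
    int j : 5;
    int k : 6;
    int m : 5;
    int n : 8;
};

/* Copy out the 32-bit storage unit backing a struct S. */
static uint32_t raw_bits(const struct S *s)
{
    uint32_t raw;

    /* Assumption: all four fields share one 4-byte int. */
    assert(sizeof(*s) == sizeof(raw));
    memcpy(&raw, s, sizeof(raw));
    return raw;
}

/*
 * Extract `size` bits (size < 32) starting `offset` bits above the
 * least significant bit of `raw`.
 */
static uint32_t extract(uint32_t raw, unsigned int offset, unsigned int size)
{
    return (raw >> offset) & ((1u << size) - 1u);
}
```

After zeroing a struct S and setting j = 1, k = 2, m = 3, n = 4 as in dwarf_test.c, extract(raw_bits(&s), 0, 5) should give back j, extract(raw_bits(&s), 5, 6) should give back k, and so on with offsets 11 and 16, if the fields really sit at j:0, k:5, m:11, n:16 counted from the low-order bit.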