On Thu, Mar 14, 2019 at 12:44 PM Arnaldo Carvalho de Melo <arnaldo.melo@xxxxxxxxx> wrote: > > Em Thu, Mar 14, 2019 at 11:53:54AM -0700, Andrii Nakryiko escreveu: > > On Thu, Mar 14, 2019 at 10:45 AM Arnaldo Carvalho de Melo > > <arnaldo.melo@xxxxxxxxx> wrote: > > > > > > Em Wed, Mar 13, 2019 at 11:07:36PM -0700, Andrii Nakryiko escreveu: > > > > Bitfield offsets can be negative, if field "borrows" few bits from > > > > following aligned field. This causes a bunch of surprises down the line > > > > in pahole's logic (e.g., for hole calculation logic), so instead of > > > > waiting till printf routines adjust this for display, adjust them early > > > > and keep less surprising semantics. > > > > > > > +++ b/btf_loader.c > > > > @@ -482,6 +482,12 @@ static int class__fixup_btf_bitfields(struct tag *tag, struct cu *cu, struct btf > > > > pos->bitfield_offset = pos->bit_offset - pos->byte_offset * 8; > > > > if (!btfe->is_big_endian) > > > > pos->bitfield_offset = pos->bit_size - pos->bitfield_offset - pos->bitfield_size; > > > > + /* re-adjust bitfield offset if it is negative */ > > > > + if (pos->bitfield_offset < 0) { > > > > + pos->bitfield_offset += pos->bit_size; > > > > + pos->byte_offset += pos->byte_size; > > > > + pos->bit_offset = pos->byte_offset * 8 + pos->bitfield_offset; > > > > + } > > > > } else { > > > > pos->byte_offset = pos->bit_offset / 8; > > > > } > > > > > > While testing this one I noticed this output: > > > > > > [acme@quaco pahole]$ pahole -F btf -C ieee80211_tx_rate examples/vmlinux-aarch64 > > > struct ieee80211_tx_rate { > > > s8 idx; /* 0 1 */ > > > > > > /* Bitfield combined with previous fields */ > > > > > > u16 count:5; /* 0: 8 2 */ > > > u16 flags:11; /* 0:13 2 */ > > > > These offsets are wrong (pahole always prints them in big-endian > > notation, right?) > > > > It should be: > > > > u16 count:5; /* 0: 0 2 */ > > u16 flags:11; /* 2: 8 2 */ > > > > But now I'm super confused about pahole's and DWARF's assumption of > > big-endianness... (see below) > > > > > > > > /* size: 3, cachelines: 1, members: 3 */ > > > /* padding: 1 */ > > > /* last cacheline: 3 bytes */ > > > }; > > > [acme@quaco pahole]$ > > > > > > The message is confusing, as count:5 is not being combined with previous > > > fields, i.e. the previous field has 8 bits and all those bits are being > > > used by 'idx'. > > > > I think technically count is combined with idx, see DWARF analysis below. > > > > We should probably check what assembly is generated to know for sure > > if it's combined into aligned first 2 bytes, or non-aligned 2 bytes at > > offset 1. > > > > > > > > The original struct definition in include/net/mac80211.h: > > > > > > struct ieee80211_tx_rate { > > > s8 idx; > > > u16 count:5, > > > flags:11; > > > } __packed; > > > > > > So we could infer that this is a packed struct, because the next field > > > doesn't come u16 aligned, not that it is a bitfield combined with > > > previous fields. > > > > > > The DWARF info: > > > <1><27639>: Abbrev Number: 16 (DW_TAG_structure_type) > > > <2763a> DW_AT_name : (indirect string, offset: 0x19d5a): ieee80211_tx_rate > > > <2763e> DW_AT_byte_size : 3 > > > <27643> DW_AT_sibling : <0x27678> > > > <2><27647>: Abbrev Number: 19 (DW_TAG_member) > > > <27648> DW_AT_name : idx > > > <27650> DW_AT_type : <0x11d> > > > <27654> DW_AT_data_member_location: 0 > > > <2><27655>: Abbrev Number: 30 (DW_TAG_member) > > > <27656> DW_AT_name : (indirect string, offset: 0x1789d): count > > > <2765e> DW_AT_type : <0x144> > > > <27662> DW_AT_byte_size : 2 > > > <27663> DW_AT_bit_size : 5 > > > <27664> DW_AT_bit_offset : 8 > > > <27665> DW_AT_data_member_location: 0 > > > <2><27666>: Abbrev Number: 30 (DW_TAG_member) > > > <27667> DW_AT_name : (indirect string, offset: 0xc978): flags > > > <2766f> DW_AT_type : <0x144> > > > <27673> DW_AT_byte_size : 2 > > > <27674> DW_AT_bit_size : 11 > > > <27675> DW_AT_bit_offset : 5 > > > <27676> DW_AT_data_member_location: 1 > > > > > > So this entire DWARF makes sense only if all the bit_offsets are > > little-endian (and I somehow got impression that DWARF is always > > specified in big-endian notation, even for little-endian targets... > > weird). > > So: > > The vmlinux being used is the aarch64 one, that is big endian. > > [acme@quaco perf]$ file ../pahole/examples/vmlinux-aarch64 > ../pahole/examples/vmlinux-aarch64: ELF 64-bit MSB pie executable, ARM aarch64, version 1 (SYSV), statically linked, BuildID[sha1]=7fcff18ec214eee2f8c6b288fca21dc6d9771ad6, with debug_info, not stripped > [acme@quaco perf]$ file ../build/v5.0+/vmlinux > ../build/v5.0+/vmlinux: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=a64f366e91a0342c858bac5442dccd8a29601d9c, with debug_info, not stripped > [acme@quaco perf]$ > > But, in http://dwarfstd.org/doc/Dwarf3.pdf, page 75, we have: > > <quote> > > If the data member entry describes a bit field, then that entry has the > following attributes: > > - A DW_AT_byte_size attribute whose value (see Section 2.19) is the > number of bytes that contain an instance of the bit field and any > padding bits. The byte size attribute may be omitted if the size of > the object containing the bit field can be inferred from the type > attribute of the data member containing the bit field. > > - A DW_AT_bit_offset attribute whose value (see Section 2.19) is the > number of bits to the left of the leftmost (most significant) bit of > the bit field value. > > - A DW_AT_bit_size attribute whose value (see Section 2.19) is the > number of bits occupied by the bit field value. The location > description for a bit field calculates the address of an anonym ous > object containing the bit field. The address is relati ve to the > structure, union, or cla ss that most closely encloses the bit field > declaration. The number of bytes in this anonymous object is the value > of the byte size attribute of the bit field. The offset (in bits) fr > om the most significant bit of the anonymous object to the most > significant bit of the bit field is the value of the bit offset > attribute. > > And following it there is an example with some tables, I'll read this > more thorougly later. Thanks! I'll meditate on that as well later today :) > > > 1. idx is single byte, at offset 0 (DW_AT_data_member_location: 0) > > 2. count is part of 16-bit bitfield (DW_AT_byte_size : 2), located > > at offset 0 (DW_AT_data_member_location: 0, so bitfields appear to > > always be aligned here), at little-endian offset 8 (so is part of > > second byte), takes next 5 bits (bits 8-12 in little-endian). > > 3. flags is again part of 16-bit bitfield, with base byte offset 1 > > (DW_AT_data_member_location: 1). It has bit_offset: 5 and bit size 11, > > so takes bits 5-15 starting at byte 1, or in absolute bit > > "coordinates" within struct: 13-23. So technically, this field is part > > of another bitfield, which overlaps with the one that has count. > > > > We can represent this equivalently as bitfield starting at offset 2, > > but borrowing 3 bits from previous byte. Which will preserve > > alignedness of bitfields. That was my intent, but I was always > > interpreting DWARF values as big-endian offsets. > > > > Seems like we actually need to detect big/little endiannes for DWARF, > > similar to BTF, what do you think? > > I guess so :-) > > > > So the message has to take into account if the previous field has a > > > bitfield_size != 0, no? > > > > I honestly didn't have time to go and check the logic of detection of > > merging with prev/next bitfield, but I suspect it still has bugs, > > because it derives it from field sizes, not they bit positions. So > > there are probably some more examples that have confusing hints. > > > > > > > > I have to look at the other csets, perhaps you do this there? > > > > > > > No, I think this problem persists. I changed dwarf_loader to align > > bitfields naturally, which btf_loader does, but it might be sometimes > > misleading w.r.t. these "combined with previous/next fields". With > > DWARF we have DW_AT_data_member_location, which provides more > > information than BTF's field bit offset. > > > > > I'm just trying to test it patch by patch to see what changes after each > > > one. > > > > > > - Arnaldo > > -- > > - Arnaldo