On Fri, Feb 22, 2019 at 2:02 PM Arnaldo Carvalho de Melo <arnaldo.melo@xxxxxxxxx> wrote: > > Em Wed, Feb 20, 2019 at 12:57:29PM -0800, Andrii Nakryiko escreveu: > > This patchset fixes the logic of determining bitfield_offsets and > > initial bit_offset when using BTF type information. It eliminates all > > the remaining discrepancies, when doing btfdiff on vmlinux image, module > > two instances of incorrectly reporting struct member type name when > > bitfield is the very first field in a struct, which is only happening > > when using DWARF data. > > > > Patch #1 makes btfdiff script easier to use during local development. > > > > Patch #2 fixes list of known base type names to handle clang-generated > > type descriptions better. > > > > Patch #3 fixes bitfield handling logic in btf_loader. > > Thanks a lot! Applied. > > And here I see no differences in my vmlinux: > <SNIP> > > Which ones where these: "module two instances of incorrectly reporting struct > member type name when bitfield is the very first field in a struct, which is > only happening when using DWARF data." ? $ ./btfdiff /tmp/vmlinux4 --- /tmp/btfdiff.dwarf.GIVfpr 2019-02-20 12:18:29.138788970 -0800 +++ /tmp/btfdiff.btf.c3x2KY 2019-02-20 12:18:29.351786365 -0800 @@ -16884,7 +16884,7 @@ struct pebs_record_nhm { }; union hsw_tsx_tuning { struct { - unsigned int cycles_last_block:32; /* 0: 0 4 */ + u32 cycles_last_block:32; /* 0: 0 4 */ u32 hle_abort:1; /* 4:31 4 */ u32 rtm_abort:1; /* 4:30 4 */ u32 instruction_abort:1; /* 4:29 4 */ @@ -26154,7 +26154,7 @@ struct acpi_device_power { /* last cacheline: 40 bytes */ }; struct acpi_device_perf_flags { - unsigned char reserved:8; /* 0: 0 1 */ + u8 reserved:8; /* 0: 0 1 */ /* size: 1, cachelines: 1, members: 1 */ /* last cacheline: 1 bytes */ For hsw_tsx_tuning, it is defined in sources like this: $ cat hsw_tsx_tuning.c typedef unsigned long long u64; typedef unsigned int u32; union hsw_tsx_tuning { struct { u32 cycles_last_block : 32, hle_abort : 1, rtm_abort : 1, instruction_abort : 1, non_instruction_abort : 1, retry : 1, data_conflict : 1, capacity_writes : 1, capacity_reads : 1; }; u64 value; }; int main() { union hsw_tsx_tuning t; return 0; } Building same defition with gcc and clang reveals some more info. $ cc -g -c hsw_tsx_tuning.c -o hsw_tsx_tuning-gcc && ~/local/pahole/build/pahole -J hsw_tsx_tuning-gcc $ clang -g -c hsw_tsx_tuning.c -o hsw_tsx_tuning-clang && ~/local/pahole/build/pahole -J hsw_tsx_tuning-clang GCC-generated DWARF/BTF exposes this bug: $ PAHOLE=~/local/pahole/build/pahole ~/local/pahole/btfdiff hsw_tsx_tuning-gcc --- /tmp/btfdiff.dwarf.khRiFg 2019-02-22 15:18:46.768858141 -0800 +++ /tmp/btfdiff.btf.jqdEsR 2019-02-22 15:18:46.770858140 -0800 @@ -1,6 +1,6 @@ union hsw_tsx_tuning { struct { - unsigned int cycles_last_block:32; /* 0: 0 4 */ + u32 cycles_last_block:32; /* 0: 0 4 */ u32 hle_abort:1; /* 4:31 4 */ u32 rtm_abort:1; /* 4:30 4 */ u32 instruction_abort:1; /* 4:29 4 */ But the one generated by clang doesn't: $ PAHOLE=~/local/pahole/build/pahole ~/local/pahole/btfdiff hsw_tsx_tuning-clang The only difference is that GCC correctly marks cycles_last_block as bitfield, while clang optimizes it to stand-alone u32. $ ~/local/pahole/build/pahole -F btf hsw_tsx_tuning-gcc union hsw_tsx_tuning { struct { u32 cycles_last_block:32; /* 0: 0 4 */ u32 hle_abort:1; /* 4:31 4 */ u32 rtm_abort:1; /* 4:30 4 */ u32 instruction_abort:1; /* 4:29 4 */ u32 non_instruction_abort:1; /* 4:28 4 */ u32 retry:1; /* 4:27 4 */ u32 data_conflict:1; /* 4:26 4 */ u32 capacity_writes:1; /* 4:25 4 */ u32 capacity_reads:1; /* 4:24 4 */ }; /* 0 8 */ u64 value; /* 0 8 */ }; $ ~/local/pahole/build/pahole -F btf hsw_tsx_tuning-clang union hsw_tsx_tuning { struct { u32 cycles_last_block; /* 0 4 */ u32 hle_abort:1; /* 4:31 4 */ u32 rtm_abort:1; /* 4:30 4 */ u32 instruction_abort:1; /* 4:29 4 */ u32 non_instruction_abort:1; /* 4:28 4 */ u32 retry:1; /* 4:27 4 */ u32 data_conflict:1; /* 4:26 4 */ u32 capacity_writes:1; /* 4:25 4 */ u32 capacity_reads:1; /* 4:24 4 */ }; /* 0 8 */ u64 value; /* 0 8 */ }; I've spent a bit of time debugging this, but didn't get far, as I'm pretty unfamiliar with overall flow of decoders, I was hoping this can give you some idea where to look, though. > > - Arnaldo