Re: [PATCH bpf-next] bpftool: introduce btf c dump sorting

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 07/05/2024 22:02, Andrii Nakryiko wrote:
> On Mon, May 6, 2024 at 6:45 AM Mykyta Yatsenko
> <mykyta.yatsenko5@xxxxxxxxx> wrote:
>>
>> From: Mykyta Yatsenko <yatsenko@xxxxxxxx>
>>
>> Provide a way to sort bpftool c dump output, to simplify vmlinux.h
>> diffing and forcing more natural definitions ordering.
>>
>> Use `normalized` argument in bpftool CLI after `format c` for example:
>> ```
>> bpftool btf dump file /sys/kernel/btf/fuse format c normalized
>> ```
>>
>> Definitions are sorted by their BTF kind ranks, lexicographically and
>> typedefs are forced to go right after their base type.
>>
>> Type ranks
>>
>> Assign ranks to btf kinds (defined in function btf_type_rank) to set
>> next order:
>> 1. Anonymous enums
>> 2. Anonymous enums64
>> 3. Named enums
>> 4. Named enums64
>> 5. Trivial types typedefs (ints, then floats)
>> 6. Structs
>> 7. Unions
>> 8. Function prototypes
>> 9. Forward declarations
>>
>> Lexicographical ordering
>>
>> Definitions within the same BTF kind are ordered by their names.
>> Anonymous enums are ordered by their first element.
>>
>> Forcing typedefs to go right after their base type
>>
>> To make sure that typedefs are emitted right after their base type,
>> we build a list of type's typedefs (struct typedef_ref) and after
>> emitting type, its typedefs are emitted as well (lexicographically)
>>
>> There is a small flaw in this implementation:
>> Type dependencies are resolved by bpf lib, so when type is dumped
>> because it is a dependency, its typedefs are not output right after it,
>> as bpflib does not have the list of typedefs for a given type.
>>
>> Signed-off-by: Mykyta Yatsenko <yatsenko@xxxxxxxx>
>> ---
>>  tools/bpf/bpftool/btf.c | 264 +++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 259 insertions(+), 5 deletions(-)
>>
> 
> I applied this locally to experiment. Generated vmlinux.h for the
> production (a bit older) kernel and then for latest bpf-next/master
> kernel. And then tried diff between normalized vmlinux.h dumps and
> non-normalized.
> 
> It took a bit for the diff tool to generate, but I think diff for
> normalized vmlinux.h is actually very usable. You can see an example
> at [1]. It shows whole new types being added in front of existing
> ones. And for existing ones it shows only parts that actually changed.
> It's quite nice. And note that I used a relatively stale production
> kernel vs latest upstream bpf-next, *AND* with different (bigger)
> Kconfig. So for more incremental changes in kernel config/version the
> diff should be much slower.
> 
> I think this idea of normalizing vmlinux.h works and is useful.
> 
> Eduard, Quentin, please take a look when you get a chance.
> 
> My high-level feedback. I like the idea and it seems to work well in
> practice. I do think, though, that the current implementation is a bit
> over-engineered. I'd drop all the complexity with TYPEDEF and try to
> get almost the same behavior with a slightly different ranking
> strategy.
> 
> Tracking which types are emitted seems unnecessary btf_dumper is doing
> that already internally. So I think overall flow could be basically
> three steps:
> 
>   - precalculate/cache "sort names" and ranks;
>   - sort based on those two, construct 0-based list of types to emit
>   - just go linearly over that sorted list, call btf_dump__dump_type()
> on each one with original type ID; if the type was already emitted or
> is not the type that's emitted as an independent type (e.g.,
> FUNC_PROTO), btf_dump__dump_type() should do the right thing (do
> nothing).
> 
> Any flaws in the above proposal?
> 
>   [1] https://gist.github.com/anakryiko/cca678c8f77833d9eb99ffc102612e28

Hi, thanks for the patch - and thanks Andrii for the Cc. I didn't have
time to look at the code yet (will do), but the idea looks great.

My main question would be, how much overhead does the sorting add to the
BTF dump, and if this overhead is low, is it even worth having a
dedicated command-line keyword to trigger the sorting, or should we just
make it the default behaviour for the C-formatted dump? (Or is there any
advantage in dumping with the current, unsorted order?)

Quentin




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux