On Thu, Feb 3, 2022 at 1:17 PM Mauricio Vásquez Bernal <mauricio@xxxxxxxxxx> wrote: > > On Wed, Feb 2, 2022 at 2:50 PM Andrii Nakryiko > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > On Fri, Jan 28, 2022 at 3:23 PM Mauricio Vásquez Bernal > > <mauricio@xxxxxxxxxx> wrote: > > > > > > On Fri, Jan 28, 2022 at 5:33 PM Mauricio Vásquez <mauricio@xxxxxxxxxx> wrote: > > > > > > > > This commit implements some integration tests for "BTFGen". The goal > > > > of such tests is to verify that the generated BTF file contains the > > > > expected types. > > > > > > > > > > This is not an exhaustive list of test cases. I'm not sure if this is > > > the approach we should follow to implement such tests, it seems to me > > > that checking each generated BTF file by hand is a lot of work but I > > > don't have other ideas to simplify it. > > > > > > I considered different options to write these tests: > > > 1. Use core_reloc_types.h to create a "source" BTF file with a lot of > > > types, then run BTFGen for all test_core_reloc_*.o files and use the > > > generated BTF file as btf_src_file in core_reloc.c. In other words, > > > re-run all test_core_reloc tests using a generated BTF file as source > > > instead of the "btf__core_reloc_" #name ".o" one. I think this test is > > > great because it tests the full functionality and actually checks that > > > the programs are able to run using the generated file. The problem is > > > how do we test that the BTFGen is creating an optimized file? Just > > > copying the source file without any modification will make all those > > > tests pass. We could check that the generated file is small (by > > > checking the size or the number of types) but it doesn't seem a very > > > reliable approach to me. > > > > I think this second run after minimizing BTF is a good idea. I > > wouldn't bother to check for "minimal BTF" for this case. > > > > Right. Do you want this to be part of this series or can we merge later on? Can be done separately, probably, but it shouldn't be a lot of code (just call bpftool and use its output as input to core_reloc tests again). Should be a generic part of core_reloc.c selftest. > > > > 2. We could write some .c files with the types we expect to have on > > > the generated file and compare it with the generated file. The issue > > > here is that comparing those BTF files doesn't seem to be too > > > trivial... > > > > But I would add few realistic examples that use various combinations > > of CO-RE relocations against Linux types. Then minimize BTF and > > validate that BTF is what we expect. > > > > What do you mean by "realistic examples"? Aren't the BPF programs > (that use core_reloc_types.h) I added in this commit good enough for > this test? Realistic as in using real kernel types. core_reloc_types.h are synthetic. Both are useful, but seeing big task_struct compressed to just few relevant fields and CO-RE relocations still working would be very convincing. > > > As for how to compare BTFs. I've been wanting to do something like > > btf__normalize() API to renumber and resort all the BTF types into > > some "canonical" order, so that two BTFs can be actually compared and > > diffed. It might be finally the time to do that. > > > > The big complication is your decision to dump all the fields of types > > that are used by type-based relocations. I'm not convinced that's the > > best way to do this. I'd keep empty struct/union for such cases, > > actually. That would minimize the number of types and thus BTF in > > general. It also will simplify the logic of emitting minimized BTF a > > bit (all_types checks won't be necessary, I think). > > > > As I also mentioned in previous patches, for types that are only > > referenced through pointer, I'd emit FWD declaration only. Or at best > > empty struct/union. > > > > With all that, after btf__minimize() operation, comparing BTFs would > > be actually pretty easy, because we'll know the order of each type, > > Why do we know the order of each type? I think the order of the types > in the generated BTF files depends on: > 1. The order or the relocations on the BPF object. (I'm not sure if > the compiler generates them in the same order as they appear in the > code) > 2. BTFGen implementation: types are added recursively and there is > also a hashmap in between. > 3. How bpftool is invoked. bpftool gen min_core_btf ... OBJ1 OBJ2 vs > bpftool gen min_core_btf ... OBJ2 OBJ1. > Hm.. the order of CO-RE relocations doesn't have any bearing on resulting BTF. btf__normalize() will make sure that order is "canonical", which makes two BTFs comparable. E.g., all INTs will go before STRUCTS, and within INTs they will be sorted by name, then size, etc. That kind of an idea. So equivalent BTFs after normalization will end up with exactly the same contents (except, probably, for the string section, but we won't be comparing name_off as integers, rather strings that they point to; details). > What I'm saying is that given a source BTF file and a BPF object I > don't know what is the order of the output BTF file. I know we could > run the test, check the generated output and use it for the test but > it seems like "cheating" to me... > > Am I missing something? Probably that the whole point of btf__normalize() is to ensure canonical order, see above. > > > so > > using the > > VALIDATE_RAW_BTF() (see prog_tests/btf_dedup_split.c) the tests will > > be easy and clean. > > > > > > One last thing, let's not add a new test binary (test_bpftool), let's > > keep adding more tests into test_progs. > > > > Will do. > > > > > > > Do you have any suggestions about it? Thanks!