On Thu, Feb 28, 2019 at 10:19 AM Yonghong Song <yhs@xxxxxx> wrote:
>
> On 2/27/19 2:46 PM, Andrii Nakryiko wrote:
> > When checking available canonical candidates for struct/union algorithm
> > utilizes btf_dedup_is_equiv to determine if candidate is suitable. This
> > check is not enough when candidate is corresponding FWD for that
> > struct/union, because according to equivalence logic they are
> > equivalent. When it so happens that FWD and STRUCT/UNION end in hashing
> > to the same bucket, it's possible to create remapping loop from FWD to
> > STRUCT and STRUCT to same FWD, which will cause btf_dedup() to loop
> > forever.
> >
> > This patch fixes the issue by additionally checking that type and
> > canonical candidate are strictly equal (utilizing btf_equal_struct).
>
> It looks like btf_equal_struct() checking equality except
> member type id's. Maybe calling it btf_almost_equal_struct() or
> something like that?

Yes, for struct/union we can't compare referenced member types directly;
that's what btf_dedup_is_equiv is for. I think btf_equal_struct with a
comment explaining this particular behavior is good enough. If you
insist, though, I'd rather go with btf_shallow_equal_struct or something
along those lines.
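
To make the "shallow" part concrete, here is a rough standalone sketch of
that kind of comparison. This is not the libbpf code: the toy_* types and
field names are simplified stand-ins I made up for illustration. The
point is only that kind, name, size, member count, member names and
member offsets are compared, while referenced member type IDs are left
to the separate graph equivalence check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for BTF data (not libbpf's layout). */
enum toy_kind { TOY_FWD, TOY_STRUCT, TOY_UNION };

struct toy_member {
	uint32_t name_off;	/* member name, as a string table offset */
	uint32_t type_id;	/* referenced type ID, deliberately NOT compared */
	uint32_t offset;	/* bit offset of the member */
};

struct toy_type {
	enum toy_kind kind;
	uint32_t name_off;	/* type name, as a string table offset */
	uint32_t size;		/* size in bytes */
	uint16_t vlen;		/* number of members */
	const struct toy_member *members;
};

/*
 * Shallow equality: everything about the two types must match except
 * the type IDs their members point to.
 */
static bool toy_shallow_equal_struct(const struct toy_type *t1,
				     const struct toy_type *t2)
{
	uint16_t i;

	if (t1->kind != t2->kind || t1->name_off != t2->name_off ||
	    t1->size != t2->size || t1->vlen != t2->vlen)
		return false;

	for (i = 0; i < t1->vlen; i++) {
		if (t1->members[i].name_off != t2->members[i].name_off ||
		    t1->members[i].offset != t2->members[i].offset)
			return false;
	}
	return true;
}
```

Under this sketch, two structs that differ only in member type IDs
compare equal, while a FWD never compares equal to a STRUCT of the same
name because the kinds differ.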

>
> >
> > Fixes: d5caef5b5655 ("btf: add BTF types deduplication algorithm")
> > Reported-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> > Signed-off-by: Andrii Nakryiko <andriin@xxxxxx>
> > ---
> >  tools/lib/bpf/btf.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > index 6bbb710216e6..53db26d158c9 100644
> > --- a/tools/lib/bpf/btf.c
> > +++ b/tools/lib/bpf/btf.c
> > @@ -2255,7 +2255,7 @@ static void btf_dedup_merge_hypot_map(struct btf_dedup *d)
> >  static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
> >  {
> >  	struct btf_dedup_node *cand_node;
> > -	struct btf_type *t;
> > +	struct btf_type *cand_type, *t;
> >  	/* if we don't find equivalent type, then we are canonical */
> >  	__u32 new_id = type_id;
> >  	__u16 kind;
> > @@ -2275,6 +2275,10 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
> >  	for_each_dedup_cand(d, h, cand_node) {
> >  		int eq;
> >
> > +		cand_type = d->btf->types[cand_node->type_id];
> > +		if (!btf_equal_struct(t, cand_type))
>
> The comment for this btf_equal_struct is not quite right.
> /*
>  * Check structural compatibility of two FUNC_PROTOs, ignoring referenced type
>  * IDs. This check is performed during type graph equivalence check and
>  * referenced types equivalence is checked separately.
>  */
> static bool btf_equal_struct(struct btf_type *t1, struct btf_type *t2)
>
> It should be two "struct/union types".

Yep, good catch, will fix!

> > +			continue;
> > +
>
> I did not trace the algorithm how infinite loop happens. But the above

Check the test in the follow-up patch. It has a minimal example that
triggers this bug. It happens when we have some FWD x, which we discover
should be resolved to some STRUCT x (as a result of an equivalence
check/resolution of some other struct s that references struct x
internally).
But that struct x might not have been deduplicated yet; we just record
this FWD -> STRUCT mapping so that we don't lose this connection. Later,
once we get to deduplication of struct x, FWD x will be (in case of hash
collision) one possible candidate to consider for deduplication. At that
point, btf_dedup_is_equiv will consider them equivalent (but they are
not equal (!), that's where the bug is), so we'll try to resolve
STRUCT x -> FWD x, which creates a loop.

In btf_dedup_merge_hypot_map(), which is used to record "equivalences"
discovered during the struct/union type graph equivalence check, we have
an explicit check to never resolve STRUCT/UNION into an equivalent FWD,
so such a loop shouldn't happen, except I missed the case of having FWD
as a possible dedup candidate due to hash collision.

> change is certainly a correct one, you want to do deduplication only
> after everything else (except member types) are equal?

Well, if not for the special case of FWD == STRUCT/UNION when
deduplicating structs, btf_dedup_is_equiv would be enough, because it
already checks btf_equal_struct internally when both types are
struct/union. It's just the special bit at the beginning of the is_equiv
check, which allows FWD and STRUCT/UNION with the same name to be
declared equivalent, that throws this off.

>
> If the bug is due to circle in struct->fwd and fwd->struct mappings,
> maybe a simple check whether such circle exists or not before update
> the mapping will also work? I am not proposing this fix, but want
> to understand better the issue.

That's essentially what we use btf_equal_struct for here, really. We
could equivalently just check BTF_INFO_KIND(t) == BTF_INFO_KIND(cand)
explicitly, but btf_equal_struct feels a bit more generic and obviously
correct.

> >
> >
> >  		btf_dedup_clear_hypot_map(d);
> >  		eq = btf_dedup_is_equiv(d, type_id, cand_node->type_id);
> >  		if (eq < 0)
> >
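
For illustration, the FWD -> STRUCT -> FWD remapping loop described
above can be reproduced with a toy model. Again, this is only a sketch
and not the libbpf implementation: the sim_* names, the plain map[]
array, and the "strict" flag are assumptions made up for this example.
map[id] holds the ID a type is remapped to, and a type is canonical when
it maps to itself. The resolver is bounded so that a cycle is reported
instead of spinning forever.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type kinds for the toy model. */
enum sim_kind { SIM_FWD, SIM_STRUCT };

/*
 * Follow map[] from id until a self-mapped (canonical) ID is found.
 * With n types, a valid chain needs fewer than n hops, so running out
 * of hops means the mapping contains a cycle. All IDs must be < n.
 */
static bool sim_resolve(const uint32_t *map, uint32_t n, uint32_t id,
			uint32_t *out)
{
	uint32_t hops;

	for (hops = 0; hops <= n; hops++) {
		if (map[id] == id) {
			*out = id;
			return true;
		}
		id = map[id];
	}
	return false;	/* cycle detected */
}

/*
 * Try to remap type_id onto candidate cand_id. In strict mode a
 * candidate of a different kind is skipped, which is a simplified
 * stand-in for the extra equality check on the candidate.
 */
static bool sim_try_remap(uint32_t *map, const enum sim_kind *kinds,
			  uint32_t type_id, uint32_t cand_id, bool strict)
{
	if (strict && kinds[type_id] != kinds[cand_id])
		return false;
	map[type_id] = cand_id;
	return true;
}
```

With FWD x already mapped to STRUCT x, a non-strict remap of STRUCT x
onto FWD x makes sim_resolve() report a cycle for either ID; in strict
mode the FWD candidate is skipped and STRUCT x stays canonical.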