Hi Andrii,

On 4/20/22 10:07, Andrii Nakryiko wrote:
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 128028efda64..5a64cece09f3 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -723,10 +723,8 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
 	pl->link = NULL;
 
 	err = update_effective_progs(cgrp, atype);
-	if (err)
-		goto cleanup;
 
-	/* now can actually delete it from this cgroup list */
+	/* now can delete it from this cgroup list */
 	list_del(&pl->node);
 	kfree(pl);
 	if (list_empty(progs))
@@ -735,12 +733,55 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
 	if (old_prog)
 		bpf_prog_put(old_prog);
 	static_branch_dec(&cgroup_bpf_enabled_key[atype]);
-	return 0;
+
+	if (!err)
+		return 0;
 
 cleanup:
-	/* restore back prog or link */
-	pl->prog = old_prog;
-	pl->link = link;
+	/*
+	 * If compute_effective_progs failed with -ENOMEM, i.e. alloc for
+	 * cgrp->bpf.inactive table failed, we can recover by removing
+	 * the detached prog from effective table and rearranging it.
+	 */
+	if (err == -ENOMEM) {
+		struct bpf_prog_array_item *item;
+		struct bpf_prog *prog_tmp, *prog_detach, *prog_last;
+		struct bpf_prog_array *array;
+		int index = 0, index_detach = -1;
+
+		array = cgrp->bpf.effective[atype];
+		item = &array->items[0];
+
+		if (prog)
+			prog_detach = prog;
+		else
+			prog_detach = link->link.prog;
+
+		if (!prog_detach)
+			return -EINVAL;
+
+		while ((prog_tmp = READ_ONCE(item->prog))) {
+			if (prog_tmp == prog_detach)
+				index_detach = index;
+			item++;
+			index++;
+			prog_last = prog_tmp;
+		}
+
+		/* Check if we found what's needed for removing the prog */
+		if (index_detach == -1 || index_detach == index-1)
+			return -EINVAL;
+
+		/* Remove the last program in the array */
+		if (bpf_prog_array_delete_safe_at(array, index-1))
+			return -EINVAL;
+
+		/* and update the detached with the last just removed */
+		if (bpf_prog_array_update_at(array, index_detach, prog_last))
+			return -EINVAL;
+
+		err = 0;
+	}
Thanks for the feedback, and sorry for the delay. I got pulled into something else.
There are a bunch of problems with this implementation.

1. We should do this fallback right after update_effective_progs() returns an error, before we get to list_del(&pl->node) and the subsequent code that does some additional things (like clearing flags and stuff). This additional code needs to run even if update_effective_progs() fails. So I suggest extracting the logic of removing the program from the effective prog arrays into a helper function and doing

	err = update_effective_progs(...);
	if (err)
		purge_effective_progs();

where purge_effective_progs() will be the logic you are adding. And it will be a void function because it can't fail.
I have implemented that in v3, will send that out soon.
2. We have to update not just the cgrp->bpf.effective array, but all the descendants' lists as well. See what update_effective_progs() is doing: it has a css_for_each_descendant_pre() iteration. You need to do it here as well. But instead of doing compute_effective_progs(), which allocates a new copy of an array, we'll need to update the existing array in place.

3. Not clear why you need to do both bpf_prog_array_delete_safe_at() and bpf_prog_array_update_at(); isn't delete_safe_at() enough?
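
As a rough illustration of points 2 and 3, here is a small userspace model of an in-place purge. It is only a sketch and not kernel code: struct prog, struct node, dummy_prog, purge_one(), purge_effective() and count_live() are hypothetical stand-ins, the fixed-size arrays replace the real prog arrays and cgroup hierarchy, and there is no RCU or locking. It assumes that deleting "safely" means overwriting the slot with a no-op placeholder rather than shrinking the array, so nothing is allocated and the walk cannot fail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
struct prog { int id; };

#define MAX_PROGS    8
#define MAX_CHILDREN 4

/* One hierarchy node with a NULL-terminated list of effective programs. */
struct node {
	struct prog *effective[MAX_PROGS + 1];
	struct node *children[MAX_CHILDREN];
	int nr_children;
};

/* No-op placeholder written over a removed slot (assumption of this model). */
static struct prog dummy_prog = { .id = -1 };

/* Overwrite every slot holding 'victim'. No allocation, so it cannot fail. */
static void purge_one(struct node *n, const struct prog *victim)
{
	for (int i = 0; n->effective[i]; i++)
		if (n->effective[i] == victim)
			n->effective[i] = &dummy_prog;
}

/* Pre-order walk: handle the node itself, then each of its descendants. */
static void purge_effective(struct node *n, const struct prog *victim)
{
	assert(n && victim);
	purge_one(n, victim);
	for (int i = 0; i < n->nr_children; i++)
		purge_effective(n->children[i], victim);
}

/* Count the programs in one node that are not placeholders. */
static int count_live(const struct node *n)
{
	int cnt = 0;

	for (int i = 0; n->effective[i]; i++)
		if (n->effective[i] != &dummy_prog)
			cnt++;
	return cnt;
}
```

In this model a single overwrite per array is enough: the array length and the positions of the other entries never change, so no second "move the last entry" step is needed.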
I thought that we needed to reshuffle the table and move the progs around, but you are right, delete_safe_at() is enough.

--
Thanks,
Tadeusz