On Thu, Aug 10, 2023 at 11:43:26PM -0700, Yonghong Song wrote:
>
>
> On 8/10/23 3:04 PM, David Vernet wrote:
> > Currently, if a struct_ops map is loaded with BPF_F_LINK, it must also
> > define the .validate() and .update() callbacks in its corresponding
> > struct bpf_struct_ops in the kernel. Enabling struct_ops link is useful
> > in its own right to ensure that the map is unloaded if an application
> > crashes. For example, with sched_ext, we want to automatically unload
> > the host-wide scheduler if the application crashes. We would likely
> > never support updating elements of a sched_ext struct_ops map, so we'd
> > have to implement these callbacks showing that they _can't_ support
> > element updates just to benefit from the basic lifetime management of
> > struct_ops links.
> >
> > Let's enable struct_ops maps to work with BPF_F_LINK even if they
> > haven't defined these callbacks, by assuming that a struct_ops map
> > element cannot be updated by default.
>
> Maybe you want to add one map_flag to indicate validate/update callbacks
> are optional for a struct_ops link? In this case, some struct_ops maps
> can still require validate() and update(), but others can skip them?

Are you proposing that a map flag be added that a user space caller can
specify to say that they're OK with a struct_ops implementation not
supporting .validate() and .update(), but still want to use a link to
manage registration and unregistration?

Assuming I'm understanding your suggestion correctly, I don't think it's
what we want. Updating a struct_ops map value is arguably orthogonal to
the bpf link handling registration and unregistration, so it seems
confusing to require a user to specify that it's the behavior they want
as there's no reason they shouldn't want it. If they mistakenly thought
that update element is supported for that struct_ops variant, they'll
just get an -EOPNOTSUPP error at runtime, which seems reasonable. If a
struct_ops implementation should have implemented .validate() and/or
.update() and neglects to, that would just be a bug in the struct_ops
implementation.

Apologies if I've misunderstood your proposal, and please feel free to
clarify if I have.

Thanks,
David

>
> >
> > Signed-off-by: David Vernet <void@xxxxxxxxxxxxx>
> > ---
> >  kernel/bpf/bpf_struct_ops.c | 17 +++++++++++------
> >  1 file changed, 11 insertions(+), 6 deletions(-)
> >
> > diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
> > index eaff04eefb31..3d2fb85186a9 100644
> > --- a/kernel/bpf/bpf_struct_ops.c
> > +++ b/kernel/bpf/bpf_struct_ops.c
> > @@ -509,9 +509,12 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> >  	}
> >
> >  	if (st_map->map.map_flags & BPF_F_LINK) {
> > -		err = st_ops->validate(kdata);
> > -		if (err)
> > -			goto reset_unlock;
> > +		err = 0;
> > +		if (st_ops->validate) {
> > +			err = st_ops->validate(kdata);
> > +			if (err)
> > +				goto reset_unlock;
> > +		}
> >  		set_memory_rox((long)st_map->image, 1);
> >  		/* Let bpf_link handle registration & unregistration.
> >  		 *
> > @@ -663,9 +666,6 @@ static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr)
> >  	if (attr->value_size != vt->size)
> >  		return ERR_PTR(-EINVAL);
> >
> > -	if (attr->map_flags & BPF_F_LINK && (!st_ops->validate || !st_ops->update))
> > -		return ERR_PTR(-EOPNOTSUPP);
> > -
> >  	t = st_ops->type;
> >
> >  	st_map_size = sizeof(*st_map) +
> > @@ -838,6 +838,11 @@ static int bpf_struct_ops_map_link_update(struct bpf_link *link, struct bpf_map
> >  		goto err_out;
> >  	}
> >
> > +	if (!st_map->st_ops->update) {
> > +		err = -EOPNOTSUPP;
> > +		goto err_out;
> > +	}
> > +
> >  	err = st_map->st_ops->update(st_map->kvalue.data, old_st_map->kvalue.data);
> >  	if (err)
> >  		goto err_out;
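
For illustration, the optional-callback behavior in the quoted patch can be
modeled in plain userspace C. This is only a rough sketch, not kernel code:
struct demo_struct_ops, demo_validate(), demo_link_update(), and the two
sample callbacks are hypothetical names made up for this example, and only
the NULL checks from the diff are mirrored.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bpf_struct_ops; only the two
 * optional callbacks discussed in this thread are modeled.
 */
struct demo_struct_ops {
	int (*validate)(void *kdata);
	int (*update)(void *kdata, void *old_kdata);
};

/* Mirrors the map_update_elem hunk: a missing validate() callback is
 * treated as "nothing to validate", so the result is 0.
 */
static int demo_validate(const struct demo_struct_ops *ops, void *kdata)
{
	if (!ops->validate)
		return 0;
	return ops->validate(kdata);
}

/* Mirrors the link_update hunk: a missing update() callback means the
 * struct_ops variant cannot be updated, so return -EOPNOTSUPP.
 */
static int demo_link_update(const struct demo_struct_ops *ops,
			    void *kdata, void *old_kdata)
{
	if (!ops->update)
		return -EOPNOTSUPP;
	return ops->update(kdata, old_kdata);
}

/* Sample callbacks, invented for this example only. */
static int demo_reject(void *kdata)
{
	(void)kdata;
	return -EINVAL;
}

static int demo_accept(void *kdata, void *old_kdata)
{
	(void)kdata;
	(void)old_kdata;
	return 0;
}

/* Exercises both cases: no callbacks defined, and both defined. */
static void demo_example(void)
{
	struct demo_struct_ops none = { 0 };
	struct demo_struct_ops full = {
		.validate = demo_reject,
		.update = demo_accept,
	};

	assert(demo_validate(&none, NULL) == 0);
	assert(demo_link_update(&none, NULL, NULL) == -EOPNOTSUPP);
	assert(demo_validate(&full, NULL) == -EINVAL);
	assert(demo_link_update(&full, NULL, NULL) == 0);
}
```

The point of the sketch is that the caller never opts in to anything: an
implementation without update() simply reports -EOPNOTSUPP when an update
is attempted, while link-based registration is unaffected.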