> On Mon, Dec 26, 2022 at 7:35 PM Stanislav Fomichev <stfomichev@xxxxxxxxx> wrote: > > > > > On Fri, Dec 23, 2022 at 5:49 PM Stanislav Fomichev <sdf@xxxxxxxxxx> wrote: > > > get_func_ip() */ > > > > > - tstamp_type_access:1; /* Accessed __sk_buff->tstamp_type */ > > > > > + tstamp_type_access:1, /* Accessed __sk_buff->tstamp_type */ > > > > > + valid_id:1; /* Is bpf_prog::aux::__id valid? */ > > > > > enum bpf_prog_type type; /* Type of BPF program */ > > > > > enum bpf_attach_type expected_attach_type; /* For some prog types */ > > > > > u32 len; /* Number of filter blocks */ > > > > > @@ -1688,6 +1689,12 @@ void bpf_prog_inc(struct bpf_prog *prog); > > > > > struct bpf_prog * __must_check bpf_prog_inc_not_zero(struct bpf_prog *prog); > > > > > void bpf_prog_put(struct bpf_prog *prog); > > > > > > > > > > +static inline u32 bpf_prog_get_id(const struct bpf_prog *prog) > > > > > +{ > > > > > + if (WARN(!prog->valid_id, "Attempting to use an invalid eBPF program")) > > > > > + return 0; > > > > > + return prog->aux->__id; > > > > > +} > > > > > > > > I'm still missing why we need to have this WARN and have a check at all. > > > > IIUC, we're actually too eager in resetting the id to 0, and need to > > > > keep that stale id around at least for perf/audit. > > > > Why not have a flag only to protect against double-idr_remove > > > > bpf_prog_free_id and keep the rest as is? > > > > Which places are we concerned about that used to report id=0 but now > > > > would report stale id? > > > > > > What double-idr_remove are you concerned about? > > > bpf_prog_by_id() is doing bpf_prog_inc_not_zero > > > while __bpf_prog_put just dropped it to zero. > > > > (traveling, sending from an untested setup, hope it reaches everyone) > > > > There is a call to bpf_prog_free_id from __bpf_prog_offload_destroy which > > tries to make offloaded program disappear from the idr when the netdev > > goes offline. So I'm assuming that '!prog->aux->id' check in bpf_prog_free_id > > is to handle that case where we do bpf_prog_free_id much earlier than the > > rest of the __bpf_prog_put stuff. > > That remove was done in > commit ad8ad79f4f60 ("bpf: offload: free program id when device disappears") > Back in 2017 there was no bpf audit and no > PERF_BPF_EVENT_PROG_LOAD/UNLOAD events. > > The removal of id made sense back then to avoid showing this > 'useless' orphaned offloaded prog in 'bpftool prog show', > but with addition of perf load/unload and audit it was no longer > correct to zero out ID in a prog which refcnt is still not zero. > > So we should simply remove bpf_prog_free_id from __bpf_prog_offload_destroy. > There won't be any adverse effect other than bpftool prog show > will show orphaned progs. SGTM, that would simplify everything.. > > > > > Maybe just move bpf_prog_free_id() into bpf_prog_put_deferred() > > > after perf_event_bpf_event and bpf_audit_prog ? > > > Probably can remove the obsolete do_idr_lock bool flag as > > > separate patch? > > > > +1 on removing do_idr_lock separately. > > > > > Much simpler fix and no code churn. > > > Both valid_id and saved_id approaches have flaws. > > > > Given the __bpf_prog_offload_destroy path above, we still probably need > > some flag to indicate that the id has been already removed from the idr? > > No. ID should be valid until prog went through perf and audit unload > events. Don't know about audit, but for perf it's essential to have > valid ID otherwise perf record will not be able to properly associate events.