On Thu, Feb 17, 2022 at 03:53:41AM +0530, Kumar Kartikeya Dwivedi wrote:
> > >
> > > Good point. I did think about this problem. Right now when you load a
> > > referenced pointer from the BPF map, we mark it as PTR_UNTRUSTED, so you
> > > can dereference
> >
> > you mean UNreferenced pointer, right? Since that's what your patch is doing.
> >
> You caught a bug :). Instead of...
>
>         else if (!ref_ptr)
>                 reg_flags |= PTR_UNTRUSTED;
>
> it should simply be 'else'. Then for non-percpu BTF ID pointers and non-user
> BTF ID pointers, only XCHG (from a referenced pointer) unsets PTR_UNTRUSTED;
> both CMPXCHG and LDX have the untrusted bit set during mark_btf_ld_reg.

btw please drop CMPXCHG for now. The "} else if (is_cmpxchg_insn(insn)) {"
part looks incomplete. The verifier cannot support acquire/release semantics,
so the value of cmpxchg is dubious. Let's brainstorm after the main pieces
land.

> > > It is already marked PTR_UNTRUSTED for sleepable too, IIUC, but based on
> > > the above I guess we can remove the flag for objects which are RCU
> > > protected, in the case of non-sleepable progs.
> >
> > Probably not. Even a non-sleepable prog under rcu_read_lock shouldn't be
> > able to pass 'struct tcp_sock *' from the map into helpers.
> > That sock can have refcnt==0. Even under rcu it's dangerous.
> >
> Good point, so instead of helpers doing the inc_not_zero each time, the prog
> can do it once and pass it into other helpers.

The kptr_get helper will do it once too. So it's the same overhead.

> > > We also need to prevent the kptr_get helper from being called from
> > > sleepable progs.
> >
> > why is that?
>
> My confusion comes because I don't think that the normal
> call_rcu/synchronize_rcu grace period for the object waits for RCU trace
> readers. In a sleepable prog, that same xchg+release can happen while we
> hold an untrusted pointer from LDX in the RCU trace section. Then I think
> that kptr_get may be wrong, because the object may already have been freed
> without waiting for the sleepable prog to call rcu_read_unlock_trace.

Right. If we go with the helper approach we can do:

BPF_CALL_2(bpf_kptr_get, void **, kptr, int, refcnt_off)
{
        void *ptr;

        rcu_read_lock();
        /* load the object pointer stored in the map slot */
        ptr = READ_ONCE(*kptr);
        if (!ptr)
                goto out;
        /* ptr->refcnt could be == 0 if another cpu did
         * ptr2 = bpf_kptr_xchg();
         * bpf_*_release(ptr2);
         */
        if (!refcount_inc_not_zero((refcount_t *)(ptr + refcnt_off)))
                goto out;
        rcu_read_unlock();
        return (long) ptr;
out:
        rcu_read_unlock();
        return 0;
}

and it will work in sleepable progs.
But if we allow direct LDX to return PTR_TO_BTF_ID | PTR_UNTRUSTED, then we
need to support explicit bpf_rcu_read_lock|unlock() done inside the bpf prog,
which is unnecessary overhead for everyone but PREEMPT=y. Plus a bunch of
additional complexity in the verifier to support explicit RCU sections.
Pros and cons. Anyway, post your patches and we'll take it from there.

I've been thinking about

obj = bpf_obj_alloc(bpf_core_type_id_local(struct my_obj),
                    BPF_TYPE_ID_LOCAL);
...
bpf_obj_put(obj);

on top of refcnted ptr_to_btf_id stored into maps.
That's another huge discussion.
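
Going back to the helper approach: fwiw, to make it concrete from the prog
side, usage in a sleepable prog would look roughly like below. This is only a
sketch: the map-value layout, the bpf_kptr_get()/bpf_task_release() wiring and
the made-up helper IDs are placeholders for whatever we end up with, nothing
here is in the tree.

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

/* placeholder prototypes for the helpers discussed above;
 * the IDs are made up */
static void *(*bpf_kptr_get)(void *kptr, int refcnt_off) = (void *)190;
static void (*bpf_task_release)(void *p) = (void *)191;

struct map_value {
        struct task_struct *task; /* kptr slot, however we end up tagging it */
};

struct {
        __uint(type, BPF_MAP_TYPE_ARRAY);
        __uint(max_entries, 1);
        __type(key, int);
        __type(value, struct map_value);
} m SEC(".maps");

SEC("fentry.s/__x64_sys_nanosleep") /* sleepable */
int get_task(void *ctx)
{
        struct map_value *v;
        struct task_struct *t;
        int key = 0;

        v = bpf_map_lookup_elem(&m, &key);
        if (!v)
                return 0;
        /* rcu_read_lock + refcount_inc_not_zero happen inside the helper,
         * so the sleepable prog ends up holding a referenced pointer even
         * though it cannot sit in an RCU read section itself */
        t = bpf_kptr_get(&v->task, offsetof(struct task_struct, usage));
        if (!t)
                return 0;
        /* ... t is a trusted, referenced PTR_TO_BTF_ID here ... */
        bpf_task_release(t); /* placeholder release helper */
        return 0;
}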
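
And for comparison, the direct-LDX flavor would force something like this on
every prog. bpf_rcu_read_lock()/bpf_rcu_read_unlock() don't exist; these are
hypothetical names just to show the shape of the explicit RCU section the
verifier would have to track:

/* placeholder prototypes, same caveat as above */
static void (*bpf_rcu_read_lock)(void) = (void *)192;
static void (*bpf_rcu_read_unlock)(void) = (void *)193;

SEC("tc")
int peek_task(struct __sk_buff *skb)
{
        struct map_value *v;
        struct task_struct *t;
        int key = 0, pid = 0;

        v = bpf_map_lookup_elem(&m, &key);
        if (!v)
                return 0;
        bpf_rcu_read_lock();            /* hypothetical helper */
        t = v->task;                    /* LDX -> PTR_TO_BTF_ID | PTR_UNTRUSTED */
        if (t)
                pid = t->pid;           /* loads only; not a valid helper arg */
        bpf_rcu_read_unlock();          /* hypothetical helper */
        return pid;
}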