On Thu, Feb 17, 2022 at 01:19:56AM IST, Alexei Starovoitov wrote:
> On Wed, Feb 16, 2022 at 11:59:54PM +0530, Kumar Kartikeya Dwivedi wrote:
> > > > > Just use reg2btf_ids[base_type(reg->type)] instead?
> > > >
> > > > That would be incorrect I think, then we'd allow e.g. PTR_TO_TCP_SOCK_OR_NULL
> > > > and treat it as PTR_TO_TCP_SOCK, while current code only wants to permit
> > > > non-NULL variants for these three.
> > > add && !type_flag(reg->type) ?
> >
> > Ok, we can do that. WRT Hao's suggestion, we do allow NULL for ptr_to_mem_ok
> > case, so doing it for all of check_func_arg_match won't work.
>
> Right.
>
> > > But, first, please describe how you found it.
> > > Tried to pass PTR_TO_BTF_ID_OR_NULL into kfunc?
> > > Do you have some other changes in the verifier?
> >
> > Yes, was working on [0], tried to come up with a test case where verifier
> > printed bad register type being passed (one marked with a new flag), but noticed
> > that it would fall out of __BPF_REG_TYPE_MAX limit during kfunc check. Also, it
> > seems on non-KASAN build it actually doesn't crash sometimes, depends on what
> > the value at that offset is.
> >
> > [0]: https://github.com/kkdwivedi/linux/commits/btf-ptr-in-map
>
> Nice. Thanks a ton. I did a quick look. Seems to be ready to post on the list?
> Can you add .c tests and post?

Thanks for the quick review! Yes, I should be able to post by this weekend if I
don't run into any other problems/bugs in my code.

> Only one suggestion so far:
> Could you get rid of callback in btf_find_struct_field ?
> The callbacks obscure the control flow and make the code harder to follow.
> Do refactor btf_find_struct_field, but call btf_find_ptr_to_btf_id directly from it.
> Then you wouldn't need this ugly part:
> /* While callback cleans up after itself, later iterations can still
>  * return error without calling cb, so call free function again.
>  */
> if (ret < 0)
>         bpf_map_free_ptr_off_tab(map);

Ok, makes sense, will change.
> I've been thinking about the api for refcnted PTR_TO_BTF_ID in the map too.
> Regarding your issue with cmpxchg. I've come up with two helpers:
>
> BPF_CALL_3(bpf_kptr_try_set, void **, kptr, void *, ptr, int, refcnt_off)
> {
>         /* ptr is ptr_to_btf_id returned from bpf_*_lookup() with ptr->refcnt >= 1
>          * refcount_inc() has to be done before cmpxchg() because
>          * another cpu might do bpf_kptr_xchg+release.
>          */
>         refcount_inc((refcount_t *)(ptr + refcnt_off));
>         if (cmpxchg(kptr, NULL, ptr)) {
>                 /* refcnt >= 2 here */
>                 refcount_dec((refcount_t *)(ptr + refcnt_off));
>                 return -EBUSY;
>         }
>         return 0;
> }
>
> and
>
> BPF_CALL_2(bpf_kptr_get, void **, kptr, int, refcnt_off)
> {
>         void *ptr = READ_ONCE(kptr);
>
>         if (!ptr)
>                 return 0;
>         /* ptr->refcnt could be == 0 if another cpu did
>          * ptr2 = bpf_kptr_xchg();
>          * bpf_*_release(ptr2);
>          */
>         if (!refcount_inc_not_zero((refcount_t *)(ptr + refcnt_off)))
>                 return 0;
>         return (long) ptr;
> }

Good point. I did think about this problem. Right now, when you load a
referenced pointer from the BPF map, we mark it as PTR_UNTRUSTED, so you can
dereference it but not pass it into a helper like this. My thinking was that
the BPF user will already have an additional check (e.g. the pointer being
reachable from a hash table key) before accessing it, and that the xchg can
happen after the other program has removed the value's visibility globally (by
deleting it from the table). For RCU-protected objects, dereferencing
PTR_UNTRUSTED still works and reads valid data, and they can also skip reading
by observing ptr->refcnt == 0 for the dying case. So on the fast path, as long
as you don't need to pass the BTF ID pointer into helpers, you don't have to
increment the refcount.

As you note, this is not enough: e.g. we might want to pass a referenced
pointer into helpers without removing it from the map.
In that case such a bpf_kptr_get helper makes sense, but I think:

1) It needs to be inlined, because refcnt_off will only be easily available at
   verification time (from BTF).
2) We cannot guarantee that it keeps working for the same object across kernel
   versions; that ties the kernel into a stability guarantee regarding internal
   changes.
3) We can only support it for a subset of objects (those that have RCU
   protection), which we can ideally detect from BTF itself, by marking them as
   such, or using a flag during destructor registration (which is done for the
   ref'd pointer case).

>
> and instead of recognizing xchg insn directly in the verifier
> I went with the helper:
> BPF_CALL_2(bpf_kptr_xchg, void **, kptr, void *, ptr)
> {
>         /* ptr is ptr_to_btf_id returned from bpf_*_lookup() with ptr->refcnt >= 1
>          * or ptr == NULL.
>          * returns ptr_to_btf_id with refcnt >= 1 or NULL
>          */
>         return (long) xchg(kptr, ptr);
> }
> It was easier to deal with.
> I don't have a strong preference for helper vs insn though.
> Let's brainstorm after you post patches.
>
> xchg or bpf_kptr_xchg only operation seems to be too limiting.

Right, one other use case Toke and I were exploring is using this support to
enable XDP queueing. We can have a normal PIFO map and allow embedding
xdp_frame in it. Then xchg is used during enqueue and dequeue from the map, and
the user can have additional stuff in the map value (like embedding both
bpf_timer and xdp_frame, or extra scalars as data), which should allow more
composition/flexibility.

To also reduce the overhead of this xchg on the fast path, we can consider
relaxing the atomic restriction for per-CPU maps instead, then add a
bpf_move_ptr helper that is non-atomic, but ensures the invariant that the
value in the per-CPU map is either refcounted or NULL, and only in the program
or the map at any point in time. Then the overhead of moving object ownership
is also eliminated, and the helper itself can be inlined to a load from the map
and a store to the map (of the new pointer or NULL).