On Thu, Dec 08, 2022 at 06:27:29PM +0530, Kumar Kartikeya Dwivedi wrote:
>
> I don't mind using active_lock.id for invalidation, but using reg->id to
> associate it with reg is a bad idea IMO, it's already preserved and set when the
> object has bpf_spin_lock in it, and it's going to allow doing bpf_spin_unlock
> with that non-owning ref if it has a spin lock, essentially unlocking a
> different spin lock if the reg->btf of the already locked spin lock reg is the
> same due to the same active_lock.id.

Right. Overwriting reg->id was a bad idea.

> Even if you prevent it somehow it's more confusing to overload reg->id again for
> this purpose.
>
> It makes more sense to introduce a new nonref_obj_id instead, dedicated for this
> purpose, to associate it back to the reg->id of the collection it is coming from.

The nonref_obj_id name sounds too generic, and I'm not sure that it shouldn't be
connected to reg->id the way we do it for ref_obj_id.

> Also, there are two cases of invalidation, one is on remove from rbtree, which
> should only invalidate non-owning references into the rbtree, and one is on
> unlock, which should invalidate all non-owning references.

Two cases only if we're going to do invalidation on rbtree_remove.

> bpf_rbtree_remove shouldn't invalidate non-owning refs into a list protected by
> the same lock, but unlocking should do it for both rbtree and list non-owning
> refs it is protecting.
>
> So it seems you will have to maintain two IDs for non-owning references, one for
> the collection it comes from, and one for the lock region it is obtained in.

Right. Like this?

collection_id = rbroot->reg->id;           // to track the collection it came from
active_lock_id = cur_state->active_lock.id // to track the lock region

but before we proceed let me demonstrate an example where cleanup on
rbtree_remove is not user friendly:

bpf_spin_lock
x = bpf_list_first(); if (!x) ..
y = bpf_list_last(); if (!y) ..
n = bpf_list_remove(x); if (!n) ..
bpf_list_add_after(n, y); // we should allow this
bpf_spin_unlock

We don't have such apis right now. The point here is that cleanup after
bpf_list_remove/bpf_rbtree_remove will destroy all regs that point somewhere
into the collection. This way we save a run-time check in bpf_rbtree_remove,
but sacrifice usability: x and y could be pointing to the same thing, and in
that case bpf_list_add_after() should fail at run-time after discovering that
'y' is already unlinked.

Similarly with bpf_rbtree_add(). Currently it cannot fail: it takes an owning
ref and will release it. We can mark it as KF_RELEASE and no extra verifier
changes are necessary. But in the future we might have failing add/insert
operations on lists and rbtrees. If they can fail we'd need to struggle with
'conditional release' verifier additions, the bpf prog would need to check the
return value, etc. I think we'd better deal with it at run-time.

The verifier could supply bpf_list_add_after() with two hidden args:
- the container_of offset (the delta between the rb_node and the beginning of
  the prog's struct)
- struct btf_struct_meta *meta

Then inside bpf_list_add_after or any failing KF_RELEASE kfunc it can call
bpf_obj_drop_impl() on that element. Then from the verifier pov the KF_RELEASE
function did the release and the 'owning ref' became a 'non-owning ref'.
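Roughly like this (a sketch only: the kfunc itself, its int return, and the
"is pos unlinked" check are hypothetical; bpf_obj_drop_impl() and
struct btf_struct_meta are what we already have):

/* 'off' and 'meta' are the two hidden args filled in by the verifier */
int bpf_list_add_after(struct bpf_list_node *n, struct bpf_list_node *pos,
		       u64 off, struct btf_struct_meta *meta)
{
	struct list_head *n_h = (struct list_head *)n;
	struct list_head *pos_h = (struct list_head *)pos;

	if (list_empty(pos_h)) {
		/* 'pos' was unlinked earlier, so insertion is impossible.
		 * The call counts as a release of 'n' from the verifier pov,
		 * so drop it here instead of leaking it.
		 */
		bpf_obj_drop_impl((void *)n - off, meta);
		return -EINVAL;
	}
	list_add(n_h, pos_h);
	return 0;
}

The prog can ignore the return value, since 'n' is consumed either way.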
> > >> And you're also adding 'untrusted' here, mainly as a result of
> > >> bpf_rbtree_add(tree, node) - 'node' becoming untrusted after it's added,
> > >> instead of becoming a non-owning ref. 'untrusted' would have state like:
> > >>
> > >> PTR_TO_BTF_ID | MEM_ALLOC (w/ rb_node type)
> > >> PTR_UNTRUSTED
> > >> ref_obj_id == 0?
> > >
> > > I'm not sure whether we really need full untrusted after going through
> > > bpf_rbtree_add() or doing 'non-owning' is enough.
> > > If it's full untrusted it will be:
> > > PTR_TO_BTF_ID | PTR_UNTRUSTED && ref_obj_id == 0
> > >
> >
> > Yeah, I don't see what this "full untrusted" is giving us either. Let's have
> > "cleanup non-owning refs on spin_unlock" just invalidate the regs for now,
> > instead of converting to "full untrusted"?
> >
>
> +1, I prefer invalidating completely on unlock.

fine by me.

> I think it's better to clean up by invalidating. We have better tools to form
> untrusted pointers (like bpf_rdonly_cast) now if the BPF program writer needs
> such an escape hatch for some reason. It's also easier to review where an
> untrusted pointer is being used in a program, and it has zero cost at runtime.

ok. Since it's more strict we can relax to untrusted later if necessary.

> So far I'm leaning towards:
>
> bpf_rbtree_add(node) : node becomes non-owning ref
> bpf_spin_unlock(lock) : node is invalidated

ok

> > > Currently I'm leaning towards PTR_UNTRUSTED for cleanup after bpf_spin_unlock
> > > and non-owning after bpf_rbtree_add.
> > >
> > > Walking the example from the previous email:
> > >
> > > struct bpf_rbtree_iter it;
> > > struct bpf_rb_node *node;
> > > struct bpf_rb_node *n, *m;
> > >
> > > bpf_rbtree_iter_init(&it, rb_root); // locks the rbtree, works as bpf_spin_lock
> > > while ((node = bpf_rbtree_iter_next(&it))) {
> > >   // node -> PTR_TO_BTF_ID | MEM_ALLOC | MAYBE_NULL && ref_obj_id == 0
> > >   if (node && node->field == condition) {
> > >
> > >     n = bpf_rbtree_remove(rb_root, node);
> > >     if (!n) ...;
> > >     // n -> PTR_TO_BTF_ID | MEM_ALLOC && ref_obj_id == X
> > >     m = bpf_rbtree_remove(rb_root, node); // ok, but fails at run-time
> > >     if (!m) ...;
> > >     // m -> PTR_TO_BTF_ID | MEM_ALLOC && ref_obj_id == Y
> > >
>
> This second remove I would simply disallow, as Dave is suggesting, during
> verification, by invalidating non-owning refs for rb_root.

Looks like cleanup from non-owning to untrusted|unknown on bpf_rbtree_remove is
our only remaining disagreement. I feel run-time checks will be fast enough and
will improve usability. Also it feels that not doing cleanup on rbtree_remove
is simpler to implement and reason about.

Here is the proposal with one new reg field 'active_lock_id':

first = bpf_rbtree_first(root), KF_RET_NULL:
  check_reg_allocation_locked() checks that root->reg->id == cur->active_lock.id
  R0 = PTR_TO_BTF_ID|MEM_ALLOC|PTR_MAYBE_NULL, ref_obj_id = 0
  R0->active_lock_id = root->reg->id
  R0->id = ++env->id_gen, which will be cleared after the !NULL check inside the prog

The same way we can add rb_find, rb_find_first, but not rb_next, rb_prev,
since they don't have a 'root' argument.

bpf_rbtree_add(root, node, cb), KF_RELEASE:
  needs to see PTR_TO_BTF_ID|MEM_ALLOC with node->ref_obj_id > 0
  check_reg_allocation_locked() checks that root->reg->id == cur->active_lock.id
  calls release_reference(node->ref_obj_id)
  converts 'node' to PTR_TO_BTF_ID|MEM_ALLOC, ref_obj_id = 0
  node->active_lock_id = root->reg->id

'node' is now equivalent to 'first'. They both point to some element inside the
rbtree and are valid inside the spin_locked region. It's ok to read|write to
both under lock.
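In the verifier this marking could be one small helper for both cases, roughly
(a sketch; mark_non_owning_ref() is a made-up name and active_lock_id is the
new bpf_reg_state field proposed above):

static void mark_non_owning_ref(struct bpf_verifier_env *env,
				struct bpf_reg_state *reg,
				struct bpf_reg_state *root_reg,
				bool maybe_null)
{
	/* non-owning: points into the collection, nothing to release */
	reg->ref_obj_id = 0;
	reg->type = PTR_TO_BTF_ID | MEM_ALLOC;
	if (maybe_null) {
		reg->type |= PTR_MAYBE_NULL;
		/* cleared once the prog does the !NULL check */
		reg->id = ++env->id_gen;
	}
	/* tie the reg to the lock region protecting 'root', so that
	 * bpf_spin_unlock() can find and invalidate it later
	 */
	reg->active_lock_id = root_reg->id;
}

bpf_rbtree_first() would use it on R0 with maybe_null == true, while
bpf_rbtree_add() would use it on 'node' with maybe_null == false after
release_reference().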
removed_node = bpf_rbtree_remove(root, node), KF_ACQUIRE|KF_RET_NULL:
  needs to see PTR_TO_BTF_ID|MEM_ALLOC with node->ref_obj_id == 0, and the usual
  check_reg_allocation_locked(root)
  R0 = PTR_TO_BTF_ID|MEM_ALLOC|MAYBE_NULL
  R0->ref_obj_id = R0->id = acquire_reference_state()
  R0->active_lock_id should stay 0
  mark_reg_unknown(node)

bpf_spin_unlock(lock):
  checks lock->id == cur->active_lock.id
  for all regs in state
    if (reg->active_lock_id == lock->id)
      mark_reg_unknown(reg)
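The unlock-time cleanup can reuse the existing reg-walking machinery. A sketch
(bpf_for_each_reg_in_vstate and __mark_reg_unknown exist today; active_lock_id
is the proposed new field):

static void invalidate_non_owning_refs(struct bpf_verifier_env *env, u32 lock_id)
{
	struct bpf_verifier_state *vstate = env->cur_state;
	struct bpf_func_state *state;
	struct bpf_reg_state *reg;

	bpf_for_each_reg_in_vstate(vstate, state, reg, ({
		/* every non-owning ref obtained under this lock region
		 * becomes unusable once the lock is dropped
		 */
		if (reg->active_lock_id == lock_id)
			__mark_reg_unknown(env, reg);
	}));
}

process_spin_lock() would call it with cur->active_lock.id right before
clearing active_lock on the unlock path.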