On Wed, Sep 26, 2018 at 01:23:41PM -0700, Matthew Wilcox wrote: > > Hi Jason, > > You've added a rather, er, "innovative" user of the radix tree recently > in drivers/infiniband/core/uverbs_ioctl.c. innovative, oh good.. > Unfortunately I think it can dereference freed memory like so: > > CPU A: > > srcu_key = srcu_read_lock(&file->device->disassociate_srcu); > err = ib_uverbs_cmd_verbs(file, &hdr, user_hdr->attrs); > -> > slot = radix_tree_iter_lookup( > &uapi->radix, &attrs_iter, > uapi_key_obj(hdr->object_id) | > uapi_key_ioctl_method(hdr->method_id)); > > CPU B: > radix_tree_for_each_slot (slot, &uapi->radix, &iter, 0) { > kfree(rcu_dereference_protected(*slot, true)); > radix_tree_iter_delete(&uapi->radix, &iter, slot); > } > > radix_tree_iter_delete() ends up calling radix_tree_node_free() > which does a call_rcu(&node->rcu_head, radix_tree_node_rcu_free) > You're holding an srcu_read_lock(), not an rcu_read_lock(), so > as far as RCU is concerned, you're not within a grace period, and > it doesn't need to wait for you. Hmm.... This usage of the radix tree is not using RCU. The tree is created at the start of time and is used read-only until the final kref is put, then it is destroyed. The race above cannot exist because the kref on file->device will not go to zero while CPU A is running that code above. Specifically the file's kref is held by a struct file * and the fs core will hold the kref on struct file during execution of the ioctl file_operation. I think the mistake here is the misleading use of srcu_dereference(): srcu_dereference( *slot, &pbundle->bundle.ufile->device->disassociate_srcu); That should just be rcu_dereference_protected(*slot, true), I think? > I'm _really_ not a fan of the way you pull various things out of the > radix_tree_iter and keep references to them in the pbundle. Can we > figure out a better way to do this? I don't have a clear picture of > your requirements at this stage. Yes, I figured that would come up with your xarray work. All that is needed here is a 'lookup with hint' API. Something like: struct iter last_position; lookup(0x120, &last_position); lookup_hint(0x123, &last_position); lookup_hint(0x124, &last_position); Where lookup_hint is fast if last_position already points at the correct chunk. Otherwise it will search again and update last_position. The last time I looked at the xarray code I thought this would be straightforward to add as an API, and I was thinking of proposing something like the above when xarray is finally merged. Thanks, Jason