Just in case, I'll note a nit: this paragraph will need to be removed since the patch adding the flag is getting dropped.

A non-nit which may or may not end up mattering is that the flag (which *is* set on the filp slab cache) makes things more difficult to validate. Normal RCU usage guarantees that the object itself won't be freed as long as you follow the rules. However, the SLAB_TYPESAFE_BY_RCU flag weakens that significantly -- the thing at hand will always be a 'struct file', but it may get reallocated to *another* file from under you. Whether this aspect plays a role here I don't know.

> +static struct uprobe *find_active_uprobe_speculative(unsigned long bp_vaddr)
> +{
> +	const vm_flags_t flags = VM_HUGETLB | VM_MAYEXEC | VM_MAYSHARE;
> +	struct mm_struct *mm = current->mm;
> +	struct uprobe *uprobe;
> +	struct vm_area_struct *vma;
> +	struct file *vm_file;
> +	struct inode *vm_inode;
> +	unsigned long vm_pgoff, vm_start;
> +	int seq;
> +	loff_t offset;
> +
> +	if (!mmap_lock_speculation_start(mm, &seq))
> +		return NULL;
> +
> +	rcu_read_lock();
> +

I don't think there is a correctness problem here, but entering the RCU read-side section *after* deciding to do the speculative lookup feels backwards.

> +	vma = vma_lookup(mm, bp_vaddr);
> +	if (!vma)
> +		goto bail;
> +
> +	vm_file = data_race(vma->vm_file);
> +	if (!vm_file || (vma->vm_flags & flags) != VM_MAYEXEC)
> +		goto bail;
> +

If vma teardown is allowed to progress and the file got fput'ed...

> +	vm_inode = data_race(vm_file->f_inode);

... the inode can be NULL; I don't know whether that's handled. More importantly though, per my description of SLAB_TYPESAFE_BY_RCU above, by now the file could have been reallocated and the inode you found is completely unrelated. I understand the intent is to backpedal from everything should the mm sequence count change, but the above may happen to matter.

> +	vm_pgoff = data_race(vma->vm_pgoff);
> +	vm_start = data_race(vma->vm_start);
> +
> +	offset = (loff_t)(vm_pgoff << PAGE_SHIFT) + (bp_vaddr - vm_start);
> +	uprobe = find_uprobe_rcu(vm_inode, offset);
> +	if (!uprobe)
> +		goto bail;
> +
> +	/* now double check that nothing about MM changed */
> +	if (!mmap_lock_speculation_end(mm, seq))
> +		goto bail;

This leaks the reference obtained by find_uprobe_rcu().

> +
> +	rcu_read_unlock();
> +
> +	/* happy case, we speculated successfully */
> +	return uprobe;
> +bail:
> +	rcu_read_unlock();
> +	return NULL;
> +}

Now for the handwaving:

The core of my concern is that adding more work to down_write on the mmap semaphore comes with certain side effects, and a more than sufficient speedup can plausibly be achieved without doing it.

An mm-wide mechanism is just incredibly coarse-grained and may perform poorly when faced with a program which likes to mess with its address space -- the fast path is going to keep failing and only induce *more* overhead as the code falls back to down_read on the mmap semaphore. Furthermore, there may be work currently synchronized with down_write which could perhaps transition to "merely" down_read, but by the time that happens, this and possibly other consumers will expect the sequence counter to change, getting in the way.

To my understanding the kernel supports parallel faults with per-vma locking. I would find it surprising if the same machinery could not be used to sort out the uprobe handling above. I presume a down_read on the vma lock around all the work would also sort out any issues concerning the stability of the file and inode objects.
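To make that more concrete, here is a rough sketch of the shape I have in mind (entirely untested, helper name made up; it assumes lock_vma_under_rcu()/vma_end_read() from the per-vma locking machinery can be used outside of the page fault path and that find_uprobe_rcu() keeps its semantics from the patch):

static struct uprobe *find_active_uprobe_pervma(unsigned long bp_vaddr)
{
	const vm_flags_t flags = VM_HUGETLB | VM_MAYEXEC | VM_MAYSHARE;
	struct mm_struct *mm = current->mm;
	struct vm_area_struct *vma;
	struct uprobe *uprobe = NULL;
	loff_t offset;

	/* read-locks the vma or fails, in which case we fall back to mmap_lock */
	vma = lock_vma_under_rcu(mm, bp_vaddr);
	if (!vma)
		return NULL;

	/*
	 * With the vma read-locked, vm_file cannot be fput'ed and vm_flags,
	 * vm_pgoff and vm_start cannot change, so no data_race() annotations
	 * and no revalidation against an mm-wide sequence count are needed.
	 */
	if (!vma->vm_file || (vma->vm_flags & flags) != VM_MAYEXEC)
		goto out;

	offset = (loff_t)(vma->vm_pgoff << PAGE_SHIFT) + (bp_vaddr - vma->vm_start);

	/* assuming the lookup still wants an RCU read-side section */
	rcu_read_lock();
	uprobe = find_uprobe_rcu(vma->vm_file->f_inode, offset);
	rcu_read_unlock();
out:
	vma_end_read(vma);
	return uprobe;
}

The fallback handling then stays confined to uprobe code and nothing extra gets added to the mmap semaphore write side.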
Of course single-threaded performance would take a hit due to the atomics stemming from down/up_read, and parallel uprobe lookups on the same vma would also get slower, but I don't know whether that matters for a real workload. I would have no comments if all the speedups were achieved without modifying non-uprobe code.