On Thu, Oct 10, 2024 at 01:56:44PM -0700, Andrii Nakryiko wrote:

> Suggested-by: Matthew Wilcox <willy@xxxxxxxxxxxxx>

I'm fairly sure I've suggested much the same :-)

> Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> ---
>  kernel/events/uprobes.c | 50 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 50 insertions(+)
>
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index fa1024aad6c4..9dc6e78975c9 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -2047,6 +2047,52 @@ static int is_trap_at_addr(struct mm_struct *mm, unsigned long vaddr)
>  	return is_trap_insn(&opcode);
>  }
>
> +static struct uprobe *find_active_uprobe_speculative(unsigned long bp_vaddr)
> +{
> +	struct mm_struct *mm = current->mm;
> +	struct uprobe *uprobe = NULL;
> +	struct vm_area_struct *vma;
> +	struct file *vm_file;
> +	struct inode *vm_inode;
> +	unsigned long vm_pgoff, vm_start;
> +	loff_t offset;
> +	long seq;
> +
> +	guard(rcu)();
> +
> +	if (!mmap_lock_speculation_start(mm, &seq))
> +		return NULL;

So traditional seqcount assumed non-preemptible lock sides and would
spin-wait for the LSB to clear, but for PREEMPT_RT we added preemptible
seqcount support and that takes the lock to wait, which in this case is
exactly the same as returning NULL and doing the lookup holding
mmap_lock, so yeah.

> +
> +	vma = vma_lookup(mm, bp_vaddr);
> +	if (!vma)
> +		return NULL;
> +
> +	/* vm_file memory can be reused for another instance of struct file,

Comment style nit.

> +	 * but can't be freed from under us, so it's safe to read fields from
> +	 * it, even if the values are some garbage values; ultimately
> +	 * find_uprobe_rcu() + mmap_lock_speculation_end() check will ensure
> +	 * that whatever we speculatively found is correct
> +	 */
> +	vm_file = READ_ONCE(vma->vm_file);
> +	if (!vm_file)
> +		return NULL;
> +
> +	vm_pgoff = data_race(vma->vm_pgoff);
> +	vm_start = data_race(vma->vm_start);
> +	vm_inode = data_race(vm_file->f_inode);

So... seqcount has KCSAN annotations other than data_race(). I suppose
this works, but it all feels like a bad copy with random changes.

> +
> +	offset = (loff_t)(vm_pgoff << PAGE_SHIFT) + (bp_vaddr - vm_start);
> +	uprobe = find_uprobe_rcu(vm_inode, offset);
> +	if (!uprobe)
> +		return NULL;
> +
> +	/* now double check that nothing about MM changed */
> +	if (!mmap_lock_speculation_end(mm, seq))
> +		return NULL;

Typically seqcount does a re-try here.

> +
> +	return uprobe;
> +}