On Wed, Oct 23, 2024 at 12:22 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Thu, Oct 10, 2024 at 01:56:44PM -0700, Andrii Nakryiko wrote:
>
> > Suggested-by: Matthew Wilcox <willy@xxxxxxxxxxxxx>
>
> I'm fairly sure I've suggested much the same :-)

I'll add another Suggested-by, didn't mean to rob anyone of credits :)

> > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> > ---
> >  kernel/events/uprobes.c | 50 +++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 50 insertions(+)
> >
> > diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> > index fa1024aad6c4..9dc6e78975c9 100644
> > --- a/kernel/events/uprobes.c
> > +++ b/kernel/events/uprobes.c
> > @@ -2047,6 +2047,52 @@ static int is_trap_at_addr(struct mm_struct *mm, unsigned long vaddr)
> >  	return is_trap_insn(&opcode);
> >  }
> >
> > +static struct uprobe *find_active_uprobe_speculative(unsigned long bp_vaddr)
> > +{
> > +	struct mm_struct *mm = current->mm;
> > +	struct uprobe *uprobe = NULL;
> > +	struct vm_area_struct *vma;
> > +	struct file *vm_file;
> > +	struct inode *vm_inode;
> > +	unsigned long vm_pgoff, vm_start;
> > +	loff_t offset;
> > +	long seq;
> > +
> > +	guard(rcu)();
> > +
> > +	if (!mmap_lock_speculation_start(mm, &seq))
> > +		return NULL;
>
> So traditional seqcount assumed non-preemptible lock sides and would
> spin-wait for the LSB to clear, but for PREEMPT_RT we added preemptible
> seqcount support and that takes the lock to wait, which in this case is
> exactly the same as returning NULL and doing the lookup holding
> mmap_lock, so yeah.
>

yep, and on configurations with CONFIG_PER_VMA_LOCK=n this will always
return false

> > +
> > +	vma = vma_lookup(mm, bp_vaddr);
> > +	if (!vma)
> > +		return NULL;
> > +
> > +	/* vm_file memory can be reused for another instance of struct file,
>
> Comment style nit.

mechanical memory, sorry, missed this one

> > +	 * but can't be freed from under us, so it's safe to read fields from
> > +	 * it, even if the values are some garbage values; ultimately
> > +	 * find_uprobe_rcu() + mmap_lock_speculation_end() check will ensure
> > +	 * that whatever we speculatively found is correct
> > +	 */
> > +	vm_file = READ_ONCE(vma->vm_file);
> > +	if (!vm_file)
> > +		return NULL;
> > +
> > +	vm_pgoff = data_race(vma->vm_pgoff);
> > +	vm_start = data_race(vma->vm_start);
> > +	vm_inode = data_race(vm_file->f_inode);
>
> So... seqcount has kcsan annotations other than data_race(). I suppose
> this works, but it all feels like a bad copy with random changes.

I'm not sure what this means... Do I need to change anything? Drop
data_race()? Use READ_ONCE()? Do nothing?

> > +
> > +	offset = (loff_t)(vm_pgoff << PAGE_SHIFT) + (bp_vaddr - vm_start);
> > +	uprobe = find_uprobe_rcu(vm_inode, offset);
> > +	if (!uprobe)
> > +		return NULL;
> > +
> > +	/* now double check that nothing about MM changed */
> > +	if (!mmap_lock_speculation_end(mm, seq))
> > +		return NULL;
>
> Typically seqcount does a re-try here.

I'd like to keep it simple, we have fallback to locked version in case
of a race

> > +
> > +	return uprobe;
> > +}