On Tue, Feb 25, 2025 at 3:46 AM Breno Leitao <leitao@xxxxxxxxxx> wrote: > > Hello Andrii, > > On Mon, Feb 24, 2025 at 02:23:51PM -0800, Andrii Nakryiko wrote: > > On Mon, Feb 24, 2025 at 4:23 AM Breno Leitao <leitao@xxxxxxxxxx> wrote: > > > > > > Hello Andrii, > > > > > > On Wed, Oct 23, 2024 at 09:41:59PM -0700, Andrii Nakryiko wrote: > > > > > > > > +static struct uprobe* hprobe_expire(struct hprobe *hprobe, bool get) > > > > +{ > > > > + enum hprobe_state hstate; > > > > + > > > > + /* > > > > + * return_instance's hprobe is protected by RCU. > > > > + * Underlying uprobe is itself protected from reuse by SRCU. > > > > + */ > > > > + lockdep_assert(rcu_read_lock_held() && srcu_read_lock_held(&uretprobes_srcu)); > > > > > > I am hitting this warning in d082ecbc71e9e ("Linux 6.14-rc4") on > > > aarch64. I suppose this might happen on x86 as well, but I haven't > > > tested. > > > > > > WARNING: CPU: 28 PID: 158906 at kernel/events/uprobes.c:768 hprobe_expire (kernel/events/uprobes.c:825) > > > > > > Call trace: > > > hprobe_expire (kernel/events/uprobes.c:825) (P) > > > uprobe_copy_process (kernel/events/uprobes.c:691 kernel/events/uprobes.c:2103 kernel/events/uprobes.c:2142) > > > copy_process (kernel/fork.c:2636) > > > kernel_clone (kernel/fork.c:2815) > > > __arm64_sys_clone (kernel/fork.c:? kernel/fork.c:2926 kernel/fork.c:2926) > > > invoke_syscall (arch/arm64/kernel/syscall.c:35 arch/arm64/kernel/syscall.c:49) > > > do_el0_svc (arch/arm64/kernel/syscall.c:139 arch/arm64/kernel/syscall.c:151) > > > el0_svc (arch/arm64/kernel/entry-common.c:165 arch/arm64/kernel/entry-common.c:178 arch/arm64/kernel/entry-common.c:745) > > > el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:797) > > > el0t_64_sync (arch/arm64/kernel/entry.S:600) > > > > > > I broke down that warning, and the problem is on related to > > > rcu_read_lock_held(), since RCU read lock does not seem to be held in > > > this path. > > > > > > Reading this code, RCU read lock seems to protect old hprobe, which > > > doesn't seem so. > > > > > > I am wondering if we need to protect it properly, something as: > > > > > > @@ -2089,7 +2092,9 @@ static int dup_utask(struct task_struct *t, struct uprobe_task *o_utask) > > > return -ENOMEM; > > > > > > /* if uprobe is non-NULL, we'll have an extra refcount for uprobe */ > > > + rcu_read_lock(); > > > uprobe = hprobe_expire(&o->hprobe, true); > > > + rcu_write_lock(); > > > > > > > I think this is not good enough. rcu_read_lock/unlock should be around > > the entire for loop, because, technically, that return_instance can be > > freed before we even get to hprobe_expire. > > re you suggesting that we should use an RCU read lock to protect the > "traversal" of return_instances? In other words, is it currently being > traversed unsafely, given that return_instance can be freed at any time? Yes, that's what I was suggesting initially. But reading through all this again and paging in all the context I had when I was implementing this, I think this lockdep_assert(rcu_read_lock_held()) is a false positive. For timer thread, yes, we need to keep rcu_read_lock(), because timer thread doesn't own return_instance, so it has to guarantee that memory won't go away. But dup_utask() is called for current thread's return_instances, so they can't go anywhere and RCU read lock is not necessary here. So I'm going to send a patch removing part of this lockdep_assert() and will expand the comment with the actual lifetime rules. > > > So, just like we have guard(srcu)(&uretprobes_srcu); we should have > > guard(rcu)(); > > > > Except, there is that kmemdup() hidden inside dup_return_instance(), > > so we can't really do that. > > Right. kmemdup() is using GFP_KERNEL, which might sleep, so, it cannot > be called using rcu read lock. > > Thanks > --breno