On 08/19, Andrii Nakryiko wrote: > > On Mon, Aug 19, 2024 at 6:41 AM Oleg Nesterov <oleg@xxxxxxxxxx> wrote: > > > > On 08/12, Andrii Nakryiko wrote: > > > > > > Avoid taking refcount on uprobe in prepare_uretprobe(), instead take > > > uretprobe-specific SRCU lock and keep it active as kernel transfers > > > control back to user space. > > ... > > > include/linux/uprobes.h | 49 ++++++- > > > kernel/events/uprobes.c | 294 ++++++++++++++++++++++++++++++++++------ > > > 2 files changed, 301 insertions(+), 42 deletions(-) > > > > Oh. To be honest I don't like this patch. > > > > I would like to know what other reviewers think, but to me it adds too many > > complications that I can't even fully understand... > > Which parts? The atomic xchg() and cmpxchg() parts? What exactly do > you feel like you don't fully understand? Heh, everything looks too complex for me ;) Say, hprobe_expire(). It is also called by ri_timer() in softirq context, right? And it does /* We lost the race, undo our refcount bump. It can drop to zero. */ put_uprobe(uprobe); How so? If the refcount goes to zero, put_uprobe() does mutex_lock(), but it must not sleep in softirq context. Or, prepare_uretprobe() which does rcu_assign_pointer(utask->return_instances, ri); if (!timer_pending(&utask->ri_timer)) mod_timer(&utask->ri_timer, ...); Suppose that the timer was pending and it was fired right before rcu_assign_pointer(). What guarantees that prepare_uretprobe() will see timer_pending() == false? rcu_assign_pointer()->smp_store_release() is a one-way barrier. This timer_pending() check may appear to happen before rcu_assign_pointer() completes. In this (yes, theoretical) case ri_timer() can miss the new return_instance, while prepare_uretprobe() can miss the necessary mod_timer(). I think this needs another mb() in between. And I can't convince myself hprobe_expire() is correct... OK, I don't fully understand the logic, but why data_race(READ_ONCE(hprobe->leased)) ? READ_ONCE() should be enough in this case? > > As I have already mentioned in the previous discussions, we can simply kill > > utask->active_uprobe. And utask->auprobe. > > I don't have anything against that, in principle, but let's benchmark > and test that thoroughly. I'm a bit uneasy about the possibility that > some arch-specific code will do container_of() on this arch_uprobe in > order to get to uprobe, Well, struct uprobe is not "exported", the arch-specific code can't do this. Oleg.