On Sun, Sep 15, 2024 at 4:49 PM Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
>
> On 09/09, Andrii Nakryiko wrote:
> >
> > Currently put_uprobe() might trigger mutex_lock()/mutex_unlock(), which
> > makes it unsuitable to be called from more restricted context like softirq.
> >
> > Let's make put_uprobe() agnostic to the context in which it is called,
> > and use work queue to defer the mutex-protected clean up steps.
>
> ...
>
> > +static void uprobe_free_deferred(struct work_struct *work)
> > +{
> > +	struct uprobe *uprobe = container_of(work, struct uprobe, work);
> > +
> > +	/*
> > +	 * If application munmap(exec_vma) before uprobe_unregister()
> > +	 * gets called, we don't get a chance to remove uprobe from
> > +	 * delayed_uprobe_list from remove_breakpoint(). Do it here.
> > +	 */
> > +	mutex_lock(&delayed_uprobe_lock);
> > +	delayed_uprobe_remove(uprobe, NULL);
> > +	mutex_unlock(&delayed_uprobe_lock);
> > +
> > +	kfree(uprobe);
> > +}
> > +
> >  static void uprobe_free_rcu(struct rcu_head *rcu)
> >  {
> >  	struct uprobe *uprobe = container_of(rcu, struct uprobe, rcu);
> >
> > -	kfree(uprobe);
> > +	INIT_WORK(&uprobe->work, uprobe_free_deferred);
> > +	schedule_work(&uprobe->work);
> >  }
>
> This is still wrong afaics...
>
> If put_uprobe() can be called from softirq (after the next patch), then
> put_uprobe() and all other users of uprobes_treelock should use
> write_lock_bh/read_lock_bh to avoid the deadlock.

Ok, I see the problem, that's unfortunate.

I see three ways to handle that:

1) keep put_uprobe() as is, and instead do schedule_work() from the
timer thread to postpone put_uprobe(). (but I'm not a big fan of this)
2) move uprobes_treelock part of put_uprobe() into rcu callback, I
think it has no bearing on correctness, uprobe_is_active() is there
already to handle races between putting uprobe and removing it from
uprobes_tree (I prefer this one over #1 )
3) you might like this the most ;) I think I can simplify
hprobes_expire() from patch #2 to not need put_uprobe() at all, if I
protect uprobe lifetime with non-sleepable
rcu_read_lock()/rcu_read_unlock() and perform try_get_uprobe() as the
very last step after cmpxchg() succeeded.

I'm leaning towards #3, but #2 seems fine to me as well.

>
> To be honest... I simply can't force myself to even try to read 2/3 ;) I'll
> try to do this later, but I am sure I will never like it, sorry.

This might sound rude, but the goal here is not to make you like it :)
The goal is to improve performance with minimal complexity. And I'm
very open to any alternative proposals as to how to make uretprobes
RCU-protected to avoid refcounting in the hot path.

I think #3 proposal above will make it a bit more palatable (but there
is still locklessness, cmpxchg, etc, I see no way around that,
unfortunately).

>
> Oleg.
>
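
For readers following option 3, here is a rough userspace sketch of the
ordering it describes: win the state transition with a compare-and-exchange
first, and only then take the reference, so that no path ever has to drop
one. This is not the code from the patch series. The struct names, the
STATE_* values, the fallback to STATE_GONE, and the read_lock_stub() /
read_unlock_stub() placeholders (standing in for rcu_read_lock() /
rcu_read_unlock()) are all assumptions made for illustration, and try_get()
only models the increment-unless-zero behaviour that try_get_uprobe() is
presumed to have.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct uprobe: only the refcount matters here. */
struct obj {
	atomic_int ref;
};

/* Hypothetical handle states; the names are illustrative only. */
enum { STATE_LEASED, STATE_STABLE, STATE_GONE };

struct handle {
	struct obj *obj;
	atomic_int state;
};

/* Userspace model of "increment unless zero": fails once ref has hit 0. */
static bool try_get(struct obj *o)
{
	int old = atomic_load(&o->ref);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&o->ref, &old, old + 1))
			return true;
	}
	return false;
}

/* No-op placeholders; the kernel would use rcu_read_lock()/rcu_read_unlock(). */
static void read_lock_stub(void) {}
static void read_unlock_stub(void) {}

/*
 * Expire a leased handle. The state change is decided first and the
 * reference is taken last, so no branch needs a matching "put".
 * Returns true if the handle now owns a reference.
 */
static bool handle_expire(struct handle *h)
{
	int expected = STATE_LEASED;
	bool got;

	assert(h && h->obj);

	read_lock_stub();	/* assumed to keep h->obj memory valid */
	if (!atomic_compare_exchange_strong(&h->state, &expected, STATE_STABLE)) {
		read_unlock_stub();
		return false;	/* another party already changed the state */
	}
	got = try_get(h->obj);	/* very last step, after the exchange succeeded */
	if (!got)
		atomic_store(&h->state, STATE_GONE);	/* assumed fallback */
	read_unlock_stub();
	return got;
}
```

Because the reference is acquired only after the exchange has succeeded,
the losing side of a race returns without touching the refcount at all,
which is what would make a put_uprobe() call unnecessary in this model.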