On Fri, Sep 6, 2024 at 9:58 PM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
>
> On Fri, Sep 06, 2024 at 03:12:11PM +0200, Frederic Weisbecker wrote:
> > On Thu, Sep 05, 2024 at 11:04:22PM -0700, Paul E. McKenney wrote:
> > > On Thu, Sep 05, 2024 at 08:41:02PM +0200, Frederic Weisbecker wrote:
> > > > On Thu, Sep 05, 2024 at 08:32:16PM +0200, Frederic Weisbecker wrote:
> > > > > On Wed, Sep 04, 2024 at 06:52:36AM -0700, Paul E. McKenney wrote:
> > > > > > > Yes, I'm preparing an update for the offending patch (which has one more
> > > > > > > embarrassing issue while I'm going through it again).
> > > > > >
> > > > > > Very good, thank you!
> > > > >
> > > > > So my proposal for a replacement patch is this (to replace the patch
> > > > > of the same name in Neeraj's tree):
> > > >
> > > > FYI, the diffstat against the previous version of the same patch is as follows.
> > > > The rationale being:
> > > >
> > > > 1) rdp->nocb_cb_kthread doesn't need to be protected by nocb_gp_kthread_mutex
> > > >
> > > > 2) Once rcuoc is parked, we really _must_ observe the callback list counter decremented
> > > >    after the barrier's completion.
> > > >
> > > > 3) This fixes another issue: rcuoc must be parked _before_
> > > >    rcu_nocb_queue_toggle_rdp() is called, otherwise a nocb locked sequence
> > > >    within rcuoc would race with rcuog clearing SEGCBLIST_OFFLOADED concurrently,
> > > >    leaving the nocb locked forever.
> > >
> > > Thank you!!!
> > >
> > > Just to make sure that I understand, I apply this patch on top of
> > > Neeraj's current set of branches to get the fix, correct?
> >
> > Exactly!
>
> It passes the initial tests, an hour of 200*TREE01 and a 10-minute
> torture.sh (which Neeraj likely already ran a longer version of).  I fired

200-minute torture.sh completed successfully at my end.

- Neeraj

> off a 12-hour 200*TREE01 run and a 60-minute torture.sh for overnight.
>
> Here is hoping!  ;-)
>
>                                                         Thanx, Paul