Hi Frederic,

On Fri, Nov 13, 2020 at 01:13:15PM +0100, Frederic Weisbecker wrote:
> This keeps growing up. Rest assured, most of it is debug code and sanity
> checks.
>
> Boqun Feng found that holding rnp lock while updating the offloaded
> state of an rdp isn't needed, and he was right despite my initial
> reaction. The sites that read the offloaded state while holding the rnp
> lock are actually protected because they read it locally in a non
> preemptible context.
>
> So I removed the rnp lock in "rcu/nocb: De-offloading CB". And just to
> make sure I'm not missing something, I added sanity checks that ensure
> we always read the offloaded state in a safe way (3 last patches).
>
> Still passes TREE01 (but I had to fight!)
>
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
> 	rcu/nocb-toggle-v4
>
> HEAD: 579e15efa48fb6fc4ecf14961804051f385807fe
>

This whole series looks good to me. I've also run a test, and so far
everything seems to be working ;-)

Here is my setup for the test:

I'm using an ARM64 guest (running on Hyper-V) to do the test, and the
guest has 8 VCPUs. The code I'm using is v5.10-rc6 + Hyper-V ARM64 guest
support [1] + your patchset (I actually did a merge from your
rcu/nocb-toggle-v5 branch, because IIUC some modification for rcutorture
is still in Paul's tree). I compiled with my normal configuration for
ARM64 Hyper-V guest plus TREE01, booted the kernel with:

	ignore_loglevel rcutree.gp_preinit_delay=3 rcutree.gp_init_delay=3 rcutree.gp_cleanup_delay=3 rcu_nocbs=0-1,3-7

and ran rcutorture via:

	modprobe rcutorture nocbs_nthreads=8 nocbs_toggle=1000 fwd_progress=0

I ran rcutorture twice, one run lasting a week or so and the other a day
or two, and I haven't observed any problem so far. The latest test
summary is:

	[...]
	rcu-torture: rtc: 00000000f794686f ver: 2226396 tfle: 0 rta: 2226397 rtaf: 0 rtf: 2226385 rtmbe: 0 rtmbkf: 0/1390141 rtbe: 0 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 181415346 onoff: 0/0:0/0 -1,0:-1,0 0:0 (HZ=1000) barrier: 0/0:0 read-exits: 108102 nocb-toggles: 306964:306974

Is there anything I'm missing for a useful test? Do you have other setup
(kernel cmdline or rcutorture parameters) that you want me to try?

Regards,
Boqun

> Thanks,
> 	Frederic
> ---
>
> Frederic Weisbecker (19):
>   rcu/nocb: Turn enabled/offload states into a common flag
>   rcu/nocb: Provide basic callback offloading state machine bits
>   rcu/nocb: Always init segcblist on CPU up
>   rcu/nocb: De-offloading CB kthread
>   rcu/nocb: Don't deoffload an offline CPU with pending work
>   rcu/nocb: De-offloading GP kthread
>   rcu/nocb: Re-offload support
>   rcu/nocb: Shutdown nocb timer on de-offloading
>   rcu: Flush bypass before setting SEGCBLIST_SOFTIRQ_ONLY
>   rcu/nocb: Set SEGCBLIST_SOFTIRQ_ONLY at the very last stage of de-offloading
>   rcu/nocb: Only cond_resched() from actual offloaded batch processing
>   rcu/nocb: Process batch locally as long as offloading isn't complete
>   rcu/nocb: Locally accelerate callbacks as long as offloading isn't complete
>   tools/rcutorture: Support nocb toggle in TREE01
>   rcutorture: Remove weak nocb declarations
>   rcutorture: Export nocb (de)offloading functions
>   cpu/hotplug: Add lockdep_is_cpus_held()
>   timer: Add timer_curr_running()
>   rcu/nocb: Detect unsafe checks for offloaded rdp
>
>
>  include/linux/cpu.h                              |   1 +
>  include/linux/rcu_segcblist.h                    | 119 +++++-
>  include/linux/rcupdate.h                         |   4 +
>  include/linux/timer.h                            |   2 +
>  kernel/cpu.c                                     |   7 +
>  kernel/rcu/rcu_segcblist.c                       |  13 +-
>  kernel/rcu/rcu_segcblist.h                       |  45 ++-
>  kernel/rcu/rcutorture.c                          |   3 -
>  kernel/rcu/tree.c                                |  49 ++-
>  kernel/rcu/tree.h                                |   2 +
>  kernel/rcu/tree_plugin.h                         | 416 +++++++++++++++++++--
>  kernel/time/timer.c                              |  13 +
>  .../selftests/rcutorture/configs/rcu/TREE01.boot |   4 +-
>  13 files changed, 614 insertions(+), 64 deletions(-)