On Wed, Nov 08, 2023 at 11:56:25AM +0100, Uladzislau Rezki wrote: > > > > > > Do you have something that can easily trigger it? I mean some proposal > > > or steps to test. Probably i should try what you wrote, regarding > > > toggling from user space. > > > > > > > I can imagine ways around this, but they are a bit ugly. They end > > > > up being things like recording a timestamp on every sysfs change to > > > > rcu_normal, and then using that timestamp to deduce whether there could > > > > possibly have been sysfs activity on rcu_normal in the meantime. > > > > > > > > It feels like it should be so easy... ;-) > > > > > > > Hmm.. Yes it requires more deep analysis :) > > > > Maybe make that WARN_ONCE() condition also test a separate Kconfig > > option that depends on both DEBUG_KERNEL and RCU_EXPERT? > > > Do you mean to introduce a new Kconfig? For example CONFIG_DEBUG_SRS: > > <snip> > config DEBUG_SRS > bool "Provide debugging asserts for normal synchronize_rcu() call" > depends on DEBUG_KERNEL && RCU_EXPERT > help > ... > <snip> Yes, in kernel/rcu/Kconfig.debug. But please use a more self-explanatory name, keeping in mind that Kconfig options are a global namespace. Please at least have an "RCU" in there somewhere. ;-) > > > > > I was thinking about read_lock()/write_lock() since we have many readers > > > > > and only one writer. But i do not really like it either. > > > > > > > > This might be a hint that we should have multiple lists, perhaps one > > > > per CPU. Or lock contention could be used to trigger the transition > > > > from a single list to multiple lists. as is done in SRCU and tasks RCU. > > > > > > > I do not consider to be a sync call as heavily used as other callbacks > > > which require several workers to handle, IMHO. From the other hand my > > > experiments show that to handle 60K-100K by NOCB gives even worse results. > > > > > > > > > > > But I bet that there are several ways to make things work. > > > > > > > Right. The main concern with read_lock()/write_lock() is a PREEMPT_RT > > > kernels where it is a rt-mutex. It would be good to avoid of using any > > > blocking in the gp-kthread since it is a gp driver. > > > > RCU is pretty low-level, so it is OK with a raw spinlock for the list > > manipulation. But only the list manipulation itself. Perhaps you are > > worried about lock contention, but in that case, there is also the issue > > of memory contention for the llist code. > > > I do not consider a lock nor memory contention as an issue here. Whereas > blocking on rt-mutex in the gp-kthread i consider as "not good to go with". > raw-spinlocks are OK, but it is a per-cpu or per-node approach which i tend > to avoid, if not, then probably per-cpu-or-node and merge everything into > one llist to offload by one worker. If you have a large enough system and a high enough rate of calls to synchronize_rcu(), something is going to break. The current llist approach will suffer from memory contention and high cache-miss rate, thus also suffering excessive CPU consumption. A sleeplock will (as you say) suffer from excessive blocking. A spinlock (including raw spinlocks on PREEMPT_RT) will suffer from excessive spinning and CPU consumption. Which is why this optimization must continue to be default-off. I agree that a change to multiple queues, perhaps up to per-CPU queueing, would be needed to eliminate the possibility of these problems and thus (hopefully!) make this safe for being a default-on option. It might even need to be dynamic, as for SRCU and Tasks RCU. But neither of these more-complex options need to be implemented in the initial version. Thanx, Paul