On Wed, Mar 06, 2024 at 10:06:21PM -0500, Mathieu Desnoyers wrote:
> On 2024-03-06 21:43, Linus Torvalds wrote:
> [...]
> >
> > Honestly, this all makes me think that we'd be *much* better off
> > showing the real "handoff" with smp_store_release() and
> > smp_load_acquire().
>
> We've done something similar in liburcu (userspace code) to allow
> Thread Sanitizer to understand the happens-before relationships
> within the RCU implementations and lock-free data structures.
>
> Moving to load-acquire/store-release (C11 model in our case)
> allowed us to provide enough happens-before relationship for
> Thread Sanitizer to understand what is happening under the
> hood in liburcu and perform relevant race detection of user
> code.

Good point!  In the kernel, though, KCSAN understands the
Linux-kernel memory model, and so we don't get that sort of false
positive.

> As far as the WRITE_ONCE(x, READ_ONCE(x) + 1) pattern
> is concerned, the only valid use-case I can think of is
> split counters or RCU implementations where there is a
> single updater doing the increment, and one or more
> concurrent reader threads that need to snapshot a
> consistent value with READ_ONCE().

It is wrong here.  OK, not wrong from a functional viewpoint, but it
is at best very confusing.  I am applying patches to fix this.  I
will push out a new "dev" branch on -rcu that will make this function
appear as shown below.

So what would you use that pattern for?

One possibility is a per-CPU statistical counter in userspace on a
fastpath, in cases where losing the occasional count is OK.  Then
learning your CPU (and possibly being immediately migrated to some
other CPU), READ_ONCE() of the count, increment, and WRITE_ONCE()
might (or might not) make sense.

I suppose the same in the kernel if there was a fastpath so extreme
you could not afford to disable preemption.

At best, very niche.

Or am I suffering a failure of imagination yet again?  ;-)

							Thanx, Paul

------------------------------------------------------------------------

static bool rcu_torture_pipe_update_one(struct rcu_torture *rp)
{
	int i;
	struct rcu_torture_reader_check *rtrcp = READ_ONCE(rp->rtort_chkp);

	if (rtrcp) {
		WRITE_ONCE(rp->rtort_chkp, NULL);
		smp_store_release(&rtrcp->rtc_ready, 1); // Pair with smp_load_acquire().
	}
	i = rp->rtort_pipe_count;
	if (i > RCU_TORTURE_PIPE_LEN)
		i = RCU_TORTURE_PIPE_LEN;
	atomic_inc(&rcu_torture_wcount[i]);
	WRITE_ONCE(rp->rtort_pipe_count, i + 1);
	ASSERT_EXCLUSIVE_WRITER(rp->rtort_pipe_count);
	if (i + 1 >= RCU_TORTURE_PIPE_LEN) {
		rp->rtort_mbtest = 0;
		return true;
	}
	return false;
}
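
For context, the smp_store_release() above is only half of the
handoff.  The matching consumer might look something like the sketch
below.  This is not the actual rcutorture reader-side code; the
function name rtc_check_ready() is made up here, and only ->rtc_ready
and the release/acquire pairing are taken from the listing above:

static bool rtc_check_ready(struct rcu_torture_reader_check *rtrcp)
{
	// Pairs with the smp_store_release() in
	// rcu_torture_pipe_update_one():  a reader observing
	// ->rtc_ready == 1 is guaranteed to also observe all
	// stores to *rtrcp that preceded that release store.
	return smp_load_acquire(&rtrcp->rtc_ready);
}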
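
To make the userspace per-CPU statistical counter concrete, the
fastpath might look something like the following.  This is a
hypothetical sketch, not liburcu or kernel code: the names are made
up, it assumes sched_getcpu(), and it deliberately tolerates the
occasional increment lost to migration or to two threads hitting the
same slot:

#define _GNU_SOURCE
#include <sched.h>		/* sched_getcpu() */

#define READ_ONCE(x)	 (*(volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

#define NR_SLOTS 4096		/* Upper bound on possible CPUs. */

struct stat_slot {
	unsigned long count;
} __attribute__((aligned(64)));	/* One cache line per slot. */

static struct stat_slot statcount[NR_SLOTS];

static inline void inc_stat(void)
{
	int cpu = sched_getcpu();	/* May be stale one line later. */

	if (cpu < 0)			/* sched_getcpu() can fail; */
		cpu = 0;		/* fold errors into slot 0. */
	/* A racing increment on this slot (say, after migration)
	 * can be lost, which is OK by assumption. */
	WRITE_ONCE(statcount[cpu].count,
		   READ_ONCE(statcount[cpu].count) + 1);
}

/* Reader: snapshots a consistent, possibly slightly stale, sum. */
static unsigned long read_stat(void)
{
	unsigned long sum = 0;

	for (int cpu = 0; cpu < NR_SLOTS; cpu++)
		sum += READ_ONCE(statcount[cpu].count);
	return sum;
}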
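
And the liburcu-style move to the C11 memory model that Mathieu
describes might look roughly like this publish/consume pair (again a
sketch, not actual liburcu code).  The release/acquire pair is
precisely the happens-before edge that Thread Sanitizer needs in
order to avoid flagging the payload access as a data race:

#include <stdatomic.h>
#include <stdbool.h>

struct msg {
	int payload;
	atomic_int ready;
};

static void producer(struct msg *m, int value)
{
	m->payload = value;		/* Plain store, published by... */
	atomic_store_explicit(&m->ready, 1,
			      memory_order_release); /* ...this release. */
}

static bool consumer(struct msg *m, int *out)
{
	if (!atomic_load_explicit(&m->ready, memory_order_acquire))
		return false;	/* Not published yet. */
	*out = m->payload;	/* Happens-after the producer's store. */
	return true;
}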