On Mon, Oct 30, 2023 at 09:21:38AM +0100, Peter Zijlstra wrote:
> On Fri, Oct 27, 2023 at 04:41:30PM -0700, Paul E. McKenney wrote:
> > On Sat, Oct 28, 2023 at 12:46:28AM +0200, Peter Zijlstra wrote:
>
> > > Nah, this is more or less what I feared. I just worry people will come
> > > around and put WRITE_ONCE() on the other end. I don't think that'll buy
> > > us much. Nor do I think the current READ_ONCE()s actually matter.
> >
> > My friend, you trust compilers more than I ever will. ;-)
>
> Well, we only use the values {0,1,2}, that's contained in the first
> byte. Are we saying compiler will not only byte-split but also
> bit-split the loads?
>
> But again, lacking the WRITE_ONCE() counterpart, this READ_ONCE() isn't
> getting you anything, and if you really worried about it, shouldn't you
> have proposed a patch making it all WRITE_ONCE() back when you did this
> tasks-rcu stuff?

There are not all that many of them. If such a WRITE_ONCE() patch would
be welcome, I would be happy to put it together.

> > > But perhaps put a comment there, that we don't care for the races and
> > > only need to observe a 0 once or something.
> >
> > There are these two passages in the big lock comment preceding the
> > RCU Tasks code:
>
> > // rcu_tasks_pregp_step():
> > //	Invokes synchronize_rcu() in order to wait for all in-flight
> > //	t->on_rq and t->nvcsw transitions to complete. This works because
> > //	all such transitions are carried out with interrupts disabled.
>
> > Does that suffice, or should we add more?
>
> Probably sufficient. If one were to have used the search option :-)
>
> Anyway, this brings me to nvcsw, exact same problem there, except
> possibly worse, because now we actually do care about the full word.
>
> No WRITE_ONCE() write side, so the READ_ONCE() don't help against
> store-tearing (however unlikely that actually is in this case).

Again, if such a WRITE_ONCE() patch would be welcome, I would be happy
to put it together.

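For reference, the pairing under discussion can be sketched roughly as
follows. This is a userspace illustration only: the two macros are
simplified volatile-cast stand-ins rather than the kernel's definitions,
and struct task_sketch and the sketch_*() helpers are hypothetical names
invented for this example.

```c
/* Simplified stand-ins (assumption): volatile casts that ask the compiler
 * for exactly one access.  They rely on the GCC/Clang __typeof__ extension. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical stand-in for the two task fields discussed in this thread. */
struct task_sketch {
	int on_rq;            /* only the values 0, 1 and 2 are used */
	unsigned long nvcsw;  /* voluntary context-switch count, full word matters */
};

/* Write side: marked stores, the counterpart the thread says is missing. */
static void sketch_set_on_rq(struct task_sketch *t, int val)
{
	WRITE_ONCE(t->on_rq, val);
}

static void sketch_note_voluntary_switch(struct task_sketch *t)
{
	WRITE_ONCE(t->nvcsw, t->nvcsw + 1);
}

/* Read side: marked loads, as the lockless readers already do. */
static int sketch_get_on_rq(struct task_sketch *t)
{
	return READ_ONCE(t->on_rq);
}

static unsigned long sketch_get_nvcsw(struct task_sketch *t)
{
	return READ_ONCE(t->nvcsw);
}
```

With both sides marked, neither the store nor the load is left for the
compiler to split or fuse, which is the point of pairing them.
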
> Also, I'm not entirely sure I see why you need on_rq and nvcsw. Would
> not nvcsw increasing be enough to know it passed through a quiescent
> state? Are you trying to say that if nvcsw hasn't advanced but on_rq is
> still 0, nothing has changed and you can proceed?
>
> Or rather, looking at the code it seems to use the inverse, if on_rq, nvcsw
> must change.
>
> Makes sense I suppose, no point waiting for nvcsw to change if the task
> never did anything.

Exactly, the on_rq check is needed to avoid excessively long grace
periods for tasks that are blocked for long periods of time.

							Thanx, Paul
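
The "if on_rq, nvcsw must change" rule above can be sketched as below.
This is a single-threaded illustration of the logic as described in this
thread, not the kernel's code: struct task_sketch, its nvcsw_snap field,
and both helpers are hypothetical, and the marked accesses are left out
for brevity.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the task state a grace period looks at. */
struct task_sketch {
	int on_rq;                 /* nonzero while the task is runnable */
	unsigned long nvcsw;       /* voluntary context-switch count */
	unsigned long nvcsw_snap;  /* value recorded at grace-period start */
};

/* At the start of a grace period, remember the current switch count. */
static void sketch_snapshot(struct task_sketch *t)
{
	t->nvcsw_snap = t->nvcsw;
}

/* Must the grace period keep waiting on this task? */
static bool sketch_still_holdout(const struct task_sketch *t)
{
	if (!t->on_rq)
		return false;  /* blocked task: no point waiting on it */
	if (t->nvcsw != t->nvcsw_snap)
		return false;  /* it switched voluntarily since the snapshot */
	return true;           /* runnable and no voluntary switch yet */
}
```

Without the on_rq test, a task that stays blocked would never bump nvcsw
and would hold up the grace period for as long as it sleeps.
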