On Thu, 27 Jul 2023 at 08:44, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: > > This reads a little oddly, perhaps because it's a fragment from a larger > piece of code. Yes. As Jann already said, this is basically a preparatory step in a much longer sequence, and the code simply wants to make sure that any later code (possibly quite a bit later) will not see a NULL value. I do believe it happens to be safe to use READ_ONCE() for a number of reasons, but I will argue that we should *never* use a bare READ_ONCE if there is any actual real question about what any memory ordering might be. And yes, the RCU code obviously does use READ_ONCE(), so that's why I say "bare" - the RCU code wraps it in helper accessors with strict semantics. The reason I think it would be techncially *safe* to do here is that (a) code like this: if (READ_ONCE(..)) may end up having the conditional speculated by the CPU, and have the actual read done without any ordering by the CPU, but we do have *one* guarantee: the memory load instruction will be before the conditional branch (or whatever) in the instruction stream. So the code may be *executed* out of order, but we know the memory load can not be moved after the conditional (whatever form that conditional then takes) by a compiler. (We do have various barriers like "rcu_read_unlock()" that guarantees that the READ_ONCE() couldn't be moved lmuch later even in the absence of the conditional, but we can ignore that). (b) the later use of the anon_vma (that depends on the value being stable) is *so* much later, and behind things that the compiler sees as barriers (again, that rcu_read_unlock() acts at a minimum as a compiler barrier) that any subsequent use would not have its load moved down to before the READ_ONCE() in the instruction stream. Again, this is purely a "location in the instruction stream" ordering argument, not a "execution order" ordering argument. And finally (c) I do think that all our architectures have the rules that when you read from the *same* memory location from the same CPU, the accesses are ordered. Now, I didn't actually find (c) spelled out anywhere, but even alpha - the worst memory ordering ever devised - had that particular rule (the alpha architecture manual calls it "Location Access Order"). Now, with that said, I did argue to Jann that we should use smp_store_release and smp_load_acquire pairs here, because honestly, the above (a)-(c) argument is too damn subtle for me, and I don't think this particular use requires it. With smp_load_acquire(), you don't need any of the (a)-(c) rules. The rule is "the load is done before any subsequent memory operations". End of story. So while I do think READ_ONCE() is sufficient here, I actually think that once you start going down that path you can argue that READ_ONCE() is actually entirely unnecessary, because we also have other ordering rules that mean that the compiler can't really do anythinig else even *without* the READ_ONCE(). End result: I can trivially extend the (a)-(c) to say "even READ_ONCE() isn't strictly necessary here, because even any access tearing - which won't happen anyway - wouldn't actually change the result. So if we want to make it *obvious* that it's safe, we should use smp_load_acquire(). And if we do find that there are situations where we care so much about the (generally fairly cheap) possible additional synchronization, and we really want to use READ_ONCE() rather than smp_load_acquire(), I'd rather try to wrap a READ_ONCE in some kind of better abstraction (maybe make the conditional part of the operation, and make it clear that you are doing a "one-time test which depends on the same-location rule". Do we even have the same-location rule in the LKMM? Linus