> On Jul 27, 2023, at 7:57 AM, Will Deacon <will@xxxxxxxxxx> wrote:
>
> On Thu, Jul 27, 2023 at 04:39:34PM +0200, Jann Horn wrote:
>> On Thu, Jul 27, 2023 at 1:19 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
>>>
>>> On Wed, Jul 26, 2023 at 11:41:01PM +0200, Jann Horn wrote:
>>>> Hi!
>>>>
>>>> Patch 1 here is a straightforward fix for a race in per-VMA locking code
>>>> that can lead to use-after-free; I hope we can get this one into
>>>> mainline and stable quickly.
>>>>
>>>> Patch 2 is a fix for what I believe is a longstanding memory ordering
>>>> issue in how vma->anon_vma is used across the MM subsystem; I expect
>>>> that this one will have to go through a few iterations of review and
>>>> potentially rewrites, because memory ordering is tricky.
>>>> (If someone else wants to take over patch 2, I would be very happy.)
>>>>
>>>> These patches don't really belong together all that much, I'm just
>>>> sending them as a series because they'd otherwise conflict.
>>>>
>>>> I am CCing:
>>>>
>>>> - Suren because patch 1 touches his code
>>>> - Matthew Wilcox because he is also currently working on per-VMA
>>>>   locking stuff
>>>> - all the maintainers/reviewers for the Kernel Memory Consistency Model
>>>>   so they can help figure out the READ_ONCE() vs smp_load_acquire()
>>>>   thing
>>>
>>> READ_ONCE() has weaker ordering properties than smp_load_acquire().
>>>
>>> For example, given a pointer gp:
>>>
>>>     p = whichever(gp);
>>>     a = 1;
>>>     r1 = p->b;
>>>     if ((uintptr_t)p & 0x1)
>>>         WRITE_ONCE(b, 1);
>>>     WRITE_ONCE(c, 1);
>>>
>>> Leaving aside the "&" needed by smp_load_acquire(), if "whichever" is
>>> "READ_ONCE", then the load from p->b and the WRITE_ONCE() to "b" are
>>> ordered after the load from gp (the former due to an address dependency
>>> and the latter due to a (fragile) control dependency). The compiler
>>> is within its rights to reorder the store to "a" to precede the load
>>> from gp.
>>> The compiler is forbidden from reordering the store to "c"
>>> with the load from gp (because both are volatile accesses), but the CPU
>>> is completely within its rights to do this reordering.
>>>
>>> But if "whichever" is "smp_load_acquire()", all four of the subsequent
>>> memory accesses are ordered after the load from gp.
>>>
>>> Similarly, for WRITE_ONCE() and smp_store_release():
>>>
>>>     p = READ_ONCE(gp);
>>>     r1 = READ_ONCE(gi);
>>>     r2 = READ_ONCE(gj);
>>>     a = 1;
>>>     WRITE_ONCE(b, 1);
>>>     if (r1 & 0x1)
>>>         whichever(p->q, r2);
>>>
>>> Again leaving aside the "&" needed by smp_store_release(), if "whichever"
>>> is WRITE_ONCE(), then the load from gp, the load from gi, and the load
>>> from gj are all ordered before the store to p->q (by address dependency,
>>> control dependency, and data dependency, respectively). The store to "a"
>>> can be reordered with the store to p->q by the compiler. The store to
>>> "b" cannot be reordered with the store to p->q by the compiler (again,
>>> both are volatile), but the CPU is free to reorder them, especially when
>>> whichever() is implemented as a conditional store.
>>>
>>> But if "whichever" is "smp_store_release()", all five of the earlier
>>> memory accesses are ordered before the store to p->q.
>>>
>>> Does that help, or am I missing the point of your question?
>>
>> My main question is how permissible/ugly you think the following use
>> of READ_ONCE() would be, and whether you think it ought to be an
>> smp_load_acquire() instead.
>>
>> Assume that we are holding some kind of lock that ensures that the
>> only possible concurrent update to "vma->anon_vma" is that it changes
>> from a NULL pointer to a non-NULL pointer (using smp_store_release()).
>>
>>
>> if (READ_ONCE(vma->anon_vma) != NULL) {
>>     // we now know that vma->anon_vma cannot change anymore
>>
>>     // access the same memory location again with a plain load
>>     struct anon_vma *a = vma->anon_vma;
>>
>>     // this needs to be address-dependency-ordered against one of
>>     // the loads from vma->anon_vma
>>     struct anon_vma *root = a->root;
>> }
>>
>>
>> Is this fine? If it is not fine just because the compiler might
>> reorder the plain load of vma->anon_vma before the READ_ONCE() load,
>> would it be fine after adding a barrier() directly after the
>> READ_ONCE()?
>
> I'm _very_ wary of mixing READ_ONCE() and plain loads to the same variable,
> as I've run into cases where you have sequences such as:
>
>     // Assume *ptr is initially 0 and somebody else writes it to 1
>     // concurrently
>
>     foo = *ptr;
>     bar = READ_ONCE(*ptr);
>     baz = *ptr;
>
> and you can get foo == baz == 0 but bar == 1 because the compiler only
> ends up reading from memory twice.
>
> That was the root cause behind f069faba6887 ("arm64: mm: Use READ_ONCE
> when dereferencing pointer to pte table"), which was very unpleasant to
> debug.

Interesting. I wonder if you considered adding to READ_ONCE() something
like:

    asm volatile("" : "+g" (x) );

So later loads (such as baz = *ptr) would reload the updated value.
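
To make the idea concrete, here is a minimal userspace sketch, assuming
GCC or Clang (statement expressions and __typeof__). The names
READ_ONCE_CLOBBER() and sample_three_loads() are made up for illustration;
this is not the kernel's READ_ONCE() and has not been checked against any
particular compiler's code generation.

```c
/*
 * Hypothetical variant of READ_ONCE(): do the volatile load, then use an
 * empty asm statement that claims to read and write x. The intent is that
 * the compiler can no longer reuse a value of x it loaded earlier, so
 * later plain loads have to go back to memory.
 */
#define READ_ONCE_CLOBBER(x)						\
({									\
	__typeof__(x) __val = *(const volatile __typeof__(x) *)&(x);	\
	__asm__ __volatile__("" : "+g" (x));				\
	__val;								\
})

/* Same access pattern as the foo/bar/baz example quoted above. */
int sample_three_loads(int *ptr)
{
	int foo = *ptr;				/* plain load */
	int bar = READ_ONCE_CLOBBER(*ptr);	/* volatile load + asm */
	int baz = *ptr;				/* plain load after the asm */

	return foo + bar + baz;
}
```

With no concurrent writer and *ptr == 1, sample_three_loads() returns 3;
the interesting part is only whether baz gets reloaded when there is a
writer. One caveat, as far as I understand the constraints: "+g" lets the
compiler pick a register for the operand, in which case it has to write
the value back to x after the asm, which would add a store to the shared
location. A "+m" constraint would avoid that.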