Re: [QUESTION] Linux memory model: control dependency with bitfield

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jan 9, 2023 at 9:31 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
>
> On Mon, Jan 09, 2023 at 09:14:19AM -0800, Yang Shi wrote:
> > Hi Paul,
> >
> > Hope this email finds you are doing well. I recently ran into a
> > problem which might be related to control dependency of the memory
> > model. Conceptually, the code does (from copy_present_pte()):
> >
> > acquire mmap_lock
> > spin_lock
> > ...
> > clear bit (a bit in page flags)
> > ...
> > VM_BUG_ON(test bit)
> > ...
> > spin_unlock
> > release mmap_lock
> >
> >
> > IIUC there is control dependency between the "clear bit" and
> > "VM_BUG_ON" since VM_BUG_ON simply tests the bit then raises the BUG.
> > They do touch the overlapping address (the page flags from the same
> > struct page), but they are bit field operations. Per the memory model
> > documentation, the order is not guaranteed for bit field operations
> > IIRC.
> >
> > And there are not any implicit barriers between clear bit and test
> > bit, so the question is whether an explicit barrier, for example,
> > smp_mb__after_atomic() is required after clear bit to guarantee it
> > works as expected?
>
> I am not familiar with this code, so I will stick with LKMM
> clarifications.

Yeah, sure. This is why I tried to generalize the code.

>
> First, please don't forget any protection and ordering that might be
> provided by the two locks held across this code.

Yes, but for this case I just care about the code between clear bit
and VM_BUG_ON.

>
> Second, a control dependency extends from a READ_ONCE() or stronger
> (clear_bit() included) to a later store.  Please note "store", not
> "load".  If you need to order an earlier READ_ONCE() or clear_bit()

So you mean:

clear bit
...
if (test bit) {
    load_1
    store_1
    load_2
    store_2
}

The dependency reaches to the first store?

> with a later load, you will need acquire semantics (smp_load_acquire(),
> for example) or an explicit barrier such as smp_rmb().  Use of acquire
> semantics almost always gets you code that is more readable.

Does the load acquire have to pair with a smp_store_release()?
smp_mb__after_stomic() is not needed because it is too strong and the
weaker barrier is good enough, right?

>
> Does that help?

Yeah, sure. Thanks.

>
> Also CCing linux-mm@xxxxxxxxx in case someone with better understanding
> of that code has advice.
>
>                                                         Thanx, Paul




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux