Re: page_waitqueue() considered harmful

"Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> · Wed, 28 Sep 2016 04:05:30 -0700

On Wed, Sep 28, 2016 at 09:05:46AM +0200, Peter Zijlstra wrote:
> On Wed, Sep 28, 2016 at 03:06:21AM +1000, Nicholas Piggin wrote:
> > On Tue, 27 Sep 2016 18:52:21 +0200
> > Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > 
> > > On Wed, Sep 28, 2016 at 12:53:18AM +1000, Nicholas Piggin wrote:
> > > > The more interesting is the ability to avoid the barrier between fastpath
> > > > clearing a bit and testing for waiters.
> > > > 
> > > > unlock():                        lock() (slowpath):
> > > > clear_bit(PG_locked)             set_bit(PG_waiter)
> > > > test_bit(PG_waiter)              test_bit(PG_locked)

The point being that at least one of the test_bit() calls must return
true?

> > > > If this was memory ops to different words, it would require smp_mb each
> > > > side.. Being the same word, can we avoid them?   
> > > 
> > > Ah, that is the reason I put that smp_mb__after_atomic() there. You have
> > > a cute point on them being to the same word though. Need to think about
> > > that.
> > 
> > This is all assuming the store accesses are ordered, which you should get
> > if the stores to the different bits operate on the same address and size.
> > That might not be the case for some architectures, but they might not
> > require barriers for other reasons. That would call for an smp_mb variant
> > that is used for bitops on different bits but same aligned long. 

As far as I know, all architectures fully order aligned same-size
machine-sized accesses to the same location even without barriers.
In the example above, the PG_locked and PG_waiter are different bits in
the same location, correct?  (Looks that way, but the above also looks
a bit abbreviated.)

Not sure that Docuemntation/memory-barriers.txt is as clear as it
should be on this one, though.

> Since the {set,clear}_bit operations are atomic, they must be ordered
> against one another. The subsequent test_bit is a load, which, since its
> to the same variable, and a CPU must appear to preserve Program-Order,
> must come after the RmW.

But from Documentation/atomic_ops.h:

------------------------------------------------------------------------

	void set_bit(unsigned long nr, volatile unsigned long *addr);
	void clear_bit(unsigned long nr, volatile unsigned long *addr);
	void change_bit(unsigned long nr, volatile unsigned long *addr);

These routines set, clear, and change, respectively, the bit number
indicated by "nr" on the bit mask pointed to by "ADDR".

They must execute atomically, yet there are no implicit memory barrier
semantics required of these interfaces.

------------------------------------------------------------------------

So unless they operate on the same location or are accompanied by
something like the smp_mb__after_atomic() called out above, there
is no ordering.

> So I think you're right and that we can forgo the memory barriers here.
> I even think this must be true on all architectures.
> 
> Paul and Alan have a validation tool someplace, put them on Cc.

It does not yet fully handle atomics yet (but maybe Alan is ahead of
me here, in which case he won't be shy).  However, the point about
strong ordering of same-sized aligned accesses to a machine-sized
location can be made without atomics:

------------------------------------------------------------------------

C C-piggin-SB+samevar.litmus

{
	x=0;
}

P0(int *x)
{
	WRITE_ONCE(*x, 1);
	r1 = READ_ONCE(*x);
}

P1(int *x)
{
	WRITE_ONCE(*x, 2);
	r1 = READ_ONCE(*x);
}

exists
(0:r1=0 /\ 1:r1=0)

------------------------------------------------------------------------

This gets us the following, as expected:

Test C-piggin-SB+samevar Allowed
States 3
0:r1=1; 1:r1=1;
0:r1=1; 1:r1=2;
0:r1=2; 1:r1=2;
No
Witnesses
Positive: 0 Negative: 4
Condition exists (0:r1=0 /\ 1:r1=0)
Observation C-piggin-SB+samevar Never 0 4
Hash=31466d003c10c07836583f5879824502

							Thanx, Paul

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>