Re: page_waitqueue() considered harmful

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 28 Sep 2016 09:05:46 +0200
Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Wed, Sep 28, 2016 at 03:06:21AM +1000, Nicholas Piggin wrote:
> > On Tue, 27 Sep 2016 18:52:21 +0200
> > Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >   
> > > On Wed, Sep 28, 2016 at 12:53:18AM +1000, Nicholas Piggin wrote:  
> > > > The more interesting is the ability to avoid the barrier between fastpath
> > > > clearing a bit and testing for waiters.
> > > > 
> > > > unlock():                        lock() (slowpath):
> > > > clear_bit(PG_locked)             set_bit(PG_waiter)
> > > > test_bit(PG_waiter)              test_bit(PG_locked)
> > > > 
> > > > If this was memory ops to different words, it would require smp_mb each
> > > > side.. Being the same word, can we avoid them?     
> > > 
> > > Ah, that is the reason I put that smp_mb__after_atomic() there. You have
> > > a cute point on them being to the same word though. Need to think about
> > > that.  
> > 
> > This is all assuming the store accesses are ordered, which you should get
> > if the stores to the different bits operate on the same address and size.
> > That might not be the case for some architectures, but they might not
> > require barriers for other reasons. That would call for an smp_mb variant
> > that is used for bitops on different bits but same aligned long.   
> 
> Since the {set,clear}_bit operations are atomic, they must be ordered
> against one another. The subsequent test_bit is a load, which, since its
> to the same variable, and a CPU must appear to preserve Program-Order,
> must come after the RmW.
> 
> So I think you're right and that we can forgo the memory barriers here.
> I even think this must be true on all architectures.

In generic code, I don't think so. We'd need an
smp_mb__between_bitops_to_the_same_aligned_long, wouldn't we?

x86 implements set_bit as 'orb (addr),bit_nr', and compiler could
implement test_bit as a byte load as well. If those bits are in
different bytes, then they could be reordered, no?

ia64 does 32-bit ops. If you make PG_waiter 64-bit only and put it
in the different side of the long, then this could be a problem too.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]