On Tue, May 13, 2014 at 04:01:27PM +0200, Peter Zijlstra wrote: > On Tue, May 13, 2014 at 01:50:07PM +0100, Mel Gorman wrote: > > > Anyway, nothing wrong with this patch, however, you could, if you really > > > wanted to push things, also include BH_Lock in that clear :-) > > > > That's a bold strategy Cotton. > > :-) > > > Untested patch on top > > > > ---8<--- > > diff --git a/fs/buffer.c b/fs/buffer.c > > index e80012d..42fcb6d 100644 > > --- a/fs/buffer.c > > +++ b/fs/buffer.c > > @@ -1490,6 +1490,8 @@ static void discard_buffer(struct buffer_head * bh) > > lock_buffer(bh); > > clear_buffer_dirty(bh); > > bh->b_bdev = NULL; > > + > > + smp_mb__before_clear_bit(); > > Not needed. > > > b_state = bh->b_state; > > for (;;) { > > b_state_old = cmpxchg(&bh->b_state, b_state, (b_state & ~BUFFER_FLAGS_DISCARD)); > > @@ -1497,7 +1499,13 @@ static void discard_buffer(struct buffer_head * bh) > > break; > > b_state = b_state_old; > > } > > - unlock_buffer(bh); > > + > > + /* > > + * BUFFER_FLAGS_DISCARD include BH_lock so it has been cleared so the > > + * wake_up_bit is the last part of a unlock_buffer > > + */ > > + smp_mb__after_clear_bit(); > > Similarly superfluous. > > > + wake_up_bit(&bh->b_state, BH_Lock); > > } > > The thing is that cmpxchg() guarantees full barrier semantics before and > after the op, and since the loop guarantees at least one cmpxchg() call > its all good. > Of course, thanks for pointing that out. I was only thinking of it in terms of it being a clear_bit operation which was dumb. > Now just to confuse everyone, you could have written the loop like: > > b_state = bh->b_state; > for (;;) { > b_state_new = b_state & ~BUFFER_FLAGS_DISCARD; > if (b_state == b_state_new) > break; > b_state = cmpxchg(&bh->b_state, b_state, b_state_new); > } > > Which is 'similar' but doesn't guarantee that cmpxchg() gets called. > If you expect the initial value to match the new state, the above form > is slightly faster, but the lack of barrier guarantees can still spoil > the fun. I do not really expect the initial value to match the new state. At the very least I would expect BH_mapped to be routinely cleared during this operation so I doubt it's worth the effort trying to deal with conditional buffers. -- Mel Gorman SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html