Re: [PATCH 20/45] NFS: introduce writeback wait queue

Wu Fengguang <fengguang.wu@xxxxxxxxx> · Wed, 7 Oct 2009 17:07:22 +0800

On Wed, Oct 07, 2009 at 04:53:20PM +0800, Peter Zijlstra wrote:
> On Wed, 2009-10-07 at 15:38 +0800, Wu Fengguang wrote:
> > plain text document attachment (writeback-nfs-request-queue.patch)
> > The generic writeback routines are departing from congestion_wait()
> > in preferance of get_request_wait(), aka. waiting on the block queues.
> > 
> > Introduce the missing writeback wait queue for NFS, otherwise its
> > writeback pages may grow out of control.
> > 
> > In perticular, balance_dirty_pages() will exit after it pushes
> > write_chunk pages into the PG_writeback page pool, _OR_ when the
> > background writeback work quits. The latter is new behavior, and could
> > not only quit (normally) after below background threshold, but also
> > quit when it finds _zero_ dirty pages to write. The latter case gives
> > rise to the number of PG_writeback pages if it is not explicitly limited.
> > 
> > CC: Jens Axboe <jens.axboe@xxxxxxxxxx>
> > CC: Chris Mason <chris.mason@xxxxxxxxxx>
> > CC: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> > CC: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx>
> > Signed-off-by: Wu Fengguang <fengguang.wu@xxxxxxxxx>
> > ---
> > 
> > The wait time and network throughput varies a lot! this is a major problem.
> > This means nfs_end_page_writeback() is not called smoothly over time,
> > even when there are plenty of PG_writeback pages on the client side.
> 
> Could that be some ack batching from the nfs protocol/server?

Yes possibly. Another possibility is some works in the nfsiod
workqueue takes up a long time.

> > +static void nfs_wakeup_congested(long nr, long limit,
> > +				 struct backing_dev_info *bdi,
> > +				 wait_queue_head_t *wqh)
> > +{
> > +	if (nr < 2*limit - min(limit/8, NFS_WAIT_PAGES)) {
> > +		if (test_bit(BDI_sync_congested, &bdi->state))
> > +			clear_bdi_congested(bdi, BLK_RW_SYNC);
> > +		if (waitqueue_active(&wqh[BLK_RW_SYNC])) {
> > +			smp_mb__after_clear_bit();
> > +			wake_up(&wqh[BLK_RW_SYNC]);
> > +		}
> > +	}
> > +	if (nr < limit - min(limit/8, NFS_WAIT_PAGES)) {
> > +		if (test_bit(BDI_async_congested, &bdi->state))
> > +			clear_bdi_congested(bdi, BLK_RW_ASYNC);
> > +		if (waitqueue_active(&wqh[BLK_RW_ASYNC])) {
> > +			smp_mb__after_clear_bit();
> > +			wake_up(&wqh[BLK_RW_ASYNC]);
> > +		}
> > +	}
> > +}
> >  
> 
> wakeup implies a full memory barrier.

If so, this smp_mb__after_clear_bit() line is also not necessary?

void clear_bdi_congested(struct backing_dev_info *bdi, int sync)
{
//...
        clear_bit(bit, &bdi->state);
        smp_mb__after_clear_bit();
        if (waitqueue_active(wqh))
                wake_up(wqh);

Thanks,
Fengguang
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html