On Fri, Aug 26, 2011 at 04:25:15AM -0400, Christoph Hellwig wrote:
> > index 13188df..a3d1784 100644
> > --- a/fs/xfs/xfs_trans_ail.c
> > +++ b/fs/xfs/xfs_trans_ail.c
> > @@ -494,7 +494,7 @@ xfs_ail_worker(
> >
> >  	if (push_xfsbufd) {
> >  		/* we've got delayed write buffers to flush */
> > -		wake_up_process(mp->m_ddev_targp->bt_task);
> > +		flush_delayed_work(&mp->m_ddev_targp->bt_delwrite_work);
>
> This is a huge change in behaviour.  wake_up_process just kicks the
> thread to wakeup from sleep as soon as the schedule selects it, while
> flush_delayed_work does not only queue a pending delayed work, but also
> waits for it to finish.

Which is precisely what I want here - to wait for all the delwri
buffers that were promoted to be submitted before continuing
onwards. This makes the scanning algorithm self-throttling - instead
of simply pushing the buffers to the delwri queue and kicking a
background thread and hoping it can flush buffers faster than we can
promote them from the AIL, it explicitly pushes the delwri buffers
before the next round of AIL scanning. This ensures we start timely
IO on the buffers and don't simply continue to scan the AIL while we
wait for the background thread to send them off to disk and
complete.

IOWs, instead of:

	AIL			bufd
	promote
	....
	promote
	wakeup
	short sleep
				woken
				sort
				dispatch
				....
	<sometime later, maybe before, during or after xfsbufd has run>
	promote
	.....
	promote
	wakeup
	short sleep
				woken
				sort
				dispatch
				....

where we *hope* the short sleep in the AIL processing is long enough
to avoid repeated scanning of the AIL while the queued IO is
dispatched, we end up with:

	AIL			bufd
	promote
	....
	promote
	flush_work
				sort
				dispatch
	short sleep
	promote
	....
	promote
	flush_work
				sort
				dispatch
	short sleep

Which is much more controlled and means that the short sleep that
the AIL processing does actually gives time for IO completions to
occur before continuing.
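
To put rough numbers on the difference between the two patterns, here
is a toy userspace model in C. It is only a sketch and is not kernel
code: struct push_model, peak_queue_async() and peak_queue_sync() are
hypothetical names made up for illustration, and the async case
assumes (pessimistically) that the flusher only drains the queue
during the AIL worker's short sleep. It tracks how deep the delwri
queue gets when each scan round promotes more buffers than the flusher
can dispatch in one sleep window.

```c
#include <assert.h>

/* Hypothetical model parameters; none of these names exist in XFS. */
struct push_model {
	unsigned promote_per_round;	/* buffers promoted per AIL scan round */
	unsigned dispatch_per_tick;	/* buffers the flusher submits per tick */
	unsigned sleep_ticks;		/* length of the AIL worker's short sleep */
};

/*
 * Wakeup-style push: kick the flusher and go straight to the short
 * sleep. Assumption: the flusher only drains during that sleep window.
 * Returns the peak depth of the delwri queue over all rounds.
 */
unsigned peak_queue_async(const struct push_model *m, unsigned rounds)
{
	unsigned queued = 0, peak = 0;

	assert(m->dispatch_per_tick > 0);
	for (unsigned r = 0; r < rounds; r++) {
		queued += m->promote_per_round;
		if (queued > peak)
			peak = queued;

		unsigned drained = m->dispatch_per_tick * m->sleep_ticks;
		queued = drained > queued ? 0 : queued - drained;
	}
	return peak;
}

/*
 * Flush-style push: wait until everything promoted this round has been
 * dispatched before sleeping. Returns the peak queue depth and stores
 * the total ticks spent waiting on the flush in *ticks_waited.
 */
unsigned peak_queue_sync(const struct push_model *m, unsigned rounds,
			 unsigned *ticks_waited)
{
	unsigned queued = 0, peak = 0, waited = 0;

	assert(m->dispatch_per_tick > 0);
	for (unsigned r = 0; r < rounds; r++) {
		queued += m->promote_per_round;
		if (queued > peak)
			peak = queued;

		/* Round up: a partial batch still costs a whole tick. */
		waited += (queued + m->dispatch_per_tick - 1) /
			  m->dispatch_per_tick;
		queued = 0;	/* queue is empty before the short sleep */
	}
	*ticks_waited = waited;
	return peak;
}
```

With 100 buffers promoted per round, 10 dispatched per tick and a 2
tick sleep, the async model's queue grows by 80 buffers every round,
while the sync model never holds more than one round's worth and
instead pays for it in time spent waiting on the flush. That wait is
the throttling.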

The synchronous flush also means that dispatch of IO from the AIL is
throttled to the rate of device congestion, as the AIL now waits for
the IO dispatch to complete instead of just shovelling as much IO as
possible into the bufd queue.

FWIW, if we move to building a direct IO buffer list in the AIL as
we were recently discussing, these are -exactly- the IO dispatch
patterns and delays that we will get from the AIL....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx