Jan, On Fri, Aug 06, 2010 at 02:53:18AM +0800, Jan Kara wrote: > Commit 83ba7b071f30f7c01f72518ad72d5cd203c27502 broke writeback_in_progress() > as in that commit we started to remove work items from the list at the > moment we start working on them and not at the moment they are finished. > Thus if the flusher thread was doing some work but there was no other > work queued, writeback_in_progress() returned false. This could in > particular cause unnecessary queueing of background writeback from > balance_dirty_pages() or writeout work from writeback_sb_if_idle(). s/writeback_sb_if_idle/writeback_inodes_sb_if_idle > This patch fixes the problem by introducing a bit in the bdi state which > indicates that the flusher thread is processing some work and uses this > bit for writeback_in_progress() test. There is a possible small downside. Imagine this scenario: - kupdate is running while dirty pages go up to 19% - the dirtier stopped - kupdate quits - dirty pages remain at 18%, over background threshold It looks not a big problem except for surprising some users (like me :). The old code might also stuck in 18% dirty: - there are two works, one queued and another running - during the time dirty pages go up to 19% - the dirtier stopped - the two works quit - dirty pages remain at 18%, over background threshold > NOTE: Both callsites of writeback_in_progress() (namely, > writeback_inodes_sb_if_idle() and balance_dirty_pages()) would actually > need a different information than what writeback_in_progress() provides. > They would need to know whether *the kind of writeback they are going > to submit* is already queued. But this information isn't that simple > to provide so let's fix writeback_in_progress() for the time being. I have an outdated patch that introduced can_submit_background_writeback() in addition to the problematic writeback_in_progress(). It deliberately enqueue a new background job even when there is a running one. Because when the balance_dirty_pages() is converted to wait on the flusher threads, it must guarantee that someone will be there working to wake it up. And the running background job may right in the progress of exiting.. Thanks, Fengguang --- writeback: only allow two background writeback works balance_dirty_pages() need a reliable way to ensure some background work is scheduled for running. We cannot simply run bdi_start_writeback() because that would queue up huge number of works (which takes memory and time). CC: Jens Axboe <jens.axboe@xxxxxxxxxx> Signed-off-by: Wu Fengguang <fengguang.wu@xxxxxxxxx> --- fs/fs-writeback.c | 14 ++------------ include/linux/backing-dev.h | 26 +++++++++++++++++++++++++- 2 files changed, 27 insertions(+), 13 deletions(-) --- linux.orig/fs/fs-writeback.c 2009-11-06 09:52:24.000000000 +0800 +++ linux/fs/fs-writeback.c 2009-11-06 09:52:25.000000000 +0800 @@ -82,18 +82,6 @@ static inline void bdi_work_init(struct work->state = WS_USED; } -/** - * writeback_in_progress - determine whether there is writeback in progress - * @bdi: the device's backing_dev_info structure. - * - * Determine whether there is writeback waiting to be handled against a - * backing device. - */ -int writeback_in_progress(struct backing_dev_info *bdi) -{ - return !list_empty(&bdi->work_list); -} - static void bdi_work_clear(struct bdi_work *work) { clear_bit(WS_USED_B, &work->state); @@ -144,6 +132,8 @@ static void wb_clear_pending(struct bdi_ spin_lock(&bdi->wb_lock); list_del_rcu(&work->list); + if (work->args.for_background) + clear_bit(WB_FLAG_BACKGROUND_WORK, &bdi->wb_mask); spin_unlock(&bdi->wb_lock); wb_work_complete(work); --- linux.orig/include/linux/backing-dev.h 2009-11-06 09:52:25.000000000 +0800 +++ linux/include/linux/backing-dev.h 2009-11-06 09:52:25.000000000 +0800 @@ -94,6 +94,11 @@ struct backing_dev_info { #endif }; +/* + * background work queued, set to avoid redundant background works + */ +#define WB_FLAG_BACKGROUND_WORK 30 + int bdi_init(struct backing_dev_info *bdi); void bdi_destroy(struct backing_dev_info *bdi); @@ -248,7 +253,26 @@ int bdi_set_max_ratio(struct backing_dev extern struct backing_dev_info default_backing_dev_info; void default_unplug_io_fn(struct backing_dev_info *bdi, struct page *page); -int writeback_in_progress(struct backing_dev_info *bdi); +/** + * writeback_in_progress - determine whether there is writeback in progress + * @bdi: the device's backing_dev_info structure. + * + * Determine whether there is writeback waiting to be handled against a + * backing device. + */ +static inline int writeback_in_progress(struct backing_dev_info *bdi) +{ + return !list_empty(&bdi->work_list); +} + +/* + * Helper to limit # of background writeback works in circulation to 2. + * (one running and another queued) + */ +static inline int can_submit_background_writeback(struct backing_dev_info *bdi) +{ + return !test_and_set_bit(WB_FLAG_BACKGROUND_WORK, &bdi->wb_mask); +} static inline int bdi_congested(struct backing_dev_info *bdi, int bdi_bits) { -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html