On Wed, Sep 16 2009, Jan Kara wrote: > On Tue 15-09-09 20:16:50, Jens Axboe wrote: > > We need to be able to pass in range_cyclic as well, so instead > > of growing yet another argument, split the arguments into a > > struct wb_writeback_args structure that we can use internally. > > Also makes it easier to just copy all members to an on-stack > > struct, since we can't access work after clearing the pending > > bit. > > > > Signed-off-by: Jens Axboe <jens.axboe@xxxxxxxxxx> > > --- > > fs/fs-writeback.c | 97 +++++++++++++++++++++++++++++++---------------------- > > 1 files changed, 57 insertions(+), 40 deletions(-) > > > > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c > > index 783ed44..1dd11d1 100644 > > --- a/fs/fs-writeback.c > > +++ b/fs/fs-writeback.c > > @@ -35,6 +35,17 @@ > > int nr_pdflush_threads; > > > > /* > > + * Passed into wb_writeback(), essentially a subset of writeback_control > > + */ > > +struct wb_writeback_args { > > + long nr_pages; > > + struct super_block *sb; > > + enum writeback_sync_modes sync_mode; > > + int for_kupdate; > > + int range_cyclic; > > +}; > You can save a few bytes by using bit-fields for for_kupdate and > range_cyclic fields the same way as is done in wbc. I was unsure whether that would be reliable for copying. Seems it should. > > @@ -69,9 +78,10 @@ static inline void bdi_work_init(struct bdi_work *work, > > struct writeback_control *wbc) > > { > > INIT_RCU_HEAD(&work->rcu_head); > > - work->sb = wbc->sb; > > - work->nr_pages = wbc->nr_to_write; > > - work->sync_mode = wbc->sync_mode; > > + work->args.sb = wbc->sb; > > + work->args.nr_pages = wbc->nr_to_write; > > + work->args.sync_mode = wbc->sync_mode; > > + work->args.range_cyclic = wbc->range_cyclic; > > work->state = WS_USED; > > } > We don't pass for_kupdate here. But probably that's fine since noone else > should set it. But maybe we should at least initialize it? Good idea, will set it. > > @@ -673,29 +682,35 @@ static long wb_writeback(struct bdi_writeback *wb, long nr_pages, > > oldest_jif = jiffies - > > msecs_to_jiffies(dirty_expire_interval * 10); > > } > > + if (!wbc.range_cyclic) { > > + wbc.range_start = 0; > > + wbc.range_end = LLONG_MAX; > > + } > > > > for (;;) { > > - /* > > - * Don't flush anything for non-integrity writeback where > > - * no nr_pages was given > > - */ > > - if (!for_kupdate && nr_pages <= 0 && sync_mode == WB_SYNC_NONE) > > - break; > > + if (!args->for_kupdate && args->nr_pages <= 0) { > > + /* > > + * Don't flush anything for non-integrity writeback > > + * where no nr_pages was given > > + */ > > + if (args->sync_mode == WB_SYNC_NONE) > > + break; > > > > - /* > > - * If no specific pages were given and this is just a > > - * periodic background writeout and we are below the > > - * background dirty threshold, don't do anything > > - */ > > - if (for_kupdate && nr_pages <= 0 && !over_bground_thresh()) > > - break; > > + /* > > + * If no specific pages were given and this is just a > > + * periodic background writeout and we are below the > > + * background dirty threshold, don't do anything > > + */ > > + if (!over_bground_thresh()) > > + break; > > + } > This chunk actually changes the behavior since previously the second > condition had for_kupdate and not !for_kupdate. But revisiting this code, > the pdflush-style writeback still doesn't work how it used to. Merging > kupdate-style and pdflush-style writeback isn't that easy. > For the first one we want to flush only the expired inodes, but we should > guarantee we really flush them. > For the second one we want to flush as much data as possible and don't > really care about which inodes we flush. We only have to work until we hit > the background threshold. > So probably check_old_data_flush() has to issue a kupdate-style writeback > like it does now and then it should loop submitting normal WB_SYNC_NONE > writeback until we get below background threshold. In theory, we could use > the look in wb_writeback() like we try now but that would mean we have to > pass another flag, whether this is a pdflush style writeback or not. Good spotting, I'll fix that up and have the old flush continue until !over_bground_thresh(). It would be nice to get rid of the weird special casing in wb_writeback() and just retain it as a worker, pushing the logic into the internal callers. > > wbc.more_io = 0; > > wbc.encountered_congestion = 0; > > wbc.nr_to_write = MAX_WRITEBACK_PAGES; > > wbc.pages_skipped = 0; > > writeback_inodes_wb(wb, &wbc); > > - nr_pages -= MAX_WRITEBACK_PAGES - wbc.nr_to_write; > > + args->nr_pages -= MAX_WRITEBACK_PAGES - wbc.nr_to_write; > > wrote += MAX_WRITEBACK_PAGES - wbc.nr_to_write; > > > > /* > > @@ -749,8 +764,14 @@ static long wb_check_old_data_flush(struct bdi_writeback *wb) > > global_page_state(NR_UNSTABLE_NFS) + > > (inodes_stat.nr_inodes - inodes_stat.nr_unused); > > > > - if (nr_pages) > > - return wb_writeback(wb, nr_pages, NULL, WB_SYNC_NONE, 1); > > + if (nr_pages) { > > + struct wb_writeback_args args = { > > + .nr_pages = nr_pages, > > + .sync_mode = WB_SYNC_NONE, > > + }; > We should set for_kupdate here. Indeed, thanks. -- Jens Axboe -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html