Re: [PATCH 01/45] writeback: reduce calls to global_page_state in balance_dirty_pages()

Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> · Tue, 13 Oct 2009 20:28:19 +0200

On Tue, 2009-10-13 at 20:12 +0200, Jan Kara wrote:
> >       for (;;) {
> >               nr_reclaimable = global_page_state(NR_FILE_DIRTY) +
> >                                global_page_state(NR_UNSTABLE_NFS);
> >               nr_writeback = global_page_state(NR_WRITEBACK) +
> >                              global_page_state(NR_WRITEBACK_TEMP);
> > 
> >               global_dirty_thresh(&background_thresh, &dirty_thresh);
> > 
> >               /*
> >                * Throttle it only when the background writeback cannot
> >                * catch-up. This avoids (excessively) small writeouts
> >                * when the bdi limits are ramping up.
> >                */
> >               if (nr_reclaimable + nr_writeback <
> >                   (background_thresh + dirty_thresh) / 2)
> >                       break;
> > 
> >               bdi_thresh = bdi_dirty_thresh(bdi, dirty_thresh);
> > 
> >               /*
> >                * In order to avoid the stacked BDI deadlock we need
> >                * to ensure we accurately count the 'dirty' pages when
> >                * the threshold is low.
> >                *
> >                * Otherwise it would be possible to get thresh+n pages
> >                * reported dirty, even though there are thresh-m pages
> >                * actually dirty; with m+n sitting in the percpu
> >                * deltas.
> >                */
> >               if (bdi_thresh < 2*bdi_stat_error(bdi)) {
> >                       bdi_nr_reclaimable = bdi_stat_sum(bdi, BDI_RECLAIMABLE);
> >                       bdi_nr_writeback = bdi_stat_sum(bdi, BDI_WRITEBACK);
> >               } else {
> >                       bdi_nr_reclaimable = bdi_stat(bdi, BDI_RECLAIMABLE);
> >                       bdi_nr_writeback = bdi_stat(bdi, BDI_WRITEBACK);
> >               }
> > 
> >               /*
> >                * The bdi thresh is somehow "soft" limit derived from the
> >                * global "hard" limit. The former helps to prevent heavy IO
> >                * bdi or process from holding back light ones; The latter is
> >                * the last resort safeguard.
> >                */
> >               dirty_exceeded =
> >                       (bdi_nr_reclaimable + bdi_nr_writeback >= bdi_thresh)
> >                       || (nr_reclaimable + nr_writeback >= dirty_thresh);
> > 
> >               if (!dirty_exceeded)
> >                       break;
> > 
> >               bdi->dirty_exceed_time = jiffies;
> > 
> >               bdi_writeback_wait(bdi, write_chunk);
>   Hmm, probably you've discussed this in some other email but why do we
> cycle in this loop until we get below dirty limit? We used to leave the
> loop after writing write_chunk... So the time we spend in
> balance_dirty_pages() is no longer limited, right?

Wu said that without the loop nr_writeback wasn't limited, but since
bdi_writeback_wakeup() is driven from writeout completion, I no longer
see how that could have been the case.
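
To put Jan's question in concrete terms, here is a toy model (not kernel
code; every name in it is made up for illustration). Each round stands
for one wait on writeout completion that retires a fixed number of
pages. The "incoming" parameter is a purely hypothetical rate at which
other tasks keep dirtying pages in the meantime. Leaving after a single
write_chunk would always cost one round; looping until we are below the
limit costs as many rounds as the excess needs:

```c
#include <assert.h>

/*
 * Toy model, not kernel code. Each round stands for one wait on
 * writeout completion: `retired` pages finish IO, and (hypothetically)
 * other tasks dirty `incoming` new pages. Returns how many rounds the
 * throttled task waits before total < thresh, capped at max_rounds.
 */
long toy_rounds_until_below(long total, long thresh,
                            long retired, long incoming,
                            long max_rounds)
{
        long rounds = 0;

        assert(retired > 0 && incoming >= 0 && max_rounds > 0);

        while (total >= thresh && rounds < max_rounds) {
                /* completion retires pages, never more than exist */
                total -= retired < total ? retired : total;
                /* assumed concurrent dirtying by other tasks */
                total += incoming;
                rounds++;
        }
        return rounds;
}
```

With incoming == 0 the wait grows with the excess over the threshold
rather than being fixed at one chunk. In this model it only hits the
max_rounds cap when pages arrive at least as fast as they are retired.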

We can move all of bdi_dirty to bdi_writeout, if the bdi writeout queue
permits, but it cannot grow beyond the total limit, since we're actually
waiting for writeout completion.
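
A minimal sketch of that argument, as a toy model rather than the real
accounting (struct toy_bdi, queue_depth and all function names are
assumptions made up for this example). Starting writeout only moves
pages from dirty to writeback, so the sum is unchanged. Completion is
the only event that shrinks the sum, and the dirtier waits for it
before dirtying more:

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not kernel code: page counts for a single bdi. */
struct toy_bdi {
        long dirty;       /* dirtied, not yet queued for IO */
        long writeback;   /* under IO, waiting for completion */
        long queue_depth; /* hypothetical cap on pages in flight */
};

/* Queue as many dirty pages as the queue permits; the sum is unchanged. */
static void toy_start_writeout(struct toy_bdi *b)
{
        long room = b->queue_depth - b->writeback;
        long n = b->dirty < room ? b->dirty : room;

        if (n > 0) {
                b->dirty -= n;
                b->writeback += n;
        }
}

/* Writeout completion: the only event that shrinks dirty + writeback. */
static void toy_complete(struct toy_bdi *b, long n)
{
        b->writeback -= n < b->writeback ? n : b->writeback;
}

/*
 * Dirty `pages` pages one at a time, waiting for completion whenever
 * dirty + writeback has reached thresh. Returns the peak of the sum;
 * the peak writeback count is stored in *peak_wb if it is non-NULL.
 */
long toy_run(long pages, long thresh, long queue_depth,
             long write_chunk, long *peak_wb)
{
        struct toy_bdi b = { 0, 0, queue_depth };
        long peak = 0, pw = 0;

        assert(thresh > 0 && queue_depth > 0 && write_chunk > 0);

        for (long i = 0; i < pages; i++) {
                while (b.dirty + b.writeback >= thresh) {
                        toy_start_writeout(&b);
                        if (b.writeback > pw)
                                pw = b.writeback;
                        /* stands in for sleeping until IO completes */
                        toy_complete(&b, write_chunk);
                }
                b.dirty++;
                if (b.dirty + b.writeback > peak)
                        peak = b.dirty + b.writeback;
        }
        if (peak_wb != NULL)
                *peak_wb = pw;
        return peak;
}
```

In this model a large queue_depth lets writeback absorb everything that
was dirty, but neither writeback nor the sum ever exceeds thresh. The
model has a single dirtier and no separate unstable state, so it says
nothing about that case.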

Possibly the unstable (NR_UNSTABLE_NFS) pages are a peculiar case.
