On Tue, 2009-10-13 at 20:12 +0200, Jan Kara wrote:
> > 	for (;;) {
> > 		nr_reclaimable = global_page_state(NR_FILE_DIRTY) +
> > 				 global_page_state(NR_UNSTABLE_NFS);
> > 		nr_writeback = global_page_state(NR_WRITEBACK) +
> > 			       global_page_state(NR_WRITEBACK_TEMP);
> >
> > 		global_dirty_thresh(&background_thresh, &dirty_thresh);
> >
> > 		/*
> > 		 * Throttle it only when the background writeback cannot
> > 		 * catch-up. This avoids (excessively) small writeouts
> > 		 * when the bdi limits are ramping up.
> > 		 */
> > 		if (nr_reclaimable + nr_writeback <
> > 		    (background_thresh + dirty_thresh) / 2)
> > 			break;
> >
> > 		bdi_thresh = bdi_dirty_thresh(bdi, dirty_thresh);
> >
> > 		/*
> > 		 * In order to avoid the stacked BDI deadlock we need
> > 		 * to ensure we accurately count the 'dirty' pages when
> > 		 * the threshold is low.
> > 		 *
> > 		 * Otherwise it would be possible to get thresh+n pages
> > 		 * reported dirty, even though there are thresh-m pages
> > 		 * actually dirty; with m+n sitting in the percpu
> > 		 * deltas.
> > 		 */
> > 		if (bdi_thresh < 2*bdi_stat_error(bdi)) {
> > 			bdi_nr_reclaimable = bdi_stat_sum(bdi, BDI_RECLAIMABLE);
> > 			bdi_nr_writeback = bdi_stat_sum(bdi, BDI_WRITEBACK);
> > 		} else {
> > 			bdi_nr_reclaimable = bdi_stat(bdi, BDI_RECLAIMABLE);
> > 			bdi_nr_writeback = bdi_stat(bdi, BDI_WRITEBACK);
> > 		}
> >
> > 		/*
> > 		 * The bdi thresh is somehow "soft" limit derived from the
> > 		 * global "hard" limit. The former helps to prevent heavy IO
> > 		 * bdi or process from holding back light ones; The latter is
> > 		 * the last resort safeguard.
> > 		 */
> > 		dirty_exceeded =
> > 			(bdi_nr_reclaimable + bdi_nr_writeback >= bdi_thresh)
> > 			|| (nr_reclaimable + nr_writeback >= dirty_thresh);
> >
> > 		if (!dirty_exceeded)
> > 			break;
> >
> > 		bdi->dirty_exceed_time = jiffies;
> >
> > 		bdi_writeback_wait(bdi, write_chunk);
>   Hmm, probably you've discussed this in some other email but why do we
> cycle in this loop until we get below dirty limit? We used to leave the
> loop after writing write_chunk...
> So the time we spend in
> balance_dirty_pages() is no longer limited, right?

Wu was saying that without the loop nr_writeback wasn't limited, but
since bdi_writeback_wakeup() is driven from writeout completion, I'm
not sure how that could be so.

We can move all of bdi_dirty to bdi_writeout, if the bdi writeout queue
permits, but it cannot grow beyond the total limit, since we're actually
waiting for writeout completion.

Possibly the unstable (NFS) pages are peculiar.
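
To make the question about unbounded time concrete, here is a minimal
user-space sketch of the loop's two exit tests. It is not kernel code:
struct dirty_state, struct dirty_limits, must_keep_waiting() and
count_waits() are hypothetical names, and the model assumes that each
wait retires at most 'chunk' pages of this bdi's writeback and that
nothing is dirtied in the meantime, which is a simplification.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins for the counters in the quoted loop. */
struct dirty_state {
	unsigned long nr_reclaimable;     /* global dirty + unstable pages */
	unsigned long nr_writeback;       /* global pages under writeback */
	unsigned long bdi_nr_reclaimable; /* same counters for one bdi */
	unsigned long bdi_nr_writeback;
};

struct dirty_limits {
	unsigned long background_thresh;
	unsigned long dirty_thresh;
	unsigned long bdi_thresh;
};

/* Mirrors the two exit tests of the quoted loop: true means keep waiting. */
static bool must_keep_waiting(const struct dirty_state *s,
			      const struct dirty_limits *l)
{
	/* Below the midpoint of the two global thresholds: never throttle. */
	if (s->nr_reclaimable + s->nr_writeback <
	    (l->background_thresh + l->dirty_thresh) / 2)
		return false;

	/* Soft per-bdi limit or hard global limit exceeded. */
	return (s->bdi_nr_reclaimable + s->bdi_nr_writeback >= l->bdi_thresh) ||
	       (s->nr_reclaimable + s->nr_writeback >= l->dirty_thresh);
}

/*
 * Toy model of the wait: each pass, writeout completion retires up to
 * 'chunk' pages of this bdi's writeback. Returns how many waits were
 * needed before the loop could be left.
 */
static unsigned long count_waits(struct dirty_state s,
				 const struct dirty_limits *l,
				 unsigned long chunk)
{
	unsigned long waits = 0;

	/* The bdi's writeback pages are part of the global count. */
	assert(s.nr_writeback >= s.bdi_nr_writeback);

	while (must_keep_waiting(&s, l)) {
		unsigned long done = s.bdi_nr_writeback < chunk ?
				     s.bdi_nr_writeback : chunk;

		if (done == 0)
			break; /* nothing under writeback: the model cannot progress */

		s.bdi_nr_writeback -= done;
		s.nr_writeback -= done;
		waits++;
	}
	return waits;
}
```

In this model the number of waits depends on how far the counters are
above the thresholds rather than on a fixed write_chunk, which is the
behaviour Jan is asking about.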