On Fri, Jun 03, 2011 at 09:12:17AM -0700, Greg Thelen wrote:
> When the system is under background dirty memory threshold but a cgroup
> is over its background dirty memory threshold, then only writeback
> inodes associated with the over-limit cgroup(s).
>

[..]
> -static inline bool over_bground_thresh(void)
> +static inline bool over_bground_thresh(struct bdi_writeback *wb,
> +				       struct writeback_control *wbc)
>  {
>  	unsigned long background_thresh, dirty_thresh;
>
>  	global_dirty_limits(&background_thresh, &dirty_thresh);
>
> -	return (global_page_state(NR_FILE_DIRTY) +
> -		global_page_state(NR_UNSTABLE_NFS) > background_thresh);
> +	if (global_page_state(NR_FILE_DIRTY) +
> +	    global_page_state(NR_UNSTABLE_NFS) > background_thresh) {
> +		wbc->for_cgroup = 0;
> +		return true;
> +	}
> +
> +	wbc->for_cgroup = 1;
> +	wbc->shared_inodes = 1;
> +	return mem_cgroups_over_bground_dirty_thresh();
>  }

Hi Greg,

So all the logic of writeout from a mem cgroup works only if the system
is below the global background limit. The moment we cross the background
limit, it looks like we fall back to the existing way of writing inodes?

I think this kind of cgroup writeback will at least not solve the problem
for the CFQ IO controller, as we fall back to the old way of writing back
inodes the moment we cross the dirty ratio.

Also, have you done any benchmarking of the overhead of going through,
say, thousands of inodes to find the one that is eligible for writeback
from a cgroup? I think Dave Chinner had raised this concern in the past.

Thanks
Vivek
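
To make the fallback concern concrete, here is a minimal userspace sketch of the control flow in the quoted hunk. It is only a model: struct wbc_model, struct dirty_state, any_cgroup_over_bground() and over_bground_thresh_model() are hypothetical stand-ins invented for illustration, not the kernel's types or helpers, and the per-cgroup check is assumed to be a simple "dirty > threshold" comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two writeback_control fields in the hunk. */
struct wbc_model {
	int for_cgroup;    /* 1: restrict writeback to over-limit cgroups */
	int shared_inodes; /* 1: also consider inodes shared between cgroups */
};

/* Hypothetical snapshot of dirty counters (all values in pages). */
struct dirty_state {
	unsigned long global_dirty;      /* models NR_FILE_DIRTY + NR_UNSTABLE_NFS */
	unsigned long background_thresh; /* global background threshold */
	const unsigned long *cg_dirty;   /* per-cgroup dirty pages */
	const unsigned long *cg_thresh;  /* per-cgroup background thresholds */
	size_t nr_cgroups;
};

/* Assumed model of mem_cgroups_over_bground_dirty_thresh(). */
static bool any_cgroup_over_bground(const struct dirty_state *s)
{
	for (size_t i = 0; i < s->nr_cgroups; i++)
		if (s->cg_dirty[i] > s->cg_thresh[i])
			return true;
	return false;
}

/* Mirrors the branch structure of the patched over_bground_thresh(). */
static bool over_bground_thresh_model(const struct dirty_state *s,
				      struct wbc_model *wbc)
{
	assert(s != NULL && wbc != NULL);

	/* Global limit crossed: cgroup-aware selection is switched off. */
	if (s->global_dirty > s->background_thresh) {
		wbc->for_cgroup = 0;
		return true;
	}

	/* Below the global limit: write back only if some cgroup is over. */
	wbc->for_cgroup = 1;
	wbc->shared_inodes = 1;
	return any_cgroup_over_bground(s);
}
```

In this model, for_cgroup is set to 1 only while the global count is at or below background_thresh. As soon as the global count exceeds it, the function returns true with for_cgroup cleared, which is the fallback to the existing writeback path that the questions above are about.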