On Thu, 2011-04-07 at 16:19 +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
>
> When the inode cache shrinker runs, we may have lots of dirty inodes queued up
> in the VFS dirty queues that have not been expired. The typical case for this
> with XFS is atime updates. The result is that a highly concurrent workload that
> copies files and then later reads them (say to verify checksums) dirties all
> the inodes again, even when relatime is used.
>
> In a constrained memory environment, this results in a large number of dirty
> inodes using all of available memory and memory reclaim being unable to free
> them as dirty inodes are considered active. This problem was uncovered by Chris
> Mason during recent low memory stress testing.
>
> The fix is to trigger VFS level writeback from the XFS inode cache shrinker if
> there isn't already writeback in progress. This ensures that when we enter a
> low memory situation we start cleaning inodes (via the flusher thread) on the
> filesystem immediately, thereby making it more likely that we will be able to
> evict those dirty inodes from the VFS in the near future.
>
> The mechanism is not perfect - it only acts on the current filesystem, so if
> all the dirty inodes are on a different filesystem it won't help. However, it
> seems to be a valid assumption that the filesystem with lots of dirty inodes
> is going to have the shrinker called very soon after the memory shortage
> begins, so this shouldn't be an issue.
>
> The other flaw is that there is no guarantee that the flusher thread will make
> progress fast enough to clean the dirty inodes so they can be reclaimed in the
> near future. However, this mechanism does improve the resilience of the
> filesystem under the test conditions - instead of reliably triggering the OOM
> killer 20 minutes into the stress test, it took more than 6 hours before it
> happened.
>
> This small addition definitely improves the low memory resilience of XFS on
> this type of workload, and best of all it has no impact on performance when
> memory is not constrained.
>
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>

Looks good to me.

Reviewed-by: Alex Elder <aelder@xxxxxxx>
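
For anyone following the thread, the mechanism in the quoted description
(start writeback from the shrinker only when none is already in progress,
then let later shrinker passes reclaim whatever the flusher has cleaned)
can be modelled in user space roughly as below. This is a minimal sketch
and not the patch itself: struct model_sb and every model_* function are
hypothetical names invented for illustration, and no kernel API is used or
implied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of one filesystem; not a kernel structure. */
struct model_sb {
	size_t dirty_inodes;       /* reclaim cannot free these yet */
	size_t clean_inodes;       /* reclaim is allowed to free these */
	bool writeback_running;    /* is the flusher already busy on this fs? */
	unsigned writeback_kicks;  /* times the shrinker started writeback */
};

/* Start writeback only if it is idle. Returns true if this call started it. */
static bool model_kick_writeback_if_idle(struct model_sb *sb)
{
	assert(sb != NULL);
	if (sb->writeback_running)
		return false;
	sb->writeback_running = true;
	sb->writeback_kicks++;
	return true;
}

/* One unit of flusher progress: clean up to 'batch' dirty inodes. */
static void model_flusher_step(struct model_sb *sb, size_t batch)
{
	size_t n;

	assert(sb != NULL);
	if (!sb->writeback_running)
		return;
	n = sb->dirty_inodes < batch ? sb->dirty_inodes : batch;
	sb->dirty_inodes -= n;
	sb->clean_inodes += n;
	if (sb->dirty_inodes == 0)
		sb->writeback_running = false;
}

/*
 * Shrinker pass: kick writeback first, then free up to 'nr_to_scan'
 * inodes that are already clean. Returns the number freed.
 */
static size_t model_shrink_inode_cache(struct model_sb *sb, size_t nr_to_scan)
{
	size_t freed;

	model_kick_writeback_if_idle(sb);
	freed = sb->clean_inodes < nr_to_scan ? sb->clean_inodes : nr_to_scan;
	sb->clean_inodes -= freed;
	return freed;
}
```

In this model the first shrinker pass over a filesystem with only dirty
inodes frees nothing but does start writeback, a second pass does not start
it again, and once model_flusher_step() has run the next pass has clean
inodes to free. As in the description, nothing here guarantees the flusher
keeps up with the shrinker.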