Andres reported a problem recently where reading a file several times
the size of memory causes intermittent stalls.  My suspicion is that
page allocation eventually runs into the low watermark and starts to
do reclaim.  Some shrinkers take a long time to run and have a low chance
of actually freeing a page (eg the dentry cache needs to free 21 dentries
which all happen to be on the same pair of pages to free those two pages).

This patch attempts to free pages from the file that we're currently
reading from if there are no pages readily available.  If that doesn't
work, we'll run all the shrinkers just as we did before.

This should solve Andres' problem, although it's a bit narrow in scope.
It might be better to look through the inactive page list, regardless of
which file they were allocated for.  That could solve the "weekly backup"
problem with lots of little files.

I'm not really set up to do performance testing at the moment, so this
is just me thinking hard about the problem.

diff --git a/mm/readahead.c b/mm/readahead.c
index 3c9a8dd7c56c..3531e1808e24 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -111,9 +111,24 @@ int read_cache_pages(struct address_space *mapping, struct list_head *pages,
 	}
 	return ret;
 }
-
 EXPORT_SYMBOL(read_cache_pages);
 
+/*
+ * Attempt to detect a streaming workload which exceeds memory and
+ * handle it by dropping the page cache behind the active part of the
+ * file.
+ */
+static void discard_behind(struct file *file, struct address_space *mapping)
+{
+	unsigned long keep = file->f_ra.ra_pages * 2;
+
+	if (mapping->nrpages < 1000)
+		return;
+	if (file->f_ra.start < keep)
+		return;
+	invalidate_mapping_pages(mapping, 0, file->f_ra.start - keep);
+}
+
 static void read_pages(struct readahead_control *rac, struct list_head *pages,
 		bool skip_page)
 {
@@ -179,6 +194,7 @@ void page_cache_readahead_unbounded(struct address_space *mapping,
 {
 	LIST_HEAD(page_pool);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
+	gfp_t light_gfp = gfp_mask & ~__GFP_DIRECT_RECLAIM;
 	struct readahead_control rac = {
 		.mapping = mapping,
 		.file = file,
@@ -219,7 +235,11 @@ void page_cache_readahead_unbounded(struct address_space *mapping,
 			continue;
 		}
 
-		page = __page_cache_alloc(gfp_mask);
+		page = __page_cache_alloc(light_gfp);
+		if (!page) {
+			discard_behind(file, mapping);
+			page = __page_cache_alloc(gfp_mask);
+		}
 		if (!page)
 			break;
 		if (mapping->a_ops->readpages) {