On Tue, May 25, 2010 at 06:57:05AM -0400, Jeff Layton wrote: > Since 2.6.34, I've been able to consistently reproduce OOM kills when running fsstress (from the LTP suite) on CIFS. I spent some time yesterday and bisected it down to this patch: > > ---------------------[snip]--------------------- > commit 315e995c63a15cb4d4efdbfd70fe2db191917f7a > Author: Nick Piggin <npiggin@xxxxxxx> > Date: Wed Apr 21 03:18:28 2010 +0000 > > [CIFS] use add_to_page_cache_lru > > add_to_page_cache_lru is exported, so it should be used. Benefits over > using a private pagevec: neater code, 128 bytes fewer stack used, percpu > lru ordering is preserved, and finally don't need to flush pagevec > before returning so batching may be shared with other LRU insertions. > > Signed-off-by: Nick Piggin <npiggin@xxxxxxx> > Reviewed-by: Dave Kleikamp <shaggy@xxxxxxxxxxxxxxxxxx> > Signed-off-by: Steve French <sfrench@xxxxxxxxxx> > ---------------------[snip]--------------------- > > Here's how I've been reproducing it: > > Mount up a samba share with -o sec=krb5i,nounix,noserverino > > Run: fsstress -d /path/to/dir/on/cifs/ -n 1000 -l0 -p8 > > ...within an hour or two, I start getting OOM kills. After backing out > the patch above, I was able to run the test overnight. I'm not sure yet > what the actual problem is, but there seems to be something wrong with > that patch. > > Thoughts? Yep, it's my fault. The problem is the refcounting. Previously the code hands off the references to the LRU, wheras now the lru takes a new reference. (the other filesystems converted to use this function seemed to more conventionally open-code lru_cache_add). Can we get rid of a refcount increment anywhere? Otherwise we'll need to just drop the references after adding the pages. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html