On Tue, May 25, 2010 at 07:49:29AM -0400, Jeff Layton wrote: > On Tue, 25 May 2010 21:16:39 +1000 > Nick Piggin <npiggin@xxxxxxx> wrote: > > > On Tue, May 25, 2010 at 06:57:05AM -0400, Jeff Layton wrote: > > > Since 2.6.34, I've been able to consistently reproduce OOM kills when running fsstress (from the LTP suite) on CIFS. I spent some time yesterday and bisected it down to this patch: > > > > > > ---------------------[snip]--------------------- > > > commit 315e995c63a15cb4d4efdbfd70fe2db191917f7a > > > Author: Nick Piggin <npiggin@xxxxxxx> > > > Date: Wed Apr 21 03:18:28 2010 +0000 > > > > > > [CIFS] use add_to_page_cache_lru > > > > > > add_to_page_cache_lru is exported, so it should be used. Benefits over > > > using a private pagevec: neater code, 128 bytes fewer stack used, percpu > > > lru ordering is preserved, and finally don't need to flush pagevec > > > before returning so batching may be shared with other LRU insertions. > > > > > > Signed-off-by: Nick Piggin <npiggin@xxxxxxx> > > > Reviewed-by: Dave Kleikamp <shaggy@xxxxxxxxxxxxxxxxxx> > > > Signed-off-by: Steve French <sfrench@xxxxxxxxxx> > > > ---------------------[snip]--------------------- > > > > > > Here's how I've been reproducing it: > > > > > > Mount up a samba share with -o sec=krb5i,nounix,noserverino > > > > > > Run: fsstress -d /path/to/dir/on/cifs/ -n 1000 -l0 -p8 > > > > > > ...within an hour or two, I start getting OOM kills. After backing out > > > the patch above, I was able to run the test overnight. I'm not sure yet > > > what the actual problem is, but there seems to be something wrong with > > > that patch. > > > > > > Thoughts? > > > > Yep, it's my fault. The problem is the refcounting. Previously the > > code hands off the references to the LRU, wheras now the lru takes > > a new reference. (the other filesystems converted to use this > > function seemed to more conventionally open-code lru_cache_add). > > > > Can we get rid of a refcount increment anywhere? Otherwise we'll > > need to just drop the references after adding the pages. > > > > The only caller of this function is cifs_readpages, and I don't see > where it takes any references. I'm guessing that the pages come from > the VFS with the refcount already incremented? Yep. I think we should just page_cache_release after doing the add_to_page_cache_lru, like the generic code does. It's a little suboptimal tinkering with refcounts like that, but a cleanup pass to fix it up all over the tree and allow add_to_page_cache_lru to take over a reference would be better place to fix that. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html