Neil,

That's the exact same fix I started testing on Saturday. I found that
there is already a wait_event_timeout (even without your recent changes).
The thing I'm not quite sure about is what timeout it should use. A quick
search through references on LXR shows that it's not used anywhere in fs
code except for debug cases (btrfs) and network filesystems.

- Milosz

On Mon, Jul 21, 2014 at 2:40 AM, NeilBrown <neilb@xxxxxxx> wrote:
> On Sat, 19 Jul 2014 16:20:01 -0400 Milosz Tanski <milosz@xxxxxxxxx> wrote:
>
> > Neil,
> >
> > I saw your recent patchset for improving the wait_on_bit interface
> > (in particular: SCHED: allow wait_on_bit_action functions to support a
> > timeout.) I'm looking for some guidance on leveraging that work to
> > solve another recursive lock hang in fscache.
> >
> > I've run into issues similar to the ones you're trying to solve with
> > loopback NFS, but in the fscache code. This happens under heavy vma
> > pressure when the kernel is aggressively trying to trim the page cache.
> >
> > The hang is caused by this series of events:
> > 1. cachefiles_write_page - cachefiles (the fscache backend, sitting on
> >    ext4) tries to write a page to disk
> > 2. ext4 tries to allocate a page in writeback (without GFP_NOFS and
> >    with the wait flag)
> > 3. due to vma pressure the kernel tries to free up pages
> > 4. this causes release pages in ceph to be called
> > 5. the selected page is a cached page in the process of write out
> >    (from step #1)
> > 6. fscache_wait_on_page_write hangs forever
> >
> > Is there a solution that you have to NFS as another patch that
> > implements the timeout that I can use as a template? I'm not familiar
> > with that piece of the code base.
>
> It looks like the comment in __fscache_maybe_release_page
>
>         /* We will wait here if we're allowed to, but that could deadlock the
>          * allocator as the work threads writing to the cache may all end up
>          * sleeping on memory allocation, so we may need to impose a timeout
>          * too.
>          */
>
> is correct when it says "we may need to impose a timeout".
> The following __fscache_wait_on_page_write() needs to time out.
>
> However that doesn't use wait_on_bit(), it just has a simple wait_event.
> So something like this should fix it (or should at least move the problem
> along a bit).
>
> NeilBrown
>
>
>
> diff --git a/fs/fscache/page.c b/fs/fscache/page.c
> index ed70714503fa..58035024c5cf 100644
> --- a/fs/fscache/page.c
> +++ b/fs/fscache/page.c
> @@ -43,6 +43,13 @@ void __fscache_wait_on_page_write(struct fscache_cookie *cookie, struct page *pa
>  }
>  EXPORT_SYMBOL(__fscache_wait_on_page_write);
>
> +void __fscache_wait_on_page_write_timeout(struct fscache_cookie *cookie, struct page *page, unsigned long timeout)
> +{
> +        wait_queue_head_t *wq = bit_waitqueue(&cookie->flags, 0);
> +
> +        wait_event_timeout(*wq, !__fscache_check_page_write(cookie, page), timeout);
> +}
> +
>  /*
>   * decide whether a page can be released, possibly by cancelling a store to it
>   * - we're allowed to sleep if __GFP_WAIT is flagged
> @@ -115,7 +122,7 @@ page_busy:
>          }
>
>          fscache_stat(&fscache_n_store_vmscan_wait);
> -        __fscache_wait_on_page_write(cookie, page);
> +        __fscache_wait_on_page_write_timeout(cookie, page, HZ);
>          gfp &= ~__GFP_WAIT;
>          goto try_again;
>  }

-- 
Milosz Tanski
CTO
16 East 34th Street, 15th floor
New York, NY 10016

p: 646-253-9055
e: milosz@xxxxxxxxx
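
On the timeout question: wait_event_timeout() returns 0 if the condition
is still false when the timeout expires, and the remaining jiffies (at
least 1) otherwise, so a caller can tell a timeout from a completed wait.
Below is a minimal userspace sketch of that return convention only. It
assumes a polling loop in place of a real wait queue, and the names
wait_cond_timeout(), page_write_done() and now_ms() are hypothetical
stand-ins, not kernel or fscache API.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Current time in milliseconds on a monotonic clock. */
static long long now_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000LL + ts.tv_nsec / 1000000L;
}

/*
 * Hypothetical userspace stand-in for wait_event_timeout(): it polls
 * cond() about once per millisecond instead of sleeping on a wait queue.
 * Returns 0 if cond() is still false once timeout_ms has elapsed,
 * otherwise the remaining time in milliseconds (at least 1).
 */
static long wait_cond_timeout(bool (*cond)(void *), void *arg, long timeout_ms)
{
    const struct timespec tick = { 0, 1000000L }; /* 1 ms between polls */
    long long deadline;

    assert(cond != NULL);
    deadline = now_ms() + timeout_ms;

    for (;;) {
        long long left = deadline - now_ms();

        if (cond(arg))
            return left > 0 ? (long)left : 1;
        if (left <= 0)
            return 0; /* timed out, condition still false */
        nanosleep(&tick, NULL);
    }
}

/* Hypothetical condition: true once the pending page write has finished. */
static bool page_write_done(void *arg)
{
    return !*(const bool *)arg;
}
```

In the patch above the return value is ignored: after at most HZ jiffies
the caller clears __GFP_WAIT and jumps back to try_again, so the wait can
no longer block forever regardless of which case occurred.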