On Fri, 11 Jul 2014, Weston Andros Adamson wrote: > My recent pgio work added the ability to split requests into sub-page > regions, but didn't handle a few places in the writeback code where > requests are looked up by struct page and may already be split into > multiple requests. > > This patchset adds a function "nfs_lock_and_join_requests" in patch > "nfs: handle multiple reqs in nfs_page_async_flush", which: > - takes mutex lock > - looks up head request > - grabs request lock for each subrequest > - if unsuccessful, unrolls old locks and waits on subrequest > - removes all requests from commit lists > - merges range of subrequests into the head requests > - unlinks and destroys the old subrequests. > > The other patches are related fixes. > > The problem showed up when mounting with wsize < PAGE_SIZE - this would > cause multiple requests per page. If a commit failed, nfs_page_async_flush > would operate just on the head request, leading to a hang. > > The nfs_wb_page_cancel patch leverages the same function - > nfs_lock_and_join_requests cancels all operations on the page group. I've had > a really hard time testing nfs_wb_page_cancel, I've only hit it once in weeks of > testing. Any ideas on how to reliably trigger this is appreciated - it's not > as easy as just kicking off a ton of writeback then truncating. The one time I > did see it was with a ton of i/o on a VM with 256M of RAM, which was swapping > like crazy, along with restarting the server repeatedly (to get commit verifier > mismatch). Hey Dros, it's a year later -- but I want to report that this set fixes a rare race where nfs_wb_page_cancel() and nfs_commit_release_pages() both remove the same request from an inode, resulting in an underflow in nfs_inode->nrequests, and finally a BUG (now changed to a WARN) when the inode is cleaned up. The fix is that nfs_lock_and_join_requests() holds i_lock for the duration of checking PagePrivate and trying to lock the nfs_page, and retrying or returning NULL. Previously, nfs_wb_page_cancel() held the i_lock while checking PagePrivate, then released it which allowed the page to be removed and then unlocked in nfs_commit_release_pages(), and then nfs_wb_page_cancel() could lock and remove it as well. I've had good luck exercising nfs_wb_page_cancel() by racing writes and truncates on a mount with small nfs page size.. but there are lots of wait_on_page_writeback() and friends in the pagth that close the race window. Better luck can be had using madvise w/ the hardware poision flag, since that jumps you past a bunch of page invalidation checks, but I wonder if that is a realistic test of real world conditions.. Anyway, thanks for these.. I have to see what of this work I can move back into the already long-in-the-tooth RHEL6. Ben -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html