On Jan 24, 2014, at 8:52, Jeff Layton <jlayton@xxxxxxxxxx> wrote: > On Wed, 22 Jan 2014 07:04:09 -0500 > Jeff Layton <jlayton@xxxxxxxxxx> wrote: > >> On Wed, 22 Jan 2014 00:24:14 -0800 >> Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote: >> >>> On Tue, Jan 21, 2014 at 02:21:59PM -0500, Jeff Layton wrote: >>>> In any case, this helps but it's a little odd. With this patch, you add >>>> an invalidate_inode_pages2 call prior to doing the DIO. But, you've >>>> also left in the call to nfs_zap_mapping in the completion codepath. >>>> >>>> So now, we shoot down the mapping prior to doing a DIO write, and then >>>> mark the mapping for invalidation again when the write completes. Was >>>> that intentional? >>>> >>>> It seems a little excessive and might hurt performance in some cases. >>>> OTOH, if you mix buffered and DIO you're asking for trouble anyway and >>>> this approach seems to give better cache coherency. >>> >>> Thile follows the model implemented and documented in >>> generic_file_direct_write(). >>> >> >> Ok, thanks. That makes sense, and the problem described in those >> comments is almost exactly the one I've seen in practice. >> >> I'm still not 100% thrilled with the way that the NFS_INO_INVALID_DATA >> flag is handled, but that really has nothing to do with this patchset. >> >> You can add my Tested-by to the set if you like... >> > > (re-sending with Trond's address fixed) > > I may have spoken too soon... > > This patchset didn't fix the problem once I cranked up the concurrency > from 100 child tasks to 1000. I think that HCH's patchset makes sense > and helps narrow the race window some, but the way that > nfs_revalidate_mapping/nfs_invalidate_mapping work is just racy. > > The following patch does seem to fix it however. It's a combination of > a test patch that Trond gave me a while back and another change to > serialize the nfs_invalidate_mapping ops. > > I think it's a reasonable approach to deal with the problem, but we > likely have some other areas that will need similar treatment since > they also check NFS_INO_INVALID_DATA: > > nfs_write_pageuptodate > nfs_readdir_search_for_cookie > nfs_update_inode > > Trond, thoughts? It's not quite ready for merge, but I'd like to get an > opinion on the basic approach, or whether you have an idea of how > better handle the races here: I think that it is reasonable for nfs_revalidate_mapping, but I don’t see how it is relevant to nfs_update_inode or nfs_write_pageuptodate. Readdir already has its own locking at the VFS level, so we shouldn’t need to care there. Cheers Trond -- Trond Myklebust Linux NFS client maintainer -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html