On Mon, 2018-03-12 at 08:07 -0400, Scott Mayhew wrote: > On Thu, 08 Mar 2018, Trond Myklebust wrote: > > > On Thu, 2018-03-08 at 08:09 -0500, Scott Mayhew wrote: > > > Yes, this works. I ran it through a dozen fio runs on v4.1 and > > > 1000 > > > runs > > > of generic/247 on v3/v4.0/v4.1/v4.2 and didn't see any EBUSY > > > errors. > > > Also ran the xfstests "quick" group (~80-90 tests) plus > > > generic/074 > > > on > > > v3/v4.0/v4.1/v4.2. Finally, I double checked the panic on umount > > > issue > > > that dc4fd9ab01ab3 fixed and that still works too. > > > > I took a long hard look at what we actually need in that area of > > the > > code. There are a few things that are still broken there: > > > > Firstly, we want to keep the inode marked as I_DIRTY_DATASYNC as > > long > > as we have stable writes that are undergoing commit or are waiting > > to > > be scheduled. The reason is that ensures sync_inode() behaves > > correctly > > by calling into nfs_write_inode() so that we can schedule COMMITs > > and > > wait for them all to complete. > > Currently we are broken in that nfs_write_inode() will not reset > > I_DIRTY_DATASYNC if there are still COMMITs in flight due to having > > called it with wbc->sync_mode == WB_SYNC_NONE. > > > > Secondly, we want to ensure that if the number of requests is > > > INT_MAX, we loop around and schedule more COMMITs so that > > nfs_commit_inode(inode, FLUSH_SYNC) is reliable on systems with > > lots of > > memory. > > > > Finally, it is worth noting that it's only when called from > > __writeback_single_inode(), and the attempt to clean the inode > > failed > > that we need to reset the inode state. So we can optimise by > > pushing > > those calls to __mark_inode_dirty() into nfs_write_inode(). > > > > So how about the following v2 patch instead? > > 8<-------------------------------------------- > > From 386978cc3ef4494b9f95390747c2268f8318b94b Mon Sep 17 00:00:00 > > 2001 > > From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> > > Date: Wed, 7 Mar 2018 15:22:31 -0500 > > Subject: [PATCH v2] NFS: Fix unstable write completion > > > > We do want to respect the FLUSH_SYNC argument to nfs_commit_inode() > > to > > ensure that all outstanding COMMIT requests to the inode in > > question are > > complete. Currently we may exit early from both nfs_commit_inode() > > and > > nfs_write_inode() even if there are COMMIT requests in flight, or > > unstable > > writes on the commit list. > > > > In order to get the right semantics w.r.t. sync_inode(), we don't > > need > > to have nfs_commit_inode() reset the inode dirty flags when called > > from > > nfs_wb_page() and/or nfs_wb_all(). We just need to ensure that > > nfs_write_inode() leaves them in the right state if there are > > outstanding > > commits, or stable pages. > > > > Reported-by: Scott Mayhew <smayhew@xxxxxxxxxx> > > Fixes: dc4fd9ab01ab ("nfs: don't wait on commit in > > nfs_commit_inode()...") > > Cc: stable@xxxxxxxxxxxxxxx # v4.5+: 5cb953d4b1e7: NFS: Use an > > atomic_long_t > > Cc: stable@xxxxxxxxxxxxxxx # v4.5+ > > Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> > > I ran all the same tests as before and this is working fine. > Thanks Scott! -- Trond Myklebust Linux NFS client maintainer, PrimaryData trond.myklebust@xxxxxxxxxxxxxxx ��.n��������+%������w��{.n�����{��w���jg��������ݢj����G�������j:+v���w�m������w�������h�����٥