> Begin forwarded message: > > From: Christoph Hellwig <hch@xxxxxx> > Subject: [PATCH v2] nfsd: special case truncates some more > Date: January 24, 2017 at 3:22:41 AM EST > To: bfields@xxxxxxxxxxxx, jlayton@xxxxxxxxxxxxxxx > Cc: trondmy@xxxxxxxxxxxxxxx, linux-nfs@xxxxxxxxxxxxxxx > > Both the NFS protocols and the Linux VFS use a setattr operation with a > bitmap of attributs to set to set various file attributes including the > file size and the uid/gid. > > The Linux syscalls never mixe size updates with unrelated updates like > the uid/gid, and some file systems like XFS and GFS2 rely on the fact > that truncates might not update random other attributes, and many > other file systems handle the case but do not update the different > attributes in the same transaction. NFSD on the other hand passes > the attributes it gets on the wire more or less directly through to > the VFS, leading to updates the file systems don't expect. XFS at > least has an assert on the allowed attributes, which cought an NFS > client sets the size and group at the same time. > > To handles this issue properly this switches nfsd to call vfs_truncate > for size changes, and then handling all other attributes through > notify_change. As a side effect this also means less boilerplace > code around the size change as we can now reuse the VFS code. > > Signed-off-by: Christoph Hellwig <hch@xxxxxx> > --- > > Changes since V1: > - avoid an extra setattr just for a MTIME attribute on the wire > > fs/nfsd/vfs.c | 97 +++++++++++++++++++++++------------------------------------ > 1 file changed, 37 insertions(+), 60 deletions(-) > > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c > index 26c6fdb..e91107a 100644 > --- a/fs/nfsd/vfs.c > +++ b/fs/nfsd/vfs.c > @@ -332,37 +332,6 @@ nfsd_sanitize_attrs(struct inode *inode, struct iattr *iap) > } > } > > -static __be32 > -nfsd_get_write_access(struct svc_rqst *rqstp, struct svc_fh *fhp, > - struct iattr *iap) > -{ > - struct inode *inode = d_inode(fhp->fh_dentry); > - int host_err; > - > - if (iap->ia_size < inode->i_size) { > - __be32 err; > - > - err = nfsd_permission(rqstp, fhp->fh_export, fhp->fh_dentry, > - NFSD_MAY_TRUNC | NFSD_MAY_OWNER_OVERRIDE); > - if (err) > - return err; > - } > - > - host_err = get_write_access(inode); > - if (host_err) > - goto out_nfserrno; > - > - host_err = locks_verify_truncate(inode, NULL, iap->ia_size); > - if (host_err) > - goto out_put_write_access; > - return 0; > - > -out_put_write_access: > - put_write_access(inode); > -out_nfserrno: > - return nfserrno(host_err); > -} > - > /* > * Set various file attributes. After this call fhp needs an fh_put. > */ > @@ -377,7 +346,6 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, > __be32 err; > int host_err; > bool get_write_count; > - int size_change = 0; > > if (iap->ia_valid & (ATTR_ATIME | ATTR_MTIME | ATTR_SIZE)) > accmode |= NFSD_MAY_WRITE|NFSD_MAY_OWNER_OVERRIDE; > @@ -390,11 +358,11 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, > /* Get inode */ > err = fh_verify(rqstp, fhp, ftype, accmode); > if (err) > - goto out; > + return err; > if (get_write_count) { > host_err = fh_want_write(fhp); > if (host_err) > - return nfserrno(host_err); > + goto out_host_err; > } > > dentry = fhp->fh_dentry; > @@ -405,50 +373,59 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap, > iap->ia_valid &= ~ATTR_MODE; > > if (!iap->ia_valid) > - goto out; > + return 0; > > nfsd_sanitize_attrs(inode, iap); > > + if (check_guard && guardtime != inode->i_ctime.tv_sec) > + return nfserr_notsync; > + > /* > * The size case is special, it changes the file in addition to the > - * attributes. > + * attributes, and file systems don't expect it to be mixed with > + * "random" attribute changes. We thus split out the size change > + * into a separate calo for vfs_truncate, and do the rest as a > + * a separate setattr call. > */ > if (iap->ia_valid & ATTR_SIZE) { > - err = nfsd_get_write_access(rqstp, fhp, iap); > - if (err) > - goto out; > - size_change = 1; > + struct path path = { > + .mnt = fhp->fh_export->ex_path.mnt, > + .dentry = dentry, > + }; > + bool implicit_mtime = false; > > /* > - * RFC5661, Section 18.30.4: > - * Changing the size of a file with SETATTR indirectly > - * changes the time_modify and change attributes. > - * > - * (and similar for the older RFCs) > + * vfs_truncate implicity updates the mtime IFF the file size > + * actually changes. Avoid the additional seattr call below if > + * the only other attribute that the client sends is the mtime. > */ > - if (iap->ia_size != i_size_read(inode)) > - iap->ia_valid |= ATTR_MTIME; > - } > + if (iap->ia_size != i_size_read(inode) && > + ((iap->ia_valid & ~(ATTR_SIZE | ATTR_MTIME)) == 0)) > + implicit_mtime = true; > > - iap->ia_valid |= ATTR_CTIME; > + host_err = vfs_truncate(&path, iap->ia_size); > + if (host_err) > + goto out_host_err; > > - if (check_guard && guardtime != inode->i_ctime.tv_sec) { > - err = nfserr_notsync; > - goto out_put_write_access; > + iap->ia_valid &= ~ATTR_SIZE; > + if (implicit_mtime) > + iap->ia_valid &= ~ATTR_MTIME; > + if (!iap->ia_valid) > + goto done; > } > > + iap->ia_valid |= ATTR_CTIME; > + > fh_lock(fhp); > host_err = notify_change(dentry, iap, NULL); > fh_unlock(fhp); > - err = nfserrno(host_err); > + if (host_err) > + goto out_host_err; > > -out_put_write_access: > - if (size_change) > - put_write_access(inode); > - if (!err) > - err = nfserrno(commit_metadata(fhp)); > -out: > - return err; > +done: > + host_err = commit_metadata(fhp); > +out_host_err: > + return nfserrno(host_err); > } > > #if defined(CONFIG_NFSD_V4) > -- > 2.1.4 After installing v4.10-rc7 on my test NFS server, I'm seeing a failure with the git regression test suite. *** t1050-large.sh *** ok 1 - setup ok 2 - add a large file or two ok 3 - checkout a large file not ok 4 - packsize limit # # test_create_repo mid && # ( # cd mid && # git config core.bigfilethreshold 64k && # git config pack.packsizelimit 256k && # # # mid1 and mid2 will fit within 256k limit but # # appending mid3 will bust the limit and will # # result in a separate packfile. # test-genrandom "a" $(( 66 * 1024 )) >mid1 && # test-genrandom "b" $(( 80 * 1024 )) >mid2 && # test-genrandom "c" $(( 128 * 1024 )) >mid3 && # git add mid1 mid2 mid3 && # # count=0 # for pi in .git/objects/pack/pack-*.idx # do # test -f "$pi" && count=$(( $count + 1 )) # done && # test $count = 2 && # # ( # git hash-object --stdin <mid1 # git hash-object --stdin <mid2 # git hash-object --stdin <mid3 # ) | # sort >expect && # # for pi in .git/objects/pack/pack-*.idx # do # git show-index <"$pi" # done | # sed -e "s/^[0-9]* \([0-9a-f]*\) .*/\1/" | # sort >actual && # # test_cmp expect actual # ) # ok 5 - diff --raw ok 6 - diff --stat ok 7 - diff Confirmed with NFSv3 / TCP and NFSv4.0 / TCP. Failure is 100% reproducible. If I revert the above commit on the server, the git regression test suite passes. The test is run like this: "cd /mnt/server/git; make clean; make; make test" Can anyone else reproduce this failure? -- Chuck Lever chucklever@xxxxxxxxx -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html