Re: fstests failures with NFSD attribute delegation support

On Sun, 2024-12-29 at 17:37 -0500, Chuck Lever wrote:
> On 12/19/24 3:15 PM, Chuck Lever wrote:
> > On 12/18/24 4:02 PM, Chuck Lever wrote:
> > > Hi -
> > > 
> > > I'm testing the NFSD support for attribute delegation, and seeing these
> > > two new fstests failures: generic/647 and generic/729. Both tests emit
> > > this error message:
> > > 
> > >    mmap-rw-fault: pread /media/test/mmap-rw-fault.tmp (O_DIRECT): 0 != 4096: Bad address
> > > 
> > > This is 100% reproducible with the new patches applied to the server,
> > > and 100% not reproducible when they are not applied on the server.
> > > 
> > > The failure is due to pread64() (on the client) returning EFAULT. On
> > > the wire, the passing test does:
> > > 
> > > SETATTR (size = 0)
> > > WRITE (offset = 4096, len = 4096)
> > > READ (offset = 0, len = 8192)
> > > READ (offset = 4096, len = 4096)
> > > SETATTR (size = 0)
> > >   [ continues until test passes ]
> > > 
> > > The failing test does:
> > > 
> > > SETATTR (size = 0)
> > > WRITE (offset = 4096, len = 4096)
> > >   [ the failed pread64 seems to occur here ]
> > > CLOSE
> > > 
> > > In other words, in the failing case, the client does not emit READs
> > > to pull in the changed file content.
> > > 
> > > The test is using O_DIRECT so I function-traced
> > > nfs_direct_read_schedule_iovec(). In the passing case, this function
> > > generates the usual set of NFS READs on the wire and returns
> > > successfully.
> > > 
> > > In the failing case, iov_iter_get_pages_alloc2() invokes
> > > get_user_pages_fast(), and that appears to fail immediately:
> > > 
> > >     mmap-rw-fault-623256 [016] 175303.310394: funcgraph_entry:              |  get_user_pages_fast() {
> > >     mmap-rw-fault-623256 [016] 175303.310395: funcgraph_entry:              |    gup_fast_fallback() {
> > >     mmap-rw-fault-623256 [016] 175303.310395: funcgraph_entry:   0.262 us   |      __pte_offset_map();
> > >     mmap-rw-fault-623256 [016] 175303.310395: funcgraph_entry:   0.142 us   |      __rcu_read_unlock();
> > >     mmap-rw-fault-623256 [016] 175303.310396: funcgraph_entry:   7.824 us   |      __gup_longterm_locked();
> > >     mmap-rw-fault-623256 [016] 175303.310404: funcgraph_exit:    8.967 us   |    }
> > >     mmap-rw-fault-623256 [016] 175303.310404: funcgraph_exit:    9.224 us   |  }
> > >     mmap-rw-fault-623256 [016] 175303.310404: funcgraph_entry:              |  kvfree() {
> > > 
> > > My guess is the cached inode file size is still zero.
> > 
> > Confirmed: in the failing case, the read fails because the cached
> > file size is still zero. In the passing case, the cached file size is
> > 8192 before the read.
> > 
> > During the test, the client truncates the file, then performs an NFS
> > WRITE to the server, extending the size of the file. When an attribute
> > delegation is in effect, that size extension isn't reflected in the
> > cached value of i_size -- the client ensures that INVALID_SIZE is
> > always clear.
> > 
> > But perhaps the NFS client is relying on the client's VFS to maintain
> > i_size...? The NFS client has its own direct I/O implementation, so
> > perhaps an i_size update is missing there.
> 
> Because the client never retrieves the file's size from the server
> during either the passing or failing cases, this appears to be a client
> bug.
> 
> The bug is in nfs_writeback_update_inode() -- if mtime is delegated, it
> skips the file extension check, and the file size cached on the client
> remains zero after the WRITE completes.
> 
> The culprit is commit e12912d94137 ("NFSv4: Add support for delegated
> atime and mtime attributes"). If I remove the hunk that this commit
> adds to nfs_writeback_update_inode(), both generic/647 and generic/729
> pass.
> 
> 
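
To make the quoted analysis concrete, here is a toy model of the size
bookkeeping it describes. This is only a sketch, not the kernel code:
CachedInode, writeback_update() and size_after_test_sequence() are
hypothetical names, and the early return simply mirrors the statement
that the file extension check is skipped when mtime is delegated.

```python
from dataclasses import dataclass


@dataclass
class CachedInode:
    """Hypothetical stand-in for the client's cached inode state."""
    cached_size: int = 0
    mtime_delegated: bool = False


def writeback_update(inode: CachedInode, offset: int, count: int) -> None:
    """Toy model of the post-WRITE size update described in the thread."""
    if inode.mtime_delegated:
        # Assumption taken from the report: with a delegated mtime the
        # file extension check is skipped, so the cached size is untouched.
        return
    end = offset + count
    if end > inode.cached_size:
        # The WRITE extended the file: grow the cached size to match.
        inode.cached_size = end


def size_after_test_sequence(mtime_delegated: bool) -> int:
    """Replay SETATTR (size = 0) followed by WRITE (offset = 4096, len = 4096)."""
    inode = CachedInode(mtime_delegated=mtime_delegated)
    inode.cached_size = 0                  # SETATTR (size = 0)
    writeback_update(inode, 4096, 4096)    # WRITE (offset = 4096, len = 4096)
    return inode.cached_size
```

In this model, size_after_test_sequence(False) yields 8192 and
size_after_test_sequence(True) yields 0, which matches the cached sizes
reported for the passing and failing cases.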

I'm confused... If O_DIRECT is set on open(), then the NFSv4.x (x>0)
client will set NFS4_SHARE_WANT_NO_DELEG. Furthermore, it should not
set either NFS4_SHARE_WANT_DELEG_TIMESTAMPS or
NFS4_SHARE_WANT_OPEN_XOR_DELEGATION.

So why is that commit relevant?
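
The expectation above can be written down as a small sketch. This is
not the client's actual open path: want_flags() and its
server_offers_deleg_attrs parameter are hypothetical, the numeric
values are what I believe the Linux NFSv4 header defines and should be
treated as assumptions, and the non-O_DIRECT branch is a guess added
only for contrast.

```python
# Constant values are assumptions; check the kernel headers before relying on them.
NFS4_SHARE_WANT_NO_DELEG = 0x0400
NFS4_SHARE_WANT_DELEG_TIMESTAMPS = 0x100000
NFS4_SHARE_WANT_OPEN_XOR_DELEGATION = 0x200000


def want_flags(o_direct: bool, minorversion: int,
               server_offers_deleg_attrs: bool = True) -> int:
    """Return the share_access 'want' bits this sketch would send on OPEN."""
    if minorversion == 0:
        # The want bits only apply to NFSv4.x with x > 0.
        return 0
    if o_direct:
        # O_DIRECT opens ask for no delegation and nothing else.
        return NFS4_SHARE_WANT_NO_DELEG
    flags = 0
    if server_offers_deleg_attrs:
        # Assumed behaviour for ordinary opens, shown only for contrast.
        flags |= NFS4_SHARE_WANT_DELEG_TIMESTAMPS
        flags |= NFS4_SHARE_WANT_OPEN_XOR_DELEGATION
    return flags
```

Under these assumptions, want_flags(True, 2) carries only
NFS4_SHARE_WANT_NO_DELEG, so an O_DIRECT open would not be expected to
end up holding delegated timestamps.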

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx





