Re: fstests failures with NFSD attribute delegation support

On 12/19/24 3:15 PM, Chuck Lever wrote:
On 12/18/24 4:02 PM, Chuck Lever wrote:
Hi -

I'm testing the NFSD support for attribute delegation, and seeing these
two new fstests failures: generic/647 and generic/729. Both tests emit
this error message:

   mmap-rw-fault: pread /media/test/mmap-rw-fault.tmp (O_DIRECT): 0 != 4096: Bad address

This is 100% reproducible with the new patches applied to the server,
and never reproduces when they are not applied.

The failure is due to pread64() (on the client) returning EFAULT. On
the wire, the passing test does:

SETATTR (size = 0)
WRITE (offset = 4096, len = 4096)
READ (offset = 0, len = 8192)
READ (offset = 4096, len = 4096)
SETATTR (size = 0)
  [ continues until test passes ]

The failing test does:

SETATTR (size = 0)
WRITE (offset = 4096, len = 4096)
  [ the failed pread64 seems to occur here ]
CLOSE

In other words, in the failing case, the client does not emit READs
to pull in the changed file content.

The test uses O_DIRECT, so I function-traced
nfs_direct_read_schedule_iovec(). In the passing case, this function
generates the usual set of NFS READs on the wire and returns
successfully.

In the failing case, iov_iter_get_pages_alloc2() invokes
get_user_pages_fast(), and that appears to fail immediately:

    mmap-rw-fault-623256 [016] 175303.310394: funcgraph_entry: |        get_user_pages_fast() {
    mmap-rw-fault-623256 [016] 175303.310395: funcgraph_entry: |          gup_fast_fallback() {
    mmap-rw-fault-623256 [016] 175303.310395: funcgraph_entry: 0.262 us   |          __pte_offset_map();
    mmap-rw-fault-623256 [016] 175303.310395: funcgraph_entry: 0.142 us   |          __rcu_read_unlock();
    mmap-rw-fault-623256 [016] 175303.310396: funcgraph_entry: 7.824 us   |          __gup_longterm_locked();
    mmap-rw-fault-623256 [016] 175303.310404: funcgraph_exit: 8.967 us |        }
    mmap-rw-fault-623256 [016] 175303.310404: funcgraph_exit: 9.224 us |      }
    mmap-rw-fault-623256 [016] 175303.310404: funcgraph_entry: |        kvfree() {

My guess is the cached inode file size is still zero.

Confirmed: in the failing case, the read fails because the cached
file size is still zero. In the passing case, the cached file size is
8192 before the read.

During the test, the client truncates the file, then performs an NFS
WRITE to the server, extending the size of the file. When an attribute
delegation is in effect, that size extension isn't reflected in the
cached value of i_size -- the client ensures that INVALID_SIZE is
always clear.

But perhaps the NFS client is relying on the client's VFS to maintain
i_size...? The NFS client has its own direct I/O implementation, so
perhaps an i_size update is missing there.

Because the client never retrieves the file's size from the server
during either the passing or failing cases, this appears to be a client
bug.

The bug is in nfs_writeback_update_inode() -- if mtime is delegated, it
skips the file extension check, and the file size cached on the client
remains zero after the WRITE completes.
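
To illustrate the effect described above, here is a minimal userspace
sketch. It is not the kernel code: struct model_inode, model_write_done()
and the skip_when_delegated flag are hypothetical names invented for this
illustration, and the real nfs_writeback_update_inode() differs in
detail. It only models the idea that a completed WRITE ending past the
cached size should grow that size, and that returning early when mtime
is delegated leaves the cached size stale.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the client's cached inode state. */
struct model_inode {
	uint64_t cached_size;     /* models the cached i_size */
	bool     mtime_delegated; /* models "mtime is delegated" */
};

/*
 * Model of write completion. When skip_when_delegated is true, the
 * function returns before the file extension check if mtime is
 * delegated, which is the behavior that leaves the cached size at
 * zero. When it is false, the extension check always runs.
 */
static void model_write_done(struct model_inode *inode,
			     uint64_t offset, uint64_t count,
			     bool skip_when_delegated)
{
	uint64_t end = offset + count;

	assert(inode != NULL);

	if (inode->mtime_delegated && skip_when_delegated)
		return;

	/* File extension check: grow the cached size if the write
	 * ended past it. */
	if (end > inode->cached_size)
		inode->cached_size = end;
}
```

With the sequence from the test (size truncated to 0, then a WRITE at
offset 4096 of length 4096), the model leaves cached_size at 0 when the
check is skipped and sets it to 8192 when the check runs, which matches
the sizes observed in the failing and passing cases.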

The culprit is commit e12912d94137 ("NFSv4: Add support for delegated
atime and mtime attributes"). If I remove the hunk that this commit
adds to nfs_writeback_update_inode(), both generic/647 and generic/729
pass.


--
Chuck Lever