Re: recent intermittent fsx-related failures

> On Jul 23, 2021, at 4:24 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
> 
> On Fri, 2021-07-23 at 20:12 +0000, Chuck Lever III wrote:
>> Hi-
>> 
>> I noticed recently that generic/075, generic/112, and generic/127
>> were
>> failing intermittently on NFSv3 mounts. All three of these tests are
>> based on fsx.
>> 
>> "git bisect" landed on this commit:
>> 
>> 7b24dacf0840 ("NFS: Another inode revalidation improvement")
>> 
>> After reverting 7b24dacf0840 on v5.14-rc1, I can no longer reproduce
>> the test failures.
>> 
>> 
> 
> So you are seeing file metadata updates that end up not changing the
> ctime?

As far as I can tell, a WRITE and two SETATTRs are happening in
sequence to the same file during the same jiffy. The WRITE does
not report pre/post attributes, but the SETATTRs do. The reported
pre- and post- mtime and ctime are all the same value for both
SETATTRs, I believe due to timestamp_truncate().

My theory is that filesystems backed by persistent storage are
slow enough that this rarely becomes a significant problem. But
with tmpfs, it can happen often enough that the client gets
confused. And I can make the problem unreproducible if I enable
enough debugging paraphernalia on the server to slow it down.
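
A rough sketch of the theory above, as a toy model rather than kernel
code. It assumes timestamps are rounded down to a coarse tick; the
TICK_NS value and the coarse_stamp() and ctime_changed() helpers are
hypothetical names made up for illustration. It only shows why a
ctime comparison cannot tell apart operations that land in the same
tick, and why a slower server hides the effect.

```python
# Hypothetical model, not kernel code: timestamps are rounded down to a
# coarse granularity, so operations inside one tick share a ctime.
TICK_NS = 4_000_000  # assumed tick length (4 ms); purely illustrative


def coarse_stamp(now_ns, gran_ns=TICK_NS):
    """Round a nanosecond timestamp down to a multiple of gran_ns."""
    return now_ns - (now_ns % gran_ns)


def ctime_changed(pre_ns, post_ns):
    """Change detection that relies only on comparing ctime values."""
    return pre_ns != post_ns


# Fast server: three operations 100 microseconds apart, same tick.
base = 1_000 * TICK_NS
events = [("WRITE", base + 100_000),
          ("SETATTR", base + 200_000),
          ("SETATTR", base + 300_000)]
stamps = [coarse_stamp(t) for _, t in events]
same_tick_detected = [ctime_changed(stamps[i], stamps[i + 1])
                      for i in range(len(stamps) - 1)]

# Slow server: the same operations, each landing in its own tick.
slow = [coarse_stamp(base + i * TICK_NS + 100_000) for i in range(3)]
slow_detected = [ctime_changed(slow[i], slow[i + 1]) for i in range(2)]

print(same_tick_detected)  # no change visible between operations
print(slow_detected)       # every change visible
```

In the fast case every operation reports the same ctime, so a
comparison of pre- and post-operation values sees no change at all.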

I'm not exactly sure how the client becomes confused by this
behavior, but fsx reports a stale size value, or it can hit a
bus error. I'm seeing at least four of the fsx-based xfstests
fail intermittently.


--
Chuck Lever






