> On Sep 19, 2021, at 7:19 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
>
> On Sun, 2021-09-19 at 23:03 +0000, Chuck Lever III wrote:
>>
>>> On Jul 23, 2021, at 4:24 PM, Trond Myklebust
>>> <trondmy@xxxxxxxxxxxxxxx> wrote:
>>>
>>> On Fri, 2021-07-23 at 20:12 +0000, Chuck Lever III wrote:
>>>> Hi-
>>>>
>>>> I noticed recently that generic/075, generic/112, and generic/127
>>>> were failing intermittently on NFSv3 mounts. All three of these
>>>> tests are based on fsx.
>>>>
>>>> "git bisect" landed on this commit:
>>>>
>>>> 7b24dacf0840 ("NFS: Another inode revalidation improvement")
>>>>
>>>> After reverting 7b24dacf0840 on v5.14-rc1, I can no longer
>>>> reproduce the test failures.
>>>>
>>>
>>> So you are seeing file metadata updates that end up not changing
>>> the ctime?
>>
>> As far as I can tell, a WRITE and two SETATTRs are happening in
>> sequence to the same file during the same jiffy. The WRITE does
>> not report pre/post attributes, but the SETATTRs do. The reported
>> pre- and post- mtime and ctime are all the same value for both
>> SETATTRs, I believe due to timestamp_truncate().
>>
>> My theory is that persistent-storage-backed filesystems seem to
>> go slow enough that it doesn't become a significant problem. But
>> with tmpfs, this can happen often enough that the client gets
>> confused. And I can make the problem unreproducable if I enable
>> enough debugging paraphernalia on the server to slow it down.
>>
>> I'm not exactly sure how the client becomes confused by this
>> behavior, but fsx reports a stale size value, or it can hit a
>> bus error. I'm seeing at least four of the fsx-based xfs tests
>> fail intermittently.
>
> It really isn't a client problem then. If the server is failing to
> update the timestamps, then you gets what you gets.

I don't think it's as simple as that.

The Linux VFS has clamped the resolution of file timestamps since
before the git era began. See current_time() and its ancestors.
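
As a rough illustration of what that clamping means for change
detection, here is a toy model. It is not kernel code: the 4 ms tick
(TICK_NS) and both helper names are assumptions made up for the
example, and the real granularity depends on the kernel configuration
and the filesystem. It only shows that operations landing inside one
tick end up with a single timestamp value, so a comparison that looks
at ctime alone cannot tell them apart.

```python
# Toy model only, not kernel code. TICK_NS and both helper names are
# assumptions made up for this illustration.
TICK_NS = 4_000_000  # assumed 4 ms tick, i.e. HZ=250


def coarse_stamp(now_ns: int, gran_ns: int = TICK_NS) -> int:
    """Round a nanosecond clock reading down to a multiple of gran_ns."""
    return now_ns - (now_ns % gran_ns)


def ctime_says_changed(cached_ctime: int, fresh_ctime: int) -> bool:
    """A change detector that trusts nothing but ctime."""
    return cached_ctime != fresh_ctime


# A WRITE and two SETATTRs arriving within the same tick.
events_ns = [1_000_000_100, 1_000_001_900, 1_000_003_500]
stamps = [coarse_stamp(t) for t in events_ns]

# All three operations carry one timestamp, so a ctime-only comparison
# reports "no change" between any two of them.
same_tick_detected = ctime_says_changed(stamps[0], stamps[2])

# An operation in the next tick does get a new timestamp.
next_tick_detected = ctime_says_changed(stamps[0], coarse_stamp(1_004_000_050))
```

In this model same_tick_detected comes out False and
next_tick_detected comes out True, which is the shape of the
WRITE-plus-two-SETATTRs sequence described in the quoted text.
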
The fsx-based tests started failing only after 7b24dacf0840 ("NFS:
Another inode revalidation improvement") was applied to the client.
So until 7b24dacf0840, things worked despite poor server-side
timestamp resolution.

In addition, it's not terribly sensible that the client should
ignore changes that it made itself simply because the ctime on
the server didn't change. m/ctime has been more or less a hint
since day one, used to detect possible changes by _other_ clients.

Here, the client is doing a SETATTR, then throwing away the
server-returned attributes and presenting a stale file size from
its own cache to an application. That smells awfully like a client
regression to me.

In any event, as I said above, I'm not exactly sure how the client
is becoming confused, so this is not yet a rigorous root-cause
analysis. I was simply responding to your question about file
metadata updates without a ctime change. Yes, that is happening,
but apparently that is actually a pretty normal situation.

--
Chuck Lever