I'll just bump this once before letting it slip into the ether. For anyone who wants to try reproducing this, a rough sketch of the test loop is at the bottom of this mail.

Matt Pallissard

On 2020-02-27T08:28:43, Pallissard, Matthew wrote:
>
> Forgive me if this is the wrong list.
>
> Ok, I have this super infrequent data corruption on write that seems to be limited to NFS v3 async mounts. I have not tested NFS v4 yet. I _think_ I've narrowed it down to kernels X in the range 5.1.4 <= X < 5.5.0 (the lower bound may be earlier). I had some users report random data corruption, and a bit of testing shows that it's reproducible and the corruption is nearly identical every time.
>
> I'd like to get to the bottom of this so I can guarantee that a kernel upgrade will resolve the issue.
>
> What winds up happening is that every several hundred GiB[ish] we wind up with the first half of a 64-bit segment corrupted. Here is some example output from a test. My test writes a few GiB, alternating between 64 bits of `0`'s and 64 bits of `1`'s, then reads the file back in and checks the contents. Re-reading the file shows that it's corrupted on write, not read.
>
> > 2020-02-14 11:04:34 crit found mis-match on word segment 11911168 / 33554432!
> > 2020-02-14 11:04:34 crit found mis-match on byte 7, 188 != 255
> > 2020-02-14 11:04:34 crit found mis-match on byte 6, 0 != 255
> > 2020-02-14 11:04:34 crit found mis-match on byte 5, 16 != 255
> > 2020-02-14 11:04:34 crit found mis-match on byte 4, 128 != 255
> > 2020-02-14 11:04:34 crit 1011110000000000000100001000000011111111111111111111111111111111
>
> > 2020-02-14 13:38:11 crit found mis-match on word segment 1982464 / 33554432!
> > 2020-02-14 13:38:11 crit found mis-match on byte 7, 188 != 255
> > 2020-02-14 13:38:11 crit found mis-match on byte 6, 0 != 255
> > 2020-02-14 13:38:11 crit found mis-match on byte 5, 16 != 255
> > 2020-02-14 13:38:11 crit found mis-match on byte 4, 128 != 255
> > 2020-02-14 13:38:11 crit 1011110000000000000100001000000011111111111111111111111111111111
>
> Knowns:
>
> * does not appear to happen on a CentOS/EL 3.10 series kernel
>
> * does not appear to happen on a 5.5 series kernel
>   * I'm re-running all my tests now to confirm this.
>
> * not hardware dependent
>
> * not processor dependent
>   * I tested 3 different Intel processors
>
> * appears to only happen on NFS v3 async mounts
>   * local disk and `-o sync` NFS v3 mounts have been tested
>
> * it happens on random 64-bit segments
>
> * it's *always* the same 4 bytes that are corrupted
>
> * while often identical, the corrupted bytes are not always identical
>   * the identical corruption pattern can appear on separate computers
>
> * it's *always* on words that are written with `1`'s <- this is the part I find most interesting
>
> * whether or not I explicitly call `fflush` and `sync` has no effect on the results
>
> * usually takes ~80-2000 GiB of writes to reproduce, sometimes higher or lower, but it's infrequent
>   * I've been writing 2 GiB files
>   * sometimes I never hit the corruption case
>
> * I've yet to see more than one corrupted segment in a file
>
>
> A little bit about the build/run environments and the hardware:
>
> CentOS 7
> CentOS glibc 2.17
> clang 9 / lld
> Dell PowerEdge R620
> Dell PowerEdge C6320
> Dell PowerEdge C6420
> Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz
> Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz
> Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
>
> * I did compile locally on every box. I also tested every compiled binary on every box. It didn't seem to affect the results.
> * I don't have a tcpdump of this yet. I'm hoping to get that started before the end of the week.
> * I read and write to the same file every time, unlinking it before writing again
> * I have not tried dropping the cache between any of the steps.
> * I have engaged our storage vendor to see what they have to say. They're pretty good at getting useful metrics and insight, so if there is anything I should have them gather server-side, please let me know.
>
>
> If anyone has any insight or additional testing I can perform, I would *greatly* appreciate it. I would be thrilled if this turned out to be some dumb configuration option or other operational thing performed incorrectly.
>
>
> Thank you for your time.
>
> Matt Pallissard
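
In case anyone wants to poke at this in the meantime, the test boils down to something like the sketch below. This is not the actual test program, just a minimal illustration of the write/verify cycle described above; the file name, file size, and messages are placeholders.

/* Sketch of the write/verify test: write a file of alternating 64-bit
 * words of all-0s and all-1s, then read it back and report the first
 * mismatching word and its bytes (byte 7 = most significant here). */
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1] : "pattern.dat";              /* placeholder name */
    const size_t words = 2ULL * 1024 * 1024 * 1024 / sizeof(uint64_t);  /* 2 GiB file */
    const uint64_t pattern[2] = { 0ULL, ~0ULL };                        /* 64 bits of 0s / 1s */

    unlink(path);                          /* fresh file each run, as in the report */
    FILE *f = fopen(path, "w");
    if (!f) { perror("fopen(write)"); return 1; }
    for (size_t i = 0; i < words; i++)
        if (fwrite(&pattern[i & 1], sizeof(uint64_t), 1, f) != 1) {
            perror("fwrite"); return 1;
        }
    fflush(f);                             /* makes no difference to the corruption */
    fclose(f);

    f = fopen(path, "r");
    if (!f) { perror("fopen(read)"); return 1; }
    for (size_t i = 0; i < words; i++) {
        uint64_t word;
        if (fread(&word, sizeof(uint64_t), 1, f) != 1) {
            perror("fread"); return 1;
        }
        if (word != pattern[i & 1]) {
            fprintf(stderr, "found mis-match on word segment %zu / %zu!\n", i, words);
            for (int b = 7; b >= 0; b--) {
                unsigned got  = (unsigned)((word >> (b * 8)) & 0xff);
                unsigned want = (unsigned)((pattern[i & 1] >> (b * 8)) & 0xff);
                if (got != want)
                    fprintf(stderr, "found mis-match on byte %d, %u != %u\n", b, got, want);
            }
            fclose(f);
            return 2;
        }
    }
    fclose(f);
    return 0;
}

Run it in a loop against the async NFS v3 mount until it exits non-zero; as noted above, it usually takes somewhere around 80-2000 GiB of writes before a bad word shows up.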