On Thu, Mar 18, 2021 at 02:16:13PM -0400, Eric Whitney wrote:
> As mentioned in today's ext4 concall, I've seen generic/418 fail from time to
> time when run on 5.12-rc3 and 5.12-rc1 kernels.  This first occurred when
> running the 1k test case using kvm-xfstests.  I was then able to bisect the
> failure to a patch landed in the -rc1 merge window:
>
> (bd8a1f3655a7) mm/filemap: support readpage splitting a page
>
> Typical test output resulting from a failure looks like:
>
>  QA output created by 418
> +cmpbuf: offset 0: Expected: 0x1, got 0x0
> +[6:0] FAIL - comparison failed, offset 3072
> +diotest -w -b 512 -n 8 -i 4 failed at loop 0
>  Silence is golden
>  ...
>
> I've also been able to reproduce the failure on -rc3 in the 4k test case as
> well.  The failure frequency there was 10 out of 100 runs.  It was anywhere
> from 2 to 8 failures out of 100 runs in the 1k case.

FWIW, testing on a kernel which is -rc2 based (ext4.git's tip), I
wasn't able to see a failure using gce-xfstests with the ext4/4k,
ext4/1k, and xfs/1k test scenarios.  This may be because of the I/O
timing for the persistent disk block device in GCE, or differences in
the number of CPUs or amount of memory available --- or in the kernel
configuration that was used to build it.

I'm currently retrying with -rc3, with and without the kernel debug
configs, to see if that makes any difference...

					- Ted
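
For a rough sense of how much a clean run says about an intermittent failure like this one, the failure rates quoted above (2 to 10 failures per 100 runs) can be plugged into a simple calculation. This is only a sketch: it assumes each run fails independently with a fixed probability, which may not hold if the failure depends on I/O timing or machine configuration, and the function names are made up for illustration.

```python
import math


def prob_no_failures(fail_rate: float, runs: int) -> float:
    """Chance that `runs` runs all pass on an affected kernel.

    Assumes (hypothetically) that runs are independent and share the
    same per-run failure probability `fail_rate`.
    """
    return (1.0 - fail_rate) ** runs


def runs_needed(fail_rate: float, confidence: float) -> int:
    """Smallest number of runs after which an affected kernel would have
    failed at least once with probability >= `confidence`."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - fail_rate))


if __name__ == "__main__":
    # Per-run failure rates taken from the report: 2/100, 8/100, 10/100.
    for rate in (0.02, 0.08, 0.10):
        print(
            f"rate={rate:.2f}: "
            f"P(100 clean runs)={prob_no_failures(rate, 100):.4f}, "
            f"runs for 95% confidence={runs_needed(rate, 0.95)}"
        )
```

Under those assumptions, 100 clean runs is strong evidence against a 10% failure rate but much weaker evidence against a 2% rate, so a setup that never reproduces the failure may need well over 100 runs before the result means much.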