On Sat, May 19, 2018 at 09:09:46AM -0400, Jeff Layton wrote:
> On Fri, 2018-05-18 at 18:50 -0400, Theodore Y. Ts'o wrote:
> > Hi Matthew,
> >
> > Commit b4678df184b: "errseq: Always report a writeback error once"
> > appears to be causing xfstests regressions.  For ext4, running
> > "gce-xfstests -c 4k -g auto" will result in reliable shared/298
> > failures which go away if I revert b4678df184b.
> >
> > Darrick has also reported occasional generic/047 failures, which I
> > have seen at least once as well.  I believe the two are linked, because
> > after instrumenting mke2fs in shared/298, the failure is happening
> > after creating a new 300 MB file:
> >
> >     dd if=/dev/zero of=$img_file bs=1M count=300 &> /dev/null
> >
> > creating a new loop device
> >
> >     loop_dev=$(_create_loop_device $img_file)
> >
> > ... and then run mke2fs on that loop device.
> >
> > The instrumentation of mke2fs shows that it is the first fsync() on
> > /dev/loop0 (in lib/ext2fs/closefs.c) which is failing with EIO.
> >
> > I haven't had a chance to really drill down on it, but I think what is
> > going on is there is some former test which exercises an error path
> > (using dm_error, or some such), and somehow the errseq_t for the loop
> > device isn't getting reset, or the inode for the underlying backing
> > file had an uninitialized errseq_t.
> >
> > Can you take a closer look at this?
> >
> > Thanks,
> >
> >                                         - Ted
> >
>
> Thanks Ted. I'm not that familiar with the loopdev code, but after
> giving it a quick look, I suspect that you're correct. We probably need
> to do something like reset the loop device's bd_inode->i_mapping->wb_err
> back to zero when we detach the file that backs it.
>
> I wonder if we could roll a test that would do:
>
> create a scratch fs on a dm-error dev with a file on it
> set up a loop device on that file
> have the backing device of the scratch file throw errors
> write to the device
> detach loop device
> clear dm-error condition
> delete file and recreate it
> attach same loop device to new file
> fsync loop device
>
> My suspicion is that that last fsync would throw an error now and it
> wouldn't have before.

I /think/ it's because inode_init_always doesn't clear mapping->wb_err
(even though it clears mapping->flags) when recycling struct inodes.

Will send patch shortly.

--D

> --
> Jeff Layton <jlayton@xxxxxxxxxx>
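
To make the suspected failure mode concrete, here is a toy userspace
model of it. This is only a sketch, not kernel code: every toy_* name,
the TOY_SEEN bit and the struct layout are made up for illustration, and
the sample/check logic is a heavily simplified guess at what errseq does
after b4678df184b (an error nobody has seen yet is handed to the next
opener). It only shows why an init routine that resets flags but leaves
wb_err alone would let a stale error surface on the first fsync of a
recycled object.

```c
#include <errno.h>

/* Toy stand-ins for illustration only; NOT the kernel definitions. */
#define TOY_SEEN       0x1000u  /* hypothetical "somebody saw this error" bit */
#define TOY_ERRNO_MASK 0x0fffu  /* low bits hold the errno value */

typedef unsigned int toy_errseq_t;

struct toy_mapping {
	unsigned long flags;
	toy_errseq_t wb_err;
};

/* Record a writeback error; the SEEN bit starts out clear. */
static void toy_set_err(struct toy_mapping *m, int err)
{
	m->wb_err = (toy_errseq_t)err & TOY_ERRNO_MASK;
}

/*
 * Sample at open time. If the stored error has never been seen, return 0
 * so that this opener will be the one to report it later.
 */
static toy_errseq_t toy_sample(const struct toy_mapping *m)
{
	return (m->wb_err & TOY_SEEN) ? m->wb_err : 0;
}

/* fsync-time check: report an error if the value changed since the sample. */
static int toy_check_and_advance(struct toy_mapping *m, toy_errseq_t *since)
{
	if (m->wb_err == *since)
		return 0;
	m->wb_err |= TOY_SEEN;
	*since = m->wb_err;
	return -(int)(m->wb_err & TOY_ERRNO_MASK);
}

/* Suspected behaviour: flags are reset on reuse, wb_err is left alone. */
static void toy_init_buggy(struct toy_mapping *m)
{
	m->flags = 0;
}

/* Assumed shape of a fix: also reset wb_err on reuse. */
static void toy_init_fixed(struct toy_mapping *m)
{
	m->flags = 0;
	m->wb_err = 0;
}

/* An earlier user leaves an unreported EIO, then the object is recycled. */
static int toy_reuse_and_fsync(void (*init)(struct toy_mapping *))
{
	struct toy_mapping m = { 0, 0 };
	toy_errseq_t since;

	toy_set_err(&m, EIO);   /* error from an earlier, unrelated test */
	init(&m);               /* object is recycled for a new user */
	since = toy_sample(&m); /* new opener samples the error state */
	return toy_check_and_advance(&m, &since); /* first fsync() */
}
```

In this model, toy_reuse_and_fsync(toy_init_buggy) returns -EIO even
though the new user never saw a failed write, while
toy_reuse_and_fsync(toy_init_fixed) returns 0.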