Re: [PATCH] dm-log-writes: invalidate the bdev's for both of our devices

On Tue, Nov 28, 2017 at 10:40:24PM +0200, Amir Goldstein wrote:
> On Tue, Nov 28, 2017 at 9:29 PM, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> > On Tue, Nov 28, 2017 at 7:30 PM, Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:
> >> From: Josef Bacik <jbacik@xxxxxx>
> >>
> >> Amir noticed that sometimes the xfstests using dm-log-writes would fail
> >> randomly but would work fine after trying again manually.  This is
> >> because dm-log-writes writes directly to the device, but the log replay
> >> tools read and write via the block device page cache.  Sometimes this
> >> resulted in stale data being in the block device's page cache which
> >> would result in random failures.  To handle this simply invalidate the
> >> block device page cache on destruction so any replay of the log device
> >> that follows will be forced to read the new real contents.
> >>
> >> Reported-and-tested-by: Amir Goldstein <amir73il@xxxxxxxxx>
> >
> > I'm fine with the Reported-by, but let's wait a while with this patch so
> > I have more time to torture it.
> > The incidents I got even before the patch did not happen more than
> > a handful of times after running for a few days, so I need some more
> > days to validate the fix.
> > I had already sent you some weird output. Let's see what else comes
> > along.
> >
> 
> Sorry, no cigar.
> Another run just completed with Malformed log and corrupted fs
> 
> The _check_scratch_fs that fails is the one right after _log_writes_remove,
> just like in the report that I sent before this patch,
> and the LOGWRITES_DEV itself has a malformed entry before the "end" mark,
> or even before the last fsync mark:
> 
> ./src/log-writes/replay-log -v --log $LOGWRITES_DEV --find --end-mark
> testfile1.mark17
> Malformed entry @112134
> 
> For what it's worth, I am testing on spinning disks, 100G scratch dev.
> Right now, I zoomed in on the following fsx seeds that managed to fail the test
> a few times already, but in different ways, so I'm not sure the seeds are more
> than voodoo:
> seeds=(4597 4598 4599 4600)
> 
> I'll start running the same test but with fsx running on test partition, just
> to get the feel for running the same fsx threads on bare xfs.
> 
> Any other ideas?
> 

Is there anything special about your devices?  Are they 4k drives?  The corrupt
log is not awesome.  Was it still corrupt after the test bailed out?  Thanks,

Josef
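
One way to answer the "4k drives" question is to read the sector sizes the
kernel reports under sysfs.  The sketch below is not part of the patch or of
xfstests; the function names and the "sdx" device name are made up for
illustration.  It assumes the standard Linux sysfs attributes
queue/logical_block_size and queue/physical_block_size, and the 512n / 512e /
4Kn labels are the usual industry shorthand rather than anything the kernel
prints.

```python
from pathlib import Path


def read_block_sizes(dev_name, sysfs_root="/sys/block"):
    """Return (logical, physical) block sizes in bytes for a block device.

    dev_name is the kernel name of the whole disk, e.g. "sda" (hypothetical).
    sysfs_root is a parameter only so the function can be exercised against
    a fake directory tree.
    """
    queue = Path(sysfs_root) / dev_name / "queue"
    logical = int((queue / "logical_block_size").read_text())
    physical = int((queue / "physical_block_size").read_text())
    return logical, physical


def describe_sector_format(logical, physical):
    """Map a (logical, physical) size pair to a common shorthand label."""
    if logical == 4096:
        return "4Kn"    # 4k native: 4096-byte logical sectors
    if logical == 512 and physical == 4096:
        return "512e"   # 4k physical sectors, 512-byte emulation
    if logical == 512 and physical == 512:
        return "512n"   # traditional 512-byte sectors
    return "other"
```

Something like describe_sector_format(*read_block_sizes("sdx")) could then be
run for the disks backing the scratch and log devices, with "sdx" replaced by
the real disk name.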


