Re: [PATCH v2] generic: disable dmlogwrites tests on XFS

On Mon, Aug 31, 2020 at 4:37 PM Brian Foster <bfoster@xxxxxxxxxx> wrote:
>
> On Sat, Aug 29, 2020 at 07:48:50AM +0100, Christoph Hellwig wrote:
> > On Thu, Aug 27, 2020 at 10:53:29AM -0400, Brian Foster wrote:
> > > Several generic fstests use dm-log-writes to test the filesystem for
> > > consistency at various crash recovery points. dm-log-writes and the
> > > associated replay mechanism rely on discard to clear stale blocks
> > > when moving to various points in time of the fs. If the storage
> > > doesn't provide discard zeroing or the discard requests exceed the
> > > hardcoded maximum (128MB) of the fallback solution to physically
> > > write zeroes, stale blocks are left around in the target fs. This
> > > causes issues on XFS if recovery observes metadata from a future
> > > version of an fs that has been replayed to an older point in time.
> > > This corrupts the filesystem and leads to spurious test failures
> > > that are nontrivial to diagnose.
> > >
> > > Disable the generic dmlogwrites tests on XFS for the time being.
> > > This is intended to be a temporary change until a solution is found
> > > that allows these tests to predictably clear stale data while still
> > > allowing them to run in a reasonable amount of time.
> >
> > As said in the other discussion I don't think this is correct.  The
> > intent of the tests is to ensure the data can't be read.  You just
> > happen to trigger over that with XFS, but it also means that tests
> > don't work correctly on other file systems in that configuration.
> >
>
> Yes, but the goal of this patch is not to completely fix the dmlogwrites
> infrastructure and set of tests. The goal is to disable a subset of
> tests that are known to produce spurious corruptions on XFS until that
> issue can be addressed, so it doesn't result in continued bug reports in
> the meantime. I don't run these tests routinely on other fs', so it's
> not really my place to decide that the tradeoff between this problem and
> the ability of the test to reproduce legitimate bugs justifies disabling
> the test on those configs.
>

Brian,

Let's not take this course, please.
Please post patches v1 2/4-4/4 without patch v1 1/4.
The only objection was to patch 1/4, and it is not strictly needed
to solve the problem you care about.

I had a *concern* about patches 2-4, but I can live with that
concern, and that is certainly preferable to disabling the tests.

I can follow up with fixing the dmlogwrites common helpers
later when I get the time, so they do not rely on discard for
correctness of replay.

As I wrote, all it takes is to issue an explicit zero/punch command
at the beginning of the replay helpers. We just need to find the command
that works correctly and most efficiently with thinp.
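
To illustrate the idea only (this is not the fstests helper, and it is
not the command we would end up using on thinp), here is a minimal
Python sketch of "explicitly zero the range before replay" instead of
trusting discard to do it. The function name zero_range, its
parameters and the 1 MiB chunk size are all made up for the example,
and it simply writes zeroes, which is the slow but always-correct
fallback:

```python
import os

CHUNK = 1 << 20  # 1 MiB per write; arbitrary value chosen for this sketch


def zero_range(path, offset, length, chunk=CHUNK):
    """Overwrite [offset, offset + length) of `path` with zero bytes.

    Hypothetical helper: `path` may be a regular file or a block device
    node. Nothing here depends on discard support in the storage.
    """
    zeros = bytes(chunk)  # a reusable buffer of `chunk` zero bytes
    fd = os.open(path, os.O_WRONLY)
    try:
        done = 0
        while done < length:
            n = min(chunk, length - done)
            written = os.pwrite(fd, zeros[:n], offset + done)
            if written == 0:
                raise OSError("short write while zeroing range")
            done += written
        os.fsync(fd)  # make sure the zeroes reach stable storage
    finally:
        os.close(fd)
    return done
```

A replay helper would call something like this on the replay target
before replaying the log up to the chosen mark, so that no blocks from
a later point in time can survive. Writing every block is of course
the expensive option; the open question is still which zero/punch
command gives the same guarantee cheaply on thinp.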

If you have the time to do that (since I believe you already tested
some commands) that would be great. Otherwise, I'll do that later.

Thanks,
Amir.



