On Thu, Aug 05, 2021 at 03:48:36PM +0000, Konstantin Komarov wrote:
> > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> > Sent: Wednesday, August 4, 2021 4:04 AM
> > To: Theodore Ts'o <tytso@xxxxxxx>
> > Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>; Matthew Wilcox
> > <willy@xxxxxxxxxxxxx>; Leonidas P. Papadakos <papadakospan@xxxxxxxxx>;
> > Konstantin Komarov <almaz.alexandrovich@xxxxxxxxxxxxxxxxxxxx>;
> > zajec5@xxxxxxxxx; Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>;
> > Hans de Goede <hdegoede@xxxxxxxxxx>; linux-fsdevel
> > <linux-fsdevel@xxxxxxxxxxxxxxx>; Linux Kernel Mailing List
> > <linux-kernel@xxxxxxxxxxxxxxx>; Al Viro <viro@xxxxxxxxxxxxxxxxxx>
> > Subject: Re: [GIT PULL] vboxsf fixes for 5.14-1
> >
> > On Tue, Aug 03, 2021 at 08:49:28PM -0400, Theodore Ts'o wrote:
> > > On Tue, Aug 03, 2021 at 05:10:22PM -0700, Linus Torvalds wrote:
> > > > The user-space FUSE thing does indeed work reasonably well.
> > > >
> > > > It performs horribly badly if you care about things like that,
> > > > though.
> > > >
> > > > In fact, your own numbers kind of show that:
> > > >
> > > > ntfs/default: 670 tests, 55 failures, 211 skipped, 34783 seconds
> > > > ntfs3/default: 664 tests, 67 failures, 206 skipped, 8106 seconds
> > > >
> > > > and that's kind of the point of ntfs3.
> > >
> > > Sure, although if you run fsstress in parallel ntfs3 will lock up
> > > the system hard, and it has at least one lockdep deadlock
> > > complaint. It's not up to me, but personally, I'd feel better if
> > > *someone* at Paragon Software responded to Darrick's and my
> > > queries about their quality assurance, and/or made commitments
> > > that they would at least *try* to fix the problems that about 5
> > > minutes of testing using fstests turned up trivially.
> >
> > <cough> Yes, my aim was to gauge their interest in actively QAing
> > the driver's current problems so that it doesn't become one of the
> > shabby Linux filesystem drivers, like <cough>ntfs.
> >
> > Note I didn't even ask for a particular percentage of passing tests,
> > because I already know that non-Unix filesystems fail the tests that
> > look for the more Unix-specific behaviors.
> >
> > I really only wanted them to tell /us/ what the baseline is. IMHO
> > the silence from them is a lot more telling. Both generic/013 and
> > generic/475 are basic "try to create files and read and write data
> > to them" exercisers; failing those is a red flag.
>
> Hi Darrick and Theodore! First of all, apologies for the silence on
> your questions. Let me clarify and summarize the QA topic for you.
>
> The main thing to outline is that we have a number of autotests
> executed against the ntfs3 code. More specifically, we are using
> TeamCity as our CI tool, which runs the autotests against each commit
> to the ntfs3 codebase.
>
> The autotests are divided into the usual "promotion" levels: L0, L1,
> L2, ranging from the shortest "smoke" set (L0) to the longest set
> (L2). We need this division to cover the ntfs3 functionality with
> tests within a given amount of time (the feedback loop for L0 is
> minutes, while for L2 it is up to 24 hours).

Sounds comparable to my setup, which has these tiers:

fstests -g quick (~45 minutes) on fast ssds
fstests -g all (~3 hours) on fast ssds
fstests -g all (~12 hours) on slow(er) cheap(er) cloud storage
fstests -g long_soak (~7 days) on aging ssds

(There's also a fifth tier which spawns dozens of VMs to fuzz test, but
I don't have enough spare time to run that and triage the results on a
regular basis.)
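[Editor's note: the tier list above can be sketched as a tiny wrapper.
The group names (quick, all, long_soak) and the time estimates come
from the mail itself; the tier_cmd function, the tier labels, and the
dry-run structure are hypothetical, purely for illustration.]

```shell
# Hypothetical sketch of the tiered fstests runs described above.
# A real harness would cd into an fstests checkout and execute the
# printed command against configured TEST_DEV/SCRATCH_DEV devices;
# here the command is only printed (dry run).
tier_cmd() {
    case "$1" in
        quick) echo "./check -g quick"     ;;  # ~45 minutes on fast ssds
        all)   echo "./check -g all"       ;;  # ~3-12 hours, storage-dependent
        soak)  echo "./check -g long_soak" ;;  # ~7 days on aging ssds
        *)     echo "unknown tier: $1" >&2; return 1 ;;
    esac
}

tier_cmd quick
```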
> As for the suites we are using, it is a mix of open/well-known
> suites:
> - xfstests, ltp, the pjd suite, fsx, dirstress, fstorture - those are
>   the known utilities/suites
> and a number of internal autotests: tests developed to cover various
> parts of the fs specs, regression autotests which are added to the
> infrastructure after bugfixes, and autotests written to test the
> driver's operation on various data sets.
>
> This approach has been settled at Paragon for years, and ntfs3, from
> the first line of code written, has been developed this way. You may
> refer to the artifacts linked below, where the progress/coverage over
> the last year is shown by the autotest results:
>
> the 27th patch-series code (July 2021):
> https://dl.paragon-software.com/ntfs3/p27_tests.tar
> 25th (March 2021):
> https://dl.paragon-software.com/ntfs3/p25_tests.tar
> 2nd (August 2020):
> https://dl.paragon-software.com/ntfs3/p2_tests.tar
>
> Those are results for ntfs3 run within 'linux-next' (the most recent
> one as of each test run's start date). As may be observed, we never
> skipped the "tests day" :)
>
> A note should be made on xfstests specifically. We have been using
> this suite as a part of our autotests for several years already.
> However, the suite originated for Linux-native file systems, and a
> lot of cases are not applicable to NTFS. This is one of the reasons
> why some of the "red-flag" failures are there (e.g. generic/475) -
> they were excluded at some point in time and we missed enabling them
> back when it was time :)

generic/019, generic/317, generic/476 (and generic/521 and 522) are
supposed to be stress exercisers of standard functionality (read,
write, mkdir, creat, unlink), which means they ought to pass on any
filesystem. Hmm, we /don't/ have a tag for these generic exercisers.
Maybe we should; I'll think about that.

(FWIW a minor correction: I meant to reference generic/476 and not
475, because 475 is a looping test of crash recovery.
475 is notorious for intermittent failures even on ext4/btrfs/xfs.)

> Thank you all for this effort to run and look closer at our code; in
> the next patchset, the 91, 317 and 475 failures should be resolved.
> And now we are looking at the other excluded tests to find more such
> cases.

Ok, thank you!

> Hope this will resolve some of your concerns.

It does. :)

--D

> > --D
> >
> > > I can even give them patches and configs to make it trivially
> > > easy for them to run fstests using KVM or GCE....
> > >
> > > - Ted