On Thu, Aug 05, 2021 at 03:48:36PM +0000, Konstantin Komarov wrote:
> > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> > Sent: Wednesday, August 4, 2021 4:04 AM
> > To: Theodore Ts'o <tytso@xxxxxxx>
> > Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>; Matthew Wilcox
> > <willy@xxxxxxxxxxxxx>; Leonidas P. Papadakos <papadakospan@xxxxxxxxx>;
> > Konstantin Komarov <almaz.alexandrovich@xxxxxxxxxxxxxxxxxxxx>;
> > zajec5@xxxxxxxxx; Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>;
> > Hans de Goede <hdegoede@xxxxxxxxxx>; linux-fsdevel
> > <linux-fsdevel@xxxxxxxxxxxxxxx>; Linux Kernel Mailing List
> > <linux-kernel@xxxxxxxxxxxxxxx>; Al Viro <viro@xxxxxxxxxxxxxxxxxx>
> > Subject: Re: [GIT PULL] vboxsf fixes for 5.14-1
> >
> > On Tue, Aug 03, 2021 at 08:49:28PM -0400, Theodore Ts'o wrote:
> > > On Tue, Aug 03, 2021 at 05:10:22PM -0700, Linus Torvalds wrote:
> > > > The user-space FUSE thing does indeed work reasonably well.
> > > >
> > > > It performs horribly badly if you care about things like that,
> > > > though.
> > > >
> > > > In fact, your own numbers kind of show that:
> > > >
> > > > ntfs/default: 670 tests, 55 failures, 211 skipped, 34783 seconds
> > > > ntfs3/default: 664 tests, 67 failures, 206 skipped, 8106 seconds
> > > >
> > > > and that's kind of the point of ntfs3.
> > >
> > > Sure, although if you run fsstress in parallel ntfs3 will lock up
> > > the system hard, and it has at least one lockdep deadlock
> > > complaint. It's not up to me, but personally, I'd feel better if
> > > *someone* at Paragon Software responded to Darrick's and my
> > > queries about their quality assurance, and/or made commitments
> > > that they would at least *try* to fix the problems that about 5
> > > minutes of testing using fstests turned up trivially.
> >
> > <cough> Yes, my aim was to gauge their interest in actively QAing
> > the driver's current problems so that it doesn't become one of the
> > shabby Linux filesystem drivers, like <cough>ntfs.
> >
> > Note I didn't even ask for a particular percentage of passing tests,
> > because I already know that non-Unix filesystems fail the tests that
> > look for the more Unix-specific behaviors.
> >
> > I really only wanted them to tell /us/ what the baseline is. IMHO
> > the silence from them is a lot more telling. Both generic/013 and
> > generic/475 are basic "try to create files and read and write data
> > to them" exercisers; failing those is a red flag.
>
> Hi Darrick and Theodore! First of all, apologies for the silence on
> your questions. Let me clarify and summarize the QA topic for you.
>
> The main thing to outline is that we have a number of autotests
> executed against the ntfs3 code. More specifically, we are using
> TeamCity as our CI tool, which runs the autotests against each commit
> to the ntfs3 codebase.
>
> The autotests are divided into the usual "promotion" levels: L0, L1,
> L2, ranging from the shortest "smoke" set (L0) to the longest set
> (L2). We need this division to cover the ntfs3 functionality with
> tests within a given amount of time (the feedback loop for L0 is
> minutes, while for L2 it is up to 24 hours).

Sounds comparable to my setup, which has these tiers:

fstests -g quick (~45 minutes) on fast ssds
fstests -g all (~3 hours) on fast ssds
fstests -g all (~12 hours) on slow(er) cheap(er) cloud storage
fstests -g long_soak (~7 days) on aging ssds

(There's also a fifth tier which spawns dozens of VMs to fuzz test, but
I don't have enough spare time to run that and triage the results on a
regular basis.)
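[Editor's note: the tier list above can be sketched as a tiny wrapper.
The group names (quick, all, long_soak) and the time estimates come
from the mail itself; the tier_cmd function, the tier labels, and the
dry-run structure are hypothetical, purely for illustration.]

```shell
# Hypothetical sketch of the tiered fstests runs described above.
# A real harness would cd into an fstests checkout and execute the
# printed command against configured TEST_DEV/SCRATCH_DEV devices;
# here the command is only printed (dry run).
tier_cmd() {
    case "$1" in
        quick) echo "./check -g quick"     ;;  # ~45 minutes on fast ssds
        all)   echo "./check -g all"       ;;  # ~3-12 hours, storage-dependent
        soak)  echo "./check -g long_soak" ;;  # ~7 days on aging ssds
        *)     echo "unknown tier: $1" >&2; return 1 ;;
    esac
}

tier_cmd quick
```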
> As for the suites we are using, it is a mix of open/well-known
> suites:
> - xfstests, ltp, the pjd suite, fsx, dirstress, fstorture - those are
>   the known utilities/suites
> and a number of internal autotests: tests developed to cover various
> parts of the fs specs, regression autotests which are added to the
> infrastructure after bugfixes, and autotests written to test the
> driver's operation on various data sets.
>
> This approach has been settled at Paragon for years, and ntfs3, from
> the first line of code written, has been developed this way. You may
> refer to the artifacts linked below, where the progress/coverage over
> the last year is shown by the autotest results:
>
> the 27th patch-series code (July 2021):
> https://dl.paragon-software.com/ntfs3/p27_tests.tar
> 25th (March 2021):
> https://dl.paragon-software.com/ntfs3/p25_tests.tar
> 2nd (August 2020):
> https://dl.paragon-software.com/ntfs3/p2_tests.tar
>
> Those are results for ntfs3 run within 'linux-next' (the most recent
> one as of each test run's start date). As may be observed, we never
> skipped the "tests day" :)
>
> A note should be made on xfstests specifically. We have been using
> this suite as a part of our autotests for several years already.
> However, the suite originated for Linux-native file systems, and a
> lot of cases are not applicable to NTFS. This is one of the reasons
> why some of the "red-flag" failures are there (e.g. generic/475) -
> they were excluded at some point in time and we missed enabling them
> back when it was time :)

generic/019, generic/317, generic/476 (and generic/521 and 522) are
supposed to be stress exercisers of standard functionality (read,
write, mkdir, creat, unlink), which means they ought to pass on any
filesystem. Hmm, we /don't/ have a tag for these generic exercisers.
Maybe we should; I'll think about that.

(FWIW a minor correction: I meant to reference generic/476 and not
475, because 475 is a looping test of crash recovery.
475 is notorious for intermittent failures even on ext4/btrfs/xfs.)

> Thank you all for this effort to run and look closer at our code; in
> the next patchset, the 91, 317 and 475 failures should be resolved.
> And now we are looking at the other excluded tests to find more such
> cases.

Ok, thank you!

> Hope this will resolve some of your concerns.

It does. :)

--D

> > --D
> >
> > > I can even give them patches and configs to make it trivially
> > > easy for them to run fstests using KVM or GCE....
> > >
> > > - Ted