On Mon, Nov 20, 2023 at 10:27:33PM +0000, Mark Brown wrote:
> This is the sort of thing that kcidb (which Nikolai works on) is good at
> ingesting, I actually do push all my CI's test results into there
> already:
>
>    https://github.com/kernelci/kcidb/
>
> (the dashboard is down currently.)  A few other projects including the
> current KernelCI and RedHat's CKI push their data in there too, I'm sure
> Nikolai would be delighted to get more people pushing data in.  The goal
> is to merge this with the main KernelCI infrastructure, it's currently
> separate while people figure out the whole big data thing.

Looking at kcidb, it appears that it's using a JSON submission format.
Are there conversion scripts that take a KTAP test report, or a JUnit
XML test report?

> The KernelCI LF project is funding kcidb with precisely this goal for
> the reasons you outline, the data collection part seems to be relatively
> mature at this point but AIUI there's a bunch of open questions with the
> analysis and usage side, partly due to needing to find people to work on
> it.

Indeed, this is the super hard part.  Having looked at the kernelci web
site, its dashboard isn't particularly helpful for what I'm trying to
do, which is analyzing a single test run.
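(I'd guess the JUnit XML side of such a converter is not much code.  A
rough sketch of flattening a junit.xml file into per-test records that
could then be mapped onto kcidb's JSON test objects might look like the
following -- the field names here are my guesses, not the real kcidb
schema, so check kcidb's documentation before relying on them:

```python
# Sketch: flatten a JUnit XML report into a list of per-test
# dicts that could be mapped onto kcidb-style JSON test objects.
# The output field names ("path", "status", "duration", "origin")
# are guesses for illustration, not the actual kcidb schema.
import xml.etree.ElementTree as ET

def junit_to_tests(junit_xml, origin="my-ci"):
    root = ET.fromstring(junit_xml)
    tests = []
    for case in root.iter("testcase"):
        # JUnit marks outcomes with child elements; absence means pass.
        if case.find("failure") is not None:
            status = "FAIL"
        elif case.find("error") is not None:
            status = "ERROR"
        elif case.find("skipped") is not None:
            status = "SKIP"
        else:
            status = "PASS"
        tests.append({
            "origin": origin,
            "path": case.get("name", ""),      # e.g. "generic/051"
            "status": status,
            "duration": float(case.get("time", 0)),
        })
    return tests
```

The KTAP side would need an actual parser, but the shape of the output
would presumably be the same.)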
What I need is something more like this:

ext4/4k: 554 tests, 48 skipped, 4301 seconds
ext4/1k: 550 tests, 3 failures, 51 skipped, 6739 seconds
  Failures: generic/051 generic/475 generic/476
ext4/ext3: 546 tests, 138 skipped, 4239 seconds
ext4/encrypt: 532 tests, 3 failures, 159 skipped, 3218 seconds
  Failures: generic/681 generic/682 generic/691
ext4/nojournal: 549 tests, 3 failures, 118 skipped, 4477 seconds
  Failures: ext4/301 ext4/304 generic/455
ext4/ext3conv: 551 tests, 49 skipped, 4655 seconds
ext4/adv: 551 tests, 4 failures, 56 skipped, 4987 seconds
  Failures: generic/477 generic/506
  Flaky: generic/269: 40% (2/5)  generic/455: 40% (2/5)
ext4/dioread_nolock: 552 tests, 48 skipped, 4538 seconds
ext4/data_journal: 550 tests, 2 failures, 120 skipped, 4401 seconds
  Failures: generic/455 generic/484
ext4/bigalloc_4k: 526 tests, 53 skipped, 4537 seconds
ext4/bigalloc_1k: 526 tests, 61 skipped, 4847 seconds
ext4/dax: 541 tests, 1 failures, 152 skipped, 3069 seconds
  Flaky: generic/269: 60% (3/5)
Totals: 6592 tests, 1053 skipped, 72 failures, 0 errors, 50577s

... which summarizes 6,592 tests in 20 lines, and for any test that has
failed, we rerun it four more times, so we can get an indication of
whether a test is a hard failure or a flaky failure.  (I don't need to
see all of the tests that pass; it's the test failures or the test
flakes that are significant.)

And then when comparing multiple test runs, that's when I'm interested
in seeing which tests may have regressed, or which tests may have been
fixed, when going from version A to version B.

And right now, kernelci doesn't have any of that.  So it might be hard
to convince overloaded maintainers to upload test runs to kernelci when
they don't see any immediate benefit from uploading to the kernelci db.
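(The rerun bookkeeping behind those "Flaky:" lines is simple enough;
roughly the following, where run_test() is a stand-in for actually
invoking a single xfstests test and returning whether it passed:

```python
# Sketch of the rerun logic described above: after a test fails
# its first run, rerun it N more times and classify it as a hard
# failure (fails every run) or a flake (fails only some runs).
# run_test(test) is a hypothetical hook that returns True on pass.

def classify(test, run_test, reruns=4):
    results = [run_test(test)]          # initial run
    if results[0]:                      # passed the first time: done
        return ("pass", 0.0)
    results += [run_test(test) for _ in range(reruns)]
    fails = results.count(False)
    if fails == len(results):
        return ("failure", 1.0)         # hard failure: failed 5/5
    return ("flaky", fails / len(results))  # e.g. 2/5 -> "40% (2/5)"
```

A test that fails twice out of five runs comes back as ("flaky", 0.4),
which is where a line like "generic/269: 40% (2/5)" comes from.)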
There is a bit of a chicken-and-egg problem: without the test results
getting uploaded, it's hard to get the analysis functionality
implemented, and without the analysis features, it's hard to get
developers to upload the data.

That being said, a number of file system developers probably have
several years' worth of test results that we could give you.  I have
hundreds of junit.xml files, with information about the kernel version,
the version of xfstests, etc., that was used.  I'm happy to make samples
available for anyone who is interested.

Cheers,

						- Ted