On Fri, Mar 11, 2022 at 12:23:55AM -0500, Theodore Ts'o wrote:
> On Wed, Mar 09, 2022 at 10:57:24AM -0800, Luis Chamberlain wrote:
> > On Tue, Mar 08, 2022 at 02:06:57PM -0500, Sasha Levin wrote:
> > > What we can't do is invest significant time into doing the
> > > testing work ourselves for each and every subsystem in the
> > > kernel.
> >
> > I think this experience helps though; it gives you a better
> > appreciation for the concerns we have when merging any fix, and
> > for the effort and diligence required to ensure we don't regress.
> > I think the kernel-ci steady state goal takes this a bit further.
>
> Different communities seem to have different goals that they believe
> the stable kernels should be aiming for. Sure, if you never merge
> any fix, you can guarantee that there will be no regressions.
> However, the question is whether the result is a better quality
> kernel. For example, there is a recent change to XFS which fixes a
> security bug which allows an attacker to gain access to deleted
> data. How do you balance the tradeoff of "no regressions, ever",
> versus, "we'll leave a security bug in XFS which is fixed in
> mainline Linux, but we fear regressions so much that we won't even
> backport a single-line fix to the stable kernel?"

That patch should just be applied. Thanks for the heads up; I'll go
try to spin up some resources to test it if it has not been merged
already. And perhaps in such cases the KERNEL_CI_STEADY_STATE_GOAL
can be reduced.

> In my view, the service which Greg, Sasha and the other stable
> maintainers provide is super-valuable, and I am happy that ext4
> changes are automatically cherry-picked into the stable kernel.
> Have there been times when this has resulted in regressions in ext4
> for the stable kernel? Sure! It's only been a handful of times,
> though, and the number of bug fixes that users using stable kernels
> would _not_ have seen otherwise *far* outweighs the downsides of
> the occasional regressions (which get found and then reverted).

I think by now the community should know I'm probably one of the
biggest advocates of kernel automation, whether that be kernel
testing or kernel code generation... The reason I've started dabbling
in the automation side of testing is that the two go hand in hand.

So while I value the stable process, I think it should be respected
if some subsystems keep a higher threshold than others for testing /
review. The only way to move forward with enabling more automation
for kernel code integration is through better and improved kernel
test automation, and that is *exactly* why I've been working so hard
on that problem.
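As a concrete aside on the KERNEL_CI_STEADY_STATE_GOAL knob above:
since kdevops models it with kconfig, relaxing it for an urgent fix
is a one-line change in the generated .config. The exact symbol
spelling and default live in kdevops' Kconfig files; the values below
are made up for illustration:

  # Illustrative kdevops .config fragment; values are made up:
  CONFIG_KERNEL_CI_STEADY_STATE_GOAL=100
  # ... and for an urgent single-line security fix one might
  # temporarily settle for a lower goal, say:
  # CONFIG_KERNEL_CI_STEADY_STATE_GOAL=20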
> > Perhaps the one area that might interest folks is the test setup,
> > using loopback drives and truncated files; if you find holes in
> > this please let me know:
> >
> > https://github.com/mcgrof/kdevops/blob/master/docs/testing-with-loopback.md
> >
> > In my experience this setup just finds *more* issues, rather than
> > fewer, and in my experience as well none of the issues found were
> > bogus; they always led to real bugs:
> >
> > https://github.com/mcgrof/kdevops/blob/master/docs/seeing-more-issues.md
>
> Different storage devices --- Google Cloud Persistent Disks, versus
> single spindle HDD's, SSD's,

<-- Insert tons of variability requirements on test drives -->
<-- Insert tons of variability requirements on confidence in testing -->
<-- Insert tons of variability requirements on price / cost assessment -->
<-- Insert tons of variability requirements on business case -->

What you left out in terms of variability was that you use GCE; yes,
others will want to use AWS, OpenStack, etc. as well. So that's
another variability axis too.

What's the common theme here? Variability! And what is the most
respected language for modeling variability? Kconfig! That is why I
designed kdevops to embrace kconfig. It lets you test however you
want, using whatever test devices and whatever test criteria you
might have, on any cloud or local virtualization solution.

Yes, some of the variability knobs in kdevops are applicable only to
kdevops, but since I picked up kconfig I also adopted it to model
variability for fstests and blktests. It should be possible to move
that to fstests / blktests if we wanted to, and have kdevops just
use it.

And if you are thinking "why, shucks... I don't want to deal with the
complexity of integrating kconfig into a new project, that sounds
difficult": yes, I hear you. To help with that I've created a git
tree which can be used as a git subtree (note: different from the
stupid git submodules) to let you easily integrate kconfig into any
project with only a few lines of changes:

https://github.com/mcgrof/kconfig

Also, let's recall that just because you have your own test framework
does not mean we could not benefit from others testing our
filesystems on their own silly hardware at home as well. Yes, tons of
projects can be used which wrap fstests... but I never found one as
easy to use as compiling the kernel and running a few make commands.
So my goal was not just to address the variability aspect for fstests
and blktests, but also to enable the average user to easily help
test as well.

There is the concept of results too, and a possible way to share
them... but this is getting a bit off topic and I don't want to bore
people further. For the curious, I've appended a few rough sketches
of the above as a P.S. below.

  Luis
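P.S. A few rough sketches to make the above concrete. First, the
loopback / truncated-files setup; the doc linked above is the
authoritative version, and the sizes and paths here are just
placeholders:

  # Create a sparse file; it consumes almost no real disk space
  # until the filesystem actually writes to it:
  truncate -s 20G /media/truncated/xfs-disk0.img

  # Attach it to the first free loop device; losetup prints the
  # device name it picked, e.g. /dev/loop0:
  sudo losetup -f --show /media/truncated/xfs-disk0.img

  # From here /dev/loop0 can be handed to fstests as a TEST_DEV or
  # SCRATCH_DEV like any other block device:
  sudo mkfs.xfs -f /dev/loop0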
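Second, the sort of variability kconfig models well. The symbol names
below are invented for illustration only; the real ones live in
kdevops' Kconfig files:

  # Hypothetical .config fragment:
  CONFIG_TERRAFORM_GCE=y           # bring up test nodes on GCE
  # CONFIG_TERRAFORM_AWS is not set  ... or AWS, OpenStack, etc.
  CONFIG_FSTESTS_XFS=y             # which filesystem to hammer on
  CONFIG_FSTESTS_USE_LOOPBACK=y    # loopback drives as test devices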
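Third, pulling the kconfig tree above into a project as a git
subtree; the prefix is whatever path you like, and I'm assuming the
default branch is master:

  # One-time import:
  git subtree add --prefix=scripts/kconfig \
      https://github.com/mcgrof/kconfig master --squash

  # Picking up later updates:
  git subtree pull --prefix=scripts/kconfig \
      https://github.com/mcgrof/kconfig master --squash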
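And finally, what I mean by "a few make commands". This is the rough
kdevops flow from memory of its docs; target names may drift over
time, so check the README:

  make menuconfig        # pick virt/cloud provider, filesystem, devices
  make                   # generate the config-driven bits
  make bringup           # provision the guests or cloud instances
  make linux             # build and install the kernel under test
  make fstests           # install fstests on the nodes
  make fstests-baseline  # run the configured baseline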