Re: [PATCH v2 1/2] check: disable HAVE_PRIVATENS by default

Dave Chinner <david@xxxxxxxxxxxxx> · Wed, 12 Mar 2025 08:19:25 +1100

On Fri, Mar 07, 2025 at 11:47:22PM +0800, Zorro Lang wrote:
> On Fri, Mar 07, 2025 at 07:56:28AM +1100, Dave Chinner wrote:
> > I'm talking about removing all the setsid and namspace bits that we
> > added to try to solve the need for pkill --parent. i.e. this series
> > of patches:
> > 
> > 949bdf8ea check: deprecate using process sessions to isolate test instances
> > 247ab01fa check: run tests in a private pid/mount namespace
> > 88d60f434 common: fix pkill by running test program in a separate session
> > c71349150 common/rc: hoist pkill to a helper function
> > 77548e606 common/rc: create a wrapper for the su command
> > 
> > Using private PID namespaces in check-parallel has solved the
> > original problem that required the pkill --parent usage and the
> > fsstress binary rename/copy workaround by wrapping process isolation
> > outside of each check instance. Hence check and the internal
> > check infrastructure doesn't need to ever care about process
> > isolation to solve this check-parallel problem anymore.
> 
> If "check" doesn't need these namespace isolation things, that will be good
> to me. That'll help to isolate the issues on check or on check-parallel.
> And we won't bring in regressions to check when try to fix check-parallel.
> 
> Now let me merge this patch at first, to avoid the regressions on btrfs.
> Then please feel free to send patches to remove/reduce the coupling of
> two testing ways (check and check-parallel:)

Sure.

FWIW, it's not necessary for you to test whether the merged version
of check-parallel runs correctly right now. If some other change
breaks it, I will fix it and send patches to fix it. I'm much more
concerned that changes made to support check-parallel don't break
check...

> > i.e. the isolation fix should have been made to check-parallel,
> > not to check.
> > 
> > It is not ideal to have to walk all this stuff back, but this is
> > what happens when an RFC-scope feature prototype is merged without
> > allowing adequate time for review or give the author a chance to
> > address at least one round of review comments and bug fixes before
> > it gets merged....
> 
> Sorry, I've learned from this. I thought I've given it lots of test,
> and fixed all bug reports at that time. And I asked several times
> if there's any objection about merging it in next release. As there's
> not objection from anyone in 2 weeks, so I decided to give it a chance,
> to save the super complicated rebasing workload...

Sure, not having anyone review a prototype after a couple of
weeks is pretty common - it does not mean it is ready for merge.
It typically means that the author will then post a newer, more
complete version when they are ready, and then maybe people will
look at it.

Hence my comment on where I wanted people to focus their review -
not on the check-parallel stuff, but on the generic infrastructure
changes (i.e. "get the process underway") because they were much more
important to get right than the check-parallel script.

IOWs, I was not asking anyone to bypass the normal cyclic
review-repost process - I was simply doing what ever experienced
developer does when posting a large patchset: guiding reviewers to
the important areas of the patch set to focus on first.

> Anyway, next time,
> for each complex patchset on infrastructure, I'll always give it 3+
> release cycles time. And never merge it if without 3 acks from xfs,
> ext4 and btrfs,

And that, IMO, is swinging too far the other way. If the code is
complete and ready to go and nobody will review it, then the
above rule makes it impossible for the author to get their code
merged.

now we have the opposite problem: people have to carry large
infrastructure changes for a long time because they cannot get
anyone to review it.

The normal process is to let review-and-repost cycles occur
naturally (e.g. every couple of weeks for a few versions), with the
author indicating what the important things to focus review on in
the patchset are. When that cycle reaches it's end, a request for
merge will be made (e.g. a pull request or a repost with all patches
containing RVBs). This shouldn't take months, and if the reviewers
focus on the things that the author asks them to look at first then
the important things end up being ready for merge earlier rather
than later.

IOWs, there is no need for placing arbitrary bars to jump over if the
process is allowed to follow it's natural path. Problems typically
only arise when the process is not followed, so there needs to be a
very good reason for stepping outside normal processes or having
arbitrary rules that nobody really knows when/if they might
trigger...

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx