Re: [PATCH 08/23] common: fix pkill by running test program in a separate session

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jan 28, 2025 at 07:13:13PM -0800, Darrick J. Wong wrote:
> On Wed, Jan 29, 2025 at 07:39:22AM +1100, Dave Chinner wrote:
> > On Mon, Jan 27, 2025 at 11:23:52PM -0800, Darrick J. Wong wrote:
> > > On Tue, Jan 28, 2025 at 03:34:50PM +1100, Dave Chinner wrote:
> > > > On Thu, Jan 23, 2025 at 12:16:50PM +1100, Dave Chinner wrote:
> > > > 4. /tmp is still shared across all runner instances so all the
> > > > 
> > > >    concurrent runners dump all their tmp files in /tmp. However, the
> > > >    runners no longer have unique PIDs (i.e. check runs as PID 3 in
> > > >    all runner instaces). This means using /tmp/tmp.$$ as the
> > > >    check/test temp file definition results is instant tmp file name
> > > >    collisions and random things in check and tests fail.  check and
> > > >    common/preamble have to be converted to use `mktemp` to provide
> > > >    unique tempfile name prefixes again.
> > > > 
> > > > 5. Don't forget to revert the parent /proc mount back to shared
> > > >    after check has finished running (or was aborted).
> > > > 
> > > > I think with this (current prototype patch below), we can use PID
> > > > namespaces rather than process session IDs for check-parallel safe
> > > > process management.
> > > > 
> > > > Thoughts?
> > > 
> > > Was about to go to bed, but can we also start a new mount namespace,
> > > create a private (or at least non-global) /tmp to put files into, and
> > > then each test instance is isolated from clobbering the /tmpfiles of
> > > other ./check instances *and* the rest of the system?
> > 
> > We probably can. I didn't want to go down that rat hole straight
> > away, because then I'd have to make a decision about what to mount
> > there. One thing at a time....
> > 
> > I suspect that I can just use a tmpfs filesystem for it - there's
> > heaps of memory available on my test machines and we don't use /tmp
> > to hold large files, so that should work fine for me.  However, I'm
> > a little concerned about what will happen when testing under memory
> > pressure situations if /tmp needs memory to operate correctly.
> > 
> > I'll have a look at what is needed for private tmpfs /tmp instances
> > to work - it should work just fine.
> > 
> > However, if check-parallel has taught me anything, it is that trying
> > to use "should work" features on a modern system tends to mean "this
> > is a poorly documented rat-hole that with many dead-ends that will
> > be explored before a working solution is found"...
> 
> <nod> I'm running an experiment overnight with the following patch to
> get rid of the session id grossness.  AFAICT it can also be used by
> check-parallel to start its subprocesses in separate pid namespaces,
> though I didn't investigate thoroughly.

I don't think check-parallel needs to start each check instance in
it's own PID namespace - it's the tests themselves that need the
isolation from each other.

However, the check instances require a private mount namespace, as
they mount and unmount test/scratch devices themselves and we do not
want other check instances seeing those mounts.

Hence I think the current check-parallel code doing mount namespace
isolation as it already does should work with this patch enabling
per-test process isolation inside check itself.

> I'm also not sure it's required for check-helper to unmount the /proc
> that it creates; merely exiting seems to clean everything up? <shrug>

Yeah, I think tearing down the mount namespace (i.e. exiting the
process that nsexec created) drops the last active reference to the
mounts inside the private namespace and so it gets torn down that
way.

So from my perspective, I think your check-helper namespace patch is
a good improvement and I'll build/fix anything I come across on top
of it. Once your series of fixes goes in, I'll rebase all the stuff
I've got on top it and go from there...

> I also tried using systemd-nspawn to run fstests inside a "container"
> but very quickly discovered that you can't pass block devices to the
> container which makes fstests pretty useless for testing real scsi
> devices. :/

Yet another dead-end in the poorly sign-posted rat-hole, eh?

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux