Re: [PATCH 08/23] common: fix pkill by running test program in a separate session

On Wed, Jan 29, 2025 at 05:06:52PM +1100, Dave Chinner wrote:
> On Tue, Jan 28, 2025 at 07:13:13PM -0800, Darrick J. Wong wrote:
> > On Wed, Jan 29, 2025 at 07:39:22AM +1100, Dave Chinner wrote:
> > > On Mon, Jan 27, 2025 at 11:23:52PM -0800, Darrick J. Wong wrote:
> > > > On Tue, Jan 28, 2025 at 03:34:50PM +1100, Dave Chinner wrote:
> > > > > On Thu, Jan 23, 2025 at 12:16:50PM +1100, Dave Chinner wrote:
> > > > > > 4. /tmp is still shared across all runner instances so all the
> > > > > >    concurrent runners dump all their tmp files in /tmp. However, the
> > > > > >    runners no longer have unique PIDs (i.e. check runs as PID 3 in
> > > > > >    all runner instances). This means using /tmp/tmp.$$ as the
> > > > > >    check/test temp file definition results in instant tmp file name
> > > > > >    collisions and random things in check and tests fail.  check and
> > > > > >    common/preamble have to be converted to use `mktemp` to provide
> > > > > >    unique tempfile name prefixes again.
> > > > > 
> > > > > 5. Don't forget to revert the parent /proc mount back to shared
> > > > >    after check has finished running (or was aborted).
> > > > > 
> > > > > I think with this (current prototype patch below), we can use PID
> > > > > namespaces rather than process session IDs for check-parallel safe
> > > > > process management.
> > > > > 
> > > > > Thoughts?
> > > > 
> > > > Was about to go to bed, but can we also start a new mount namespace,
> > > > create a private (or at least non-global) /tmp to put files into, and
> > > > then each test instance is isolated from clobbering the /tmpfiles of
> > > > other ./check instances *and* the rest of the system?
> > > 
> > > We probably can. I didn't want to go down that rat hole straight
> > > away, because then I'd have to make a decision about what to mount
> > > there. One thing at a time....
> > > 
> > > I suspect that I can just use a tmpfs filesystem for it - there's
> > > heaps of memory available on my test machines and we don't use /tmp
> > > to hold large files, so that should work fine for me.  However, I'm
> > > a little concerned about what will happen when testing under memory
> > > pressure situations if /tmp needs memory to operate correctly.
> > > 
> > > I'll have a look at what is needed for private tmpfs /tmp instances
> > > to work - it should work just fine.
> > > 
> > > However, if check-parallel has taught me anything, it is that trying
> > > to use "should work" features on a modern system tends to mean "this
> > > is a poorly documented rat-hole with many dead-ends that will
> > > be explored before a working solution is found"...
> > 
> > <nod> I'm running an experiment overnight with the following patch to
> > get rid of the session id grossness.  AFAICT it can also be used by
> > check-parallel to start its subprocesses in separate pid namespaces,
> > though I didn't investigate thoroughly.
> 
> I don't think check-parallel needs to start each check instance in
> its own PID namespace - it's the tests themselves that need the
> isolation from each other.
> 
> However, the check instances require a private mount namespace, as
> they mount and unmount test/scratch devices themselves and we do not
> want other check instances seeing those mounts.
> 
> Hence I think the current check-parallel code doing mount namespace
> isolation as it already does should work with this patch enabling
> per-test process isolation inside check itself.
> 
> > I'm also not sure it's required for check-helper to unmount the /proc
> > that it creates; merely exiting seems to clean everything up? <shrug>
> 
> Yeah, I think tearing down the mount namespace (i.e. exiting the
> process that nsexec created) drops the last active reference to the
> mounts inside the private namespace and so it gets torn down that
> way.
> 
> So from my perspective, I think your check-helper namespace patch is
> a good improvement and I'll build/fix anything I come across on top
> of it. Once your series of fixes goes in, I'll rebase all the stuff
> I've got on top of it and go from there...

<nod> I might reformulate the pkill code to use nsexec (and not systemd)
if it's available; systemd scopes if those are available (I figured out
how to get systemd to tell me the cgroup name); or worst case fall back
to process sessions if neither are available.
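
Roughly, the selection order described above could be sketched like
this.  It is only an illustration: NSEXEC_PROG, the helper function
names, and the "pidns"/"scope"/"session" labels are all made up for
this sketch and are not taken from the actual patch.

```shell
# Hypothetical sketch: pick a process isolation method in order of
# preference.  All names below are illustrative assumptions.

# Assumed location of the namespace helper binary.
NSEXEC_PROG="${NSEXEC_PROG:-./src/nsexec}"

have_nsexec() {
	[ -x "$NSEXEC_PROG" ]
}

have_systemd_scopes() {
	# /run/systemd/system only exists when systemd is the running init
	[ -d /run/systemd/system ] && command -v systemd-run >/dev/null 2>&1
}

pick_isolation() {
	if have_nsexec; then
		echo "pidns"	# private pid namespace via the helper
	elif have_systemd_scopes; then
		echo "scope"	# systemd scope, kill via its cgroup
	else
		echo "session"	# worst case: separate process session
	fi
}

echo "process isolation: $(pick_isolation)"
```

Keeping the detection in small predicates would make it easy to force
a particular fallback when debugging.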

I don't know how ancient of a userspace we realistically have to support
since (afaict) namespaces and systemd both showed up around the 2.6.24
era?  But I also don't know if Devuan at least does pid/mount
namespaces.
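
One way to probe a given box might be to look at /proc/self/ns, which
the kernel populates with one entry per namespace type it supports.
This is a minimal sketch with a made-up function name; it only says
whether the running kernel exposes the namespace, not whether the
distro's userspace tools can use it.

```shell
# Hypothetical probe: does the running kernel expose this namespace type?
ns_supported() {
	[ -e "/proc/self/ns/$1" ]
}

for ns in pid mnt; do
	if ns_supported "$ns"; then
		echo "$ns namespace: available"
	else
		echo "$ns namespace: missing"
	fi
done
```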

--D

> > I also tried using systemd-nspawn to run fstests inside a "container"
> > but very quickly discovered that you can't pass block devices to the
> > container which makes fstests pretty useless for testing real scsi
> > devices. :/
> 
> Yet another dead-end in the poorly sign-posted rat-hole, eh?

Yup.

--D

> -Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx



