On Wed, Jan 29, 2025 at 05:06:52PM +1100, Dave Chinner wrote:
> On Tue, Jan 28, 2025 at 07:13:13PM -0800, Darrick J. Wong wrote:
> > On Wed, Jan 29, 2025 at 07:39:22AM +1100, Dave Chinner wrote:
> > > On Mon, Jan 27, 2025 at 11:23:52PM -0800, Darrick J. Wong wrote:
> > > > On Tue, Jan 28, 2025 at 03:34:50PM +1100, Dave Chinner wrote:
> > > > > On Thu, Jan 23, 2025 at 12:16:50PM +1100, Dave Chinner wrote:
> > > > > 4. /tmp is still shared across all runner instances so all the
> > > > > concurrent runners dump all their tmp files in /tmp. However, the
> > > > > runners no longer have unique PIDs (i.e. check runs as PID 3 in
> > > > > all runner instances). This means using /tmp/tmp.$$ as the
> > > > > check/test temp file definition results in instant tmp file name
> > > > > collisions and random things in check and tests fail. check and
> > > > > common/preamble have to be converted to use `mktemp` to provide
> > > > > unique tempfile name prefixes again.
> > > > >
> > > > > 5. Don't forget to revert the parent /proc mount back to shared
> > > > > after check has finished running (or was aborted).
> > > > >
> > > > > I think with this (current prototype patch below), we can use PID
> > > > > namespaces rather than process session IDs for check-parallel safe
> > > > > process management.
> > > > >
> > > > > Thoughts?
> > > >
> > > > Was about to go to bed, but can we also start a new mount namespace,
> > > > create a private (or at least non-global) /tmp to put files into, and
> > > > then each test instance is isolated from clobbering the /tmp files of
> > > > other ./check instances *and* the rest of the system?
> > >
> > > We probably can. I didn't want to go down that rat hole straight
> > > away, because then I'd have to make a decision about what to mount
> > > there. One thing at a time....
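
For illustration, here is a minimal sketch of the `mktemp` conversion described in item 4 of the quoted text. The function name make_tmp_prefix and the fstests.XXXXXXXXXX template are placeholders made up for this example, not what check or common/preamble actually use.

```shell
# Hypothetical helper: build a unique temp-file prefix without using $$.
# When every runner sees the same PID for check, /tmp/tmp.$$ is the same
# path in all of them, so uniqueness has to come from mktemp instead.
make_tmp_prefix() {
	# mktemp -d creates the directory itself and fails rather than
	# reusing an existing name, so two callers never get the same path.
	_dir=$(mktemp -d "${TMPDIR:-/tmp}/fstests.XXXXXXXXXX") || return 1
	printf '%s/tmp\n' "$_dir"
}
```

A caller would then do something like `tmp=$(make_tmp_prefix)` and keep using `$tmp.out`, `$tmp.err` and so on as before. The caller owns the directory and has to remove it on exit.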
> > >
> > > I suspect that I can just use a tmpfs filesystem for it - there's
> > > heaps of memory available on my test machines and we don't use /tmp
> > > to hold large files, so that should work fine for me. However, I'm
> > > a little concerned about what will happen when testing under memory
> > > pressure situations if /tmp needs memory to operate correctly.
> > >
> > > I'll have a look at what is needed for private tmpfs /tmp instances
> > > to work - it should work just fine.
> > >
> > > However, if check-parallel has taught me anything, it is that trying
> > > to use "should work" features on a modern system tends to mean "this
> > > is a poorly documented rat-hole with many dead-ends that will
> > > be explored before a working solution is found"...
> >
> > <nod> I'm running an experiment overnight with the following patch to
> > get rid of the session id grossness. AFAICT it can also be used by
> > check-parallel to start its subprocesses in separate pid namespaces,
> > though I didn't investigate thoroughly.
>
> I don't think check-parallel needs to start each check instance in
> its own PID namespace - it's the tests themselves that need the
> isolation from each other.
>
> However, the check instances require a private mount namespace, as
> they mount and unmount test/scratch devices themselves and we do not
> want other check instances seeing those mounts.
>
> Hence I think the current check-parallel code doing mount namespace
> isolation as it already does should work with this patch enabling
> per-test process isolation inside check itself.
>
> > I'm also not sure it's required for check-helper to unmount the /proc
> > that it creates; merely exiting seems to clean everything up? <shrug>
>
> Yeah, I think tearing down the mount namespace (i.e. exiting the
> process that nsexec created) drops the last active reference to the
> mounts inside the private namespace and so it gets torn down that
> way.
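
A rough sketch of the private tmpfs /tmp idea, using unshare(1) from util-linux rather than nsexec. The wrapper name run_with_private_tmp is invented for this example, and none of this has been tried under memory pressure, which is the concern raised above.

```shell
# Sketch only: run a command in a private mount namespace with its own
# tmpfs mounted on /tmp. Creating the namespace and mounting need root.
run_with_private_tmp() {
	if [ "$(id -u)" -ne 0 ]; then
		echo "run_with_private_tmp: must be run as root" >&2
		return 1
	fi
	# The tmpfs is mounted after the namespace is created, so it is
	# only visible inside it. No explicit umount is done here; the
	# expectation, as discussed above, is that the mount goes away
	# when the last process in the namespace exits.
	unshare --mount --propagation private \
		sh -c 'mount -t tmpfs tmpfs /tmp && exec "$@"' sh "$@"
}
```

Usage would look like `run_with_private_tmp ./check -g quick`, with each check instance getting its own empty /tmp.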
>
> So from my perspective, I think your check-helper namespace patch is
> a good improvement and I'll build/fix anything I come across on top
> of it. Once your series of fixes goes in, I'll rebase all the stuff
> I've got on top of it and go from there...

<nod> I might reformulate the pkill code to use nsexec (and not systemd)
if it's available; systemd scopes if those are available (I figured out
how to get systemd to tell me the cgroup name); or worst case fall back
to process sessions if neither are available.

I don't know how ancient of a userspace we realistically have to support
since (afaict) namespaces and systemd both showed up around the 2.6.24
era? But I also don't know if Devuan at least does pid/mount namespaces.

--D

> > I also tried using systemd-nspawn to run fstests inside a "container"
> > but very quickly discovered that you can't pass block devices to the
> > container which makes fstests pretty useless for testing real scsi
> > devices. :/
>
> Yet another dead-end in the poorly sign-posted rat-hole, eh?

Yup.

--D

> -Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
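
To make the fallback order above concrete (namespaces first, then systemd scopes, then process sessions), here is a minimal sketch of the probing. NSEXEC_PROG is an assumed variable holding the path to the namespace helper, and the three mode names are placeholders. This is not the actual patch.

```shell
# Hypothetical probe: pick a process-isolation mechanism in the order
# namespaces, systemd scopes, process sessions.
pick_isolation() {
	# NSEXEC_PROG is an assumption: path to an nsexec-style helper.
	if [ -n "$NSEXEC_PROG" ] && [ -x "$NSEXEC_PROG" ]; then
		echo namespace
	# /run/systemd/system exists only when systemd is the running init.
	elif command -v systemd-run >/dev/null 2>&1 &&
	     [ -d /run/systemd/system ]; then
		echo systemd-scope
	else
		# Worst case: fall back to process sessions.
		echo session
	fi
}
```

The helper only checks that the binary is executable. Whether the kernel actually permits creating pid/mount namespaces would still have to be tested by trying it.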