On Tue, Jan 28, 2025 at 07:13:13PM -0800, Darrick J. Wong wrote:
> On Wed, Jan 29, 2025 at 07:39:22AM +1100, Dave Chinner wrote:
> > On Mon, Jan 27, 2025 at 11:23:52PM -0800, Darrick J. Wong wrote:
> > > On Tue, Jan 28, 2025 at 03:34:50PM +1100, Dave Chinner wrote:
> > > > On Thu, Jan 23, 2025 at 12:16:50PM +1100, Dave Chinner wrote:
> > > > 4. /tmp is still shared across all runner instances so all the
> > > > concurrent runners dump all their tmp files in /tmp. However, the
> > > > runners no longer have unique PIDs (i.e. check runs as PID 3 in
> > > > all runner instances). This means using /tmp/tmp.$$ as the
> > > > check/test temp file definition results in instant tmp file name
> > > > collisions and random things in check and tests fail. check and
> > > > common/preamble have to be converted to use `mktemp` to provide
> > > > unique tempfile name prefixes again.
> > > >
> > > > 5. Don't forget to revert the parent /proc mount back to shared
> > > > after check has finished running (or was aborted).
> > > >
> > > > I think with this (current prototype patch below), we can use PID
> > > > namespaces rather than process session IDs for check-parallel safe
> > > > process management.
> > > >
> > > > Thoughts?
> > >
> > > Was about to go to bed, but can we also start a new mount namespace,
> > > create a private (or at least non-global) /tmp to put files into, and
> > > then each test instance is isolated from clobbering the /tmp files of
> > > other ./check instances *and* the rest of the system?
> >
> > We probably can. I didn't want to go down that rat hole straight
> > away, because then I'd have to make a decision about what to mount
> > there. One thing at a time....
> >
> > I suspect that I can just use a tmpfs filesystem for it - there's
> > heaps of memory available on my test machines and we don't use /tmp
> > to hold large files, so that should work fine for me. However, I'm
> > a little concerned about what will happen when testing under memory
> > pressure situations if /tmp needs memory to operate correctly.
> >
> > I'll have a look at what is needed for private tmpfs /tmp instances
> > to work - it should work just fine.
> >
> > However, if check-parallel has taught me anything, it is that trying
> > to use "should work" features on a modern system tends to mean "this
> > is a poorly documented rat-hole with many dead-ends that will
> > be explored before a working solution is found"...
>
> <nod> I'm running an experiment overnight with the following patch to
> get rid of the session id grossness. AFAICT it can also be used by
> check-parallel to start its subprocesses in separate pid namespaces,
> though I didn't investigate thoroughly.

I don't think check-parallel needs to start each check instance in its
own PID namespace - it's the tests themselves that need the isolation
from each other.

However, the check instances require a private mount namespace, as
they mount and unmount test/scratch devices themselves and we do not
want other check instances seeing those mounts.

Hence I think the current check-parallel code doing mount namespace
isolation as it already does should work with this patch enabling
per-test process isolation inside check itself.

> I'm also not sure it's required for check-helper to unmount the /proc
> that it creates; merely exiting seems to clean everything up? <shrug>

Yeah, I think tearing down the mount namespace (i.e. exiting the
process that nsexec created) drops the last active reference to the
mounts inside the private namespace and so it gets torn down that way.

So from my perspective, I think your check-helper namespace patch is a
good improvement and I'll build/fix anything I come across on top of
it. Once your series of fixes goes in, I'll rebase all the stuff I've
got on top of it and go from there...
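
As an aside on point 4 quoted above: a minimal sketch of replacing the
/tmp/tmp.$$ definition with a `mktemp`-based prefix might look like the
following. It is an illustration only, not the actual check or
common/preamble change; the helper name `make_tmp_prefix` and the
`fstests.XXXXXX` template are hypothetical.

```shell
#!/bin/bash
# Sketch only: derive a unique temp-file prefix without relying on $$,
# which stops being unique once every runner sees the same PID inside
# its own PID namespace.

# Hypothetical helper: create a private per-instance directory and print
# a file-name prefix inside it. mktemp -d creates the directory
# atomically and fails rather than reusing an existing name.
make_tmp_prefix() {
    local dir
    dir=$(mktemp -d "${TMPDIR:-/tmp}/fstests.XXXXXX") || return 1
    echo "$dir/tmp"
}

tmp=$(make_tmp_prefix) || exit 1

# Remove the whole private directory when the script exits or is aborted.
trap 'rm -rf "${tmp%/*}"' EXIT

# Temp files are then named off the prefix, as with /tmp/tmp.$$ before.
echo "some output" > "$tmp.out"
```

Putting each instance's files in its own directory also makes cleanup
a single rm -rf, whichever files a test happened to create.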

> I also tried using systemd-nspawn to run fstests inside a "container"
> but very quickly discovered that you can't pass block devices to the
> container which makes fstests pretty useless for testing real scsi
> devices. :/

Yet another dead-end in the poorly sign-posted rat-hole, eh?

-Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx