On Tue, Jan 21, 2025 at 04:03:23PM +1100, Dave Chinner wrote:
> On Thu, Jan 16, 2025 at 03:28:49PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> > 
> > generic/032 now periodically fails with:
> > 
> > --- /tmp/fstests/tests/generic/032.out	2025-01-05 11:42:14.427388698 -0800
> > +++ /var/tmp/fstests/generic/032.out.bad	2025-01-06 18:20:17.122818195 -0800
> > @@ -1,5 +1,7 @@
> >  QA output created by 032
> >  100 iterations
> > -000000 cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd  >................<
> > -*
> > -100000
> > +umount: /opt: target is busy.
> > +mount: /opt: /dev/sda4 already mounted on /opt.
> > +       dmesg(1) may have more information after failed mount system call.
> > +cycle mount failed
> > +(see /var/tmp/fstests/generic/032.full for details)
> > 
> > The root cause of this regression is the _syncloop subshell.  This
> > background process runs _scratch_sync, which is actually an xfs_io
> > process that calls syncfs on the scratch mount.
> > 
> > Unfortunately, while the test kills the _syncloop subshell, it doesn't
> > actually kill the xfs_io process.  If the xfs_io process is in D state
> > running the syncfs, it won't react to the signal, but it will pin the
> > mount.  Then the _scratch_cycle_mount fails because the mount is pinned.
> > 
> > Prior to commit 8973af00ec212f the _syncloop ran sync(1) which avoided
> > pinning the scratch filesystem.
> 
> How does running sync(1) prevent this? they run the same kernel
> code, so I'm a little confused as to why this is a problem caused
> by using the syncfs() syscall rather than the sync() syscall...

Instead of:

_scratch_sync -> _sync_fs $SCRATCH_MNT -> $XFS_IO_PROG -rxc "syncfs" $SCRATCH_MNT

sync(1) just calls sync(2) with no open files other than std{in,out,err}.

--D

> > Fix this by pgrepping for the xfs_io process and killing and waiting for
> > it if necessary.
> 
> Change looks fine, though.
> 
> Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
> 
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
> 
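
For anyone following along, the failure mode quoted above (killing the
subshell does not kill the process it started) can be reproduced without
a filesystem at all. The following is only a rough sketch and not the
actual patch: "sleep 30" is a stand-in for the xfs_io syncfs call, the
syncloop function and the pid file are hypothetical helpers invented for
the illustration, and it records the child's pid instead of pgrepping
for it so that it depends on nothing beyond sleep, kill and mktemp.

```shell
# Sketch only: "sleep 30" stands in for the xfs_io syncfs call, and the
# pid file is a hypothetical helper, not something fstests provides.
pidfile=$(mktemp)

syncloop() {
	while true; do
		sleep 30 &		# stand-in for the long-running sync helper
		echo $! > "$pidfile"
		wait $!
	done
}

syncloop &
loop_pid=$!

# Wait (bounded) for the loop to start its first child.
tries=0
while [ ! -s "$pidfile" ] && [ "$tries" -lt 10 ]; do
	sleep 1
	tries=$((tries + 1))
done
child_pid=$(cat "$pidfile")

# Kill only the subshell, as the test did before the fix.
kill "$loop_pid"
wait "$loop_pid" 2>/dev/null

# The child was never signalled, so it is still running.
if kill -0 "$child_pid" 2>/dev/null; then
	child_survived=yes
else
	child_survived=no
fi

# Clean up the leftover child as well.  It is not our direct child, so
# wait(1) cannot be used on it here; poll (bounded) instead.
kill "$child_pid" 2>/dev/null
tries=0
while kill -0 "$child_pid" 2>/dev/null && [ "$tries" -lt 5 ]; do
	sleep 1
	tries=$((tries + 1))
done

rm -f "$pidfile"
echo "child survived subshell kill: $child_survived"
```

In the real test the leftover process holds an open file on the scratch
mount, which is what pins it; the stand-in here only shows that the
process outlives the subshell that started it.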