On Tue, Mar 20, 2018 at 08:00:24AM -0400, Brian Foster wrote:
> On Tue, Mar 20, 2018 at 04:00:20PM +1100, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> >
> > Similar to log_recovery_delay, this delay occurs between the VFS
> > superblock being initialised and the xfs_mount being fully
> > initialised. It also poisons the per-ag radix tree node so that it
> > can be used for triggering shrinker races during mount
> > such as the following:
> >
> > <run memory pressure workload in background>
> >
> > $ cat dirty-mount.sh
> > #! /bin/bash
> >
> > umount -f /dev/pmem0
> > mkfs.xfs -f /dev/pmem0
> > mount /dev/pmem0 /mnt/test
> > rm -f /mnt/test/foo
> > xfs_io -fxc "pwrite 0 4k" -c fsync -c "shutdown" /mnt/test/foo
> > umount /dev/pmem0
> >
> > # let's crash it now!
> > echo 30 > /sys/fs/xfs/debug/mount_delay
> > mount /dev/pmem0 /mnt/test
> > echo 0 > /sys/fs/xfs/debug/mount_delay
> > umount /dev/pmem0
> > $ sudo ./dirty-mount.sh
> > .....
>
> Planning to post a test for this?

Haven't written one yet, only really had time to diagnose and write
the fix so far.

> > [ 60.378118] CPU: 3 PID: 3577 Comm: fs_mark Tainted: G D W 4.16.0-rc5-dgc #440
> > [ 60.378120] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> > [ 60.378124] RIP: 0010:radix_tree_next_chunk+0x76/0x320
> > [ 60.378127] RSP: 0018:ffffc9000276f4f8 EFLAGS: 00010282
> > [ 60.383670] RAX: a5a5a5a5a5a5a5a4 RBX: 0000000000000010 RCX: 000000000000001a
> > [ 60.385277] RDX: 0000000000000000 RSI: ffffc9000276f540 RDI: 0000000000000000
> > [ 60.386554] RBP: 0000000000000000 R08: 0000000000000000 R09: a5a5a5a5a5a5a5a5
> > [ 60.388194] R10: 0000000000000006 R11: 0000000000000001 R12: ffffc9000276f598
> > [ 60.389288] R13: 0000000000000040 R14: 0000000000000228 R15: ffff880816cd6458
> > [ 60.390827] FS: 00007f5c124b9740(0000) GS:ffff88083fc00000(0000) knlGS:0000000000000000
> > [ 60.392253] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 60.393423] CR2: 00007f5c11bba0b8 CR3: 000000035580e001 CR4: 00000000000606e0
>
> Was the beginning of this error splat snipped out? It might be useful to
> include that and perhaps instead snip out some of the specific register
> context above. Otherwise looks fine:

It was one of about 100 threads that smashed into the shrinker at
the same time. It was the most intact trace I could cut and paste,
and the actual oops lines were nowhere to be seen on the console
output....

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
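
For readers following the thread: the knob being discussed is a
debug-only mount delay driven from /sys/fs/xfs/debug/mount_delay.
Below is a minimal sketch of how such a delay could be wired into the
mount path, assuming an xfs_globals.mount_delay field plumbed in the
same way as the existing log_recovery_delay knob; the helper name and
exact placement are illustrative, not taken from the posted patch.

	/*
	 * Sketch only: DEBUG-style mount delay, assuming an
	 * xfs_globals.mount_delay field exposed via
	 * /sys/fs/xfs/debug/mount_delay, mirroring the existing
	 * log_recovery_delay plumbing. Names are illustrative.
	 */
	#include <linux/delay.h>	/* msleep() */

	static void
	xfs_mount_debug_delay(
		struct xfs_mount	*mp)
	{
		if (!xfs_globals.mount_delay)
			return;

		xfs_notice(mp, "Delaying mount for %d seconds.",
				xfs_globals.mount_delay);
		msleep(xfs_globals.mount_delay * 1000);
	}

	/*
	 * Called from the mount path after the VFS superblock has been
	 * set up but before the xfs_mount is fully initialised (e.g.
	 * from xfs_fs_fill_super()), which is the window the
	 * dirty-mount.sh reproducer above relies on.
	 */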