Hi all,

When running the 'ndctl' unit tests against 5.3-rc kernels, I noticed a
frequent failure of the 'mmap.sh' test [1][2].

[1]: https://github.com/pmem/ndctl/blob/master/test/mmap.sh
[2]: https://github.com/pmem/ndctl/blob/master/test/mmap.c

In trying to pare down the test further, I found that I can reproduce
the problem simply by running:

  mkfs.xfs -f /dev/pmem0
  mount /dev/pmem0 /mnt/mem

Here 'pmem0' is a legacy pmem namespace created from reserved memory
using the memmap= command line option. (Specifically, I have this:
memmap=3G!6G,3G!9G )

The above mkfs/mount steps do not reproduce the problem 100% of the
time, but the mount does fail on my QEMU-based setup more than 75% of
the time.

The kernel log shows the following when the mount fails:

  [Aug16 14:41] XFS (pmem0): Mounting V5 Filesystem
  [ +0.001856] XFS (pmem0): totally zeroed log
  [ +0.402616] XFS (pmem0): Internal error xlog_clear_stale_blocks(2) at line 1715 of file fs/xfs/xfs_log_recover.c. Caller xlog_find_tail+0x230/0x340 [xfs]
  [ +0.001741] CPU: 7 PID: 1771 Comm: mount Tainted: G O 5.2.0-rc4+ #112
  [ +0.000976] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.1-0-g0551a4be2c-prebuilt.qemu-project.org 04/01/2014
  [ +0.001516] Call Trace:
  [ +0.000351] dump_stack+0x85/0xc0
  [ +0.000452] xlog_clear_stale_blocks+0x16d/0x180 [xfs]
  [ +0.000665] xlog_find_tail+0x230/0x340 [xfs]
  [ +0.000581] xlog_recover+0x2b/0x160 [xfs]
  [ +0.000554] xfs_log_mount+0x280/0x2a0 [xfs]
  [ +0.000561] xfs_mountfs+0x415/0x860 [xfs]
  [ +0.000533] ? xfs_mru_cache_create+0x18b/0x1f0 [xfs]
  [ +0.000665] xfs_fs_fill_super+0x4b0/0x700 [xfs]
  [ +0.000638] ? xfs_test_remount_options+0x60/0x60 [xfs]
  [ +0.000710] mount_bdev+0x17f/0x1b0
  [ +0.000442] legacy_get_tree+0x30/0x50
  [ +0.000467] vfs_get_tree+0x28/0xf0
  [ +0.000436] do_mount+0x2d4/0xa00
  [ +0.000411] ? memdup_user+0x3e/0x70
  [ +0.000455] ksys_mount+0xba/0xd0
  [ +0.000420] __x64_sys_mount+0x21/0x30
  [ +0.000473] do_syscall_64+0x60/0x240
  [ +0.000460] entry_SYSCALL_64_after_hwframe+0x49/0xbe
  [ +0.000655] RIP: 0033:0x7f730fec91be
  [ +0.000506] Code: 48 8b 0d cd 1c 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 9a 1c 0c 00 f7 d8 64 89 01 48
  [ +0.002305] RSP: 002b:00007ffdadbdb178 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
  [ +0.000922] RAX: ffffffffffffffda RBX: 000055b8f9db8a40 RCX: 00007f730fec91be
  [ +0.000875] RDX: 000055b8f9dbfdb0 RSI: 000055b8f9dbb930 RDI: 000055b8f9db8c20
  [ +0.000917] RBP: 00007f731007f1a4 R08: 0000000000000000 R09: 000055b8f9dc01f0
  [ +0.000942] R10: 00000000c0ed0000 R11: 0000000000000246 R12: 0000000000000000
  [ +0.000878] R13: 00000000c0ed0000 R14: 000055b8f9db8c20 R15: 000055b8f9dbfdb0
  [ +0.000915] XFS (pmem0): failed to locate log tail
  [ +0.000622] XFS (pmem0): log mount/recovery failed: error -117
  [ +0.012560] XFS (pmem0): log mount failed

A bisect pointed to this commit:

  commit 6ad5b3255b9e3d6d94154738aacd5119bf9c8f6e (HEAD -> bisect-bad, refs/bisect/bad)
  Author: Christoph Hellwig <hch@xxxxxx>
  Date:   Fri Jun 28 19:27:26 2019 -0700

      xfs: use bios directly to read and write the log recovery buffers

      The xfs_buf structure is basically used as a glorified container for
      a memory allocation in the log recovery code. Replace it with a
      call to kmem_alloc_large and a simple abstraction to read into or
      write from it synchronously using chained bios.

      Signed-off-by: Christoph Hellwig <hch@xxxxxx>
      Reviewed-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
      Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>

Full bisect log follows at the end.

I saw [3], but I can still easily hit the failure after manually
applying that patch on top of the above commit.

[3]: https://lore.kernel.org/linux-xfs/20190709152352.27465-1-hch@xxxxxx/

Any thoughts on what might be happening? I'd be happy to test out
theories/patches.

Thanks,
-Vishal