On Tue, Dec 18, 2018 at 12:59:15PM -0600, Eric Sandeen wrote:
> We've seen 2 reports of an oops on log replay after a swapext, which
> look something like:
>
> [ 63.188907] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
> [ 63.196774] IP: [<ffffffffc093e5e6>] xfs_bmbt_init_cursor+0x46/0x180 [xfs]
> ...
> [ 63.309769] RIP: 0010:[<ffffffffc093e5e6>] [<ffffffffc093e5e6>] xfs_bmbt_init_cursor+0x46/0x180 [xfs]
> [ 63.319132] RSP: 0018:ffff951b4d2977d0 EFLAGS: 00010282
> [ 63.324443] RAX: ffff951b4df63440 RBX: ffff951ba8a48000 RCX: 0000000000000000
> [ 63.331570] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff951b4df63518
> [ 63.338689] RBP: ffff951b4d2977f8 R08: 000000000001e2f0 R09: ffff951b4df63440
> [ 63.345818] R10: ffff951afa403800 R11: ffffffffffffffe0 R12: ffff951b4d0a3000
> [ 63.352949] R13: 0000000000000000 R14: 0000000000000000 R15: ffff951ba8a48040
> [ 63.360079] FS: 00007f72bcd1f880(0000) GS:ffff951bafc80000(0000) knlGS:0000000000000000
> [ 63.368160] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 63.373893] CR2: 0000000000000004 CR3: 0000000128d8a000 CR4: 00000000000007e0
> [ 63.381029] Call Trace:
> [ 63.383541] [<ffffffffc093e7f7>] xfs_bmbt_change_owner+0x27/0x70 [xfs]
> [ 63.390225] [<ffffffffc09941a3>] xfs_recover_inode_owner_change.isra.26+0xa3/0xc0 [xfs]
> [ 63.398351] [<ffffffffc0995b9c>] xlog_recover_inode_pass2+0x4cc/0x9d0 [xfs]
> [ 63.405460] [<ffffffffc0990018>] ? xfs_efi_release+0x58/0x80 [xfs]
> [ 63.411763] [<ffffffffc0996192>] xlog_recover_commit_pass2+0xf2/0x1a0 [xfs]
> [ 63.418848] [<ffffffffc0996289>] xlog_recover_items_pass2+0x49/0x70 [xfs]
> [ 63.425769] [<ffffffffc09964c5>] xlog_recover_commit_trans+0x215/0x250 [xfs]
> ...
>
> I can reproduce this with a hacky script that can be cleaned up and turned into
> an xfstest:
>
> #!/bin/bash
>
> DEV=/dev/sdZ1
>
> umount $DEV
> mkfs.xfs -f $DEV
>
> mkdir -p /mnt/test
> mount $DEV /mnt/test
>
> for I in `seq 4194304 -4096 0`; do
> 	((I % 12288)) || continue
> 	xfs_io -f -d -c "pwrite $I 4096" /mnt/test/fragfile
> done
>
> xfs_fsr -v /mnt/test/fragfile
>
>
> xfs_io -c "truncate 0" /mnt/test/fragfile
>
> for I in `seq 32768 -4096 0`; do
> 	xfs_io -f -c "pwrite $I 4096" /mnt/test/fragfile
> done
>
> sync
>
> xfs_io -x -c shutdown /mnt/test
> umount /mnt/test
> mount $DEV /mnt/test

...which I assume is going to show up on fstests shortly, right? :)

--D

> ====
>
> The upshot is that if we successfully fsr a file and it remains in btree format,
> then it changes to non-btree format and we crash, log replay will still try to
> do xfs_bmbt_change_owner which will oops if we're not in btree format.
>
> Patches to fix this follow. Lightly tested. Thanks to dave for talking
> through this one with me.
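
For anyone following along, here is a toy userspace model of the failure mode described in the last quoted paragraph. It is only a sketch and not the actual patches: every name in it (toy_inode, toy_replay_owner_change, TOY_LOGGED_OWNER_CHANGE, the check_format knob) is made up for illustration, and the "skip when not btree" check is just one plausible shape for a fix, assumed here rather than taken from the series.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins only; none of these are the real XFS types or flags. */
enum toy_fork_format { TOY_FMT_EXTENTS, TOY_FMT_BTREE };

struct toy_inode {
	enum toy_fork_format format; /* data fork format at replay time */
	const void *broot;           /* bmbt root; NULL unless format is btree */
};

/* Hypothetical "owner change was logged" flag carried by the log item. */
#define TOY_LOGGED_OWNER_CHANGE 0x1u

enum toy_result { TOY_SKIPPED, TOY_WALKED, TOY_WOULD_OOPS };

/*
 * Model of replaying a logged owner change.
 * check_format == 0: trust the logged flag alone (the behavior that oopses).
 * check_format != 0: also require the fork to still be in btree format.
 */
static enum toy_result
toy_replay_owner_change(const struct toy_inode *ip, unsigned int logged_flags,
			int check_format)
{
	assert(ip != NULL);

	if (!(logged_flags & TOY_LOGGED_OWNER_CHANGE))
		return TOY_SKIPPED;

	/* Assumed guard: no btree to walk if the fork changed format later. */
	if (check_format && ip->format != TOY_FMT_BTREE)
		return TOY_SKIPPED;

	/* The kernel would dereference the missing root here and oops. */
	if (ip->broot == NULL)
		return TOY_WOULD_OOPS;

	return TOY_WALKED;
}
```

With an inode that was btree when the swapext was logged but is extents format by the time of replay (format = TOY_FMT_EXTENTS, broot = NULL), the unchecked call returns TOY_WOULD_OOPS and the checked call returns TOY_SKIPPED. An inode that is still in btree format is walked either way.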