On Thu, Apr 05, 2012 at 12:28:35PM -0700, Eric Sandeen wrote: > We encountered a bug after log replay failed on mount: > > [14217.617258] XFS (md5): log mount/recovery failed: error 5 > [14217.624037] XFS (md5): log mount failed > [14234.866732] general protection fault: 0000 [#1] SMP > ... > [14234.913286] RIP: 0010:[<ffffffff810c2cbe>] [<ffffffff810c2cbe>] > __lock_acquire+0x2be/0xa20 > [14234.913293] RSP: 0018:ffff880116c6fa38 EFLAGS: 00010002 > [14234.913294] RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000000 RCX: 0000000000000000 > [14234.913321] Call Trace: > [14234.913335] [<ffffffff810c3aad>] lock_acquire+0x9d/0x1f0 > [14234.913365] [<ffffffff8162f646>] _raw_spin_lock+0x46/0x80 > [14234.913372] [<ffffffff8130e60d>] _atomic_dec_and_lock+0x4d/0x70 > [14234.913382] [<ffffffffa0392801>] xfs_buf_rele+0x51/0x1e0 [xfs] > [14234.913400] [<ffffffffa039554f>] xfs_flush_buftarg+0x10f/0x130 [xfs] > [14234.913410] [<ffffffffa03955a4>] xfs_free_buftarg+0x34/0x70 [xfs] > [14234.913422] [<ffffffffa03a4634>] xfs_close_devices+0x64/0x70 [xfs] > [14234.913433] [<ffffffffa03a5890>] xfs_fs_fill_super+0x180/0x2a0 [xfs] > [14234.913436] [<ffffffff811b7a71>] mount_bdev+0x1d1/0x220 > [14234.913460] [<ffffffffa03a3ad5>] xfs_fs_mount+0x15/0x20 [xfs] > [14234.913462] [<ffffffff811b8603>] mount_fs+0x43/0x1b0 > [14234.913468] [<ffffffff811d473a>] vfs_kern_mount+0x6a/0xd0 > [14234.913471] [<ffffffff811d6124>] do_kern_mount+0x54/0x110 > [14234.913473] [<ffffffff811d7c34>] do_mount+0x1a4/0x260 > [14234.913476] [<ffffffff811d8100>] sys_mount+0x90/0xe0 > [14234.913478] [<ffffffff81639202>] system_call_fastpath+0x16/0x1b > > > RAX is freed memory. > > After xfs_mountfs() fails to replay the log, it > goes to out_perag_free: which does xfs_free_perag(). > > Later down the mount failure path, xfs_buf_rele() checks for (!pag): > > if (!pag) { > ASSERT(list_empty(&bp->b_lru)); > ASSERT(RB_EMPTY_NODE(&bp->b_rbnode)); > if (atomic_dec_and_test(&bp->b_hold)) > xfs_buf_free(bp); > return; > } > > but we did not set the perag to NULL, so this doesn't get hit. Sure, bp->b_pag is only set for cached buffers, but it also has a reference to the perag structure so it is supposed to be guaranteed that the perag structure is valid until the buffer is freed and the perag reference is dropped. > Next we do: > > if (atomic_dec_and_lock(&bp->b_hold, &pag->pag_buf_lock)) { > > which goes bang if the perag has been freed. Which means that the unmount has failed to flush all the active buffers. This woul dhave resulted in an ASSERT() firing in xfs_free_perag() on a debug kernel.... > > Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxx> > --- > > Full disclosure: not yet tested, will do soon :) > > diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c > index 1ffead4..ad830f7 100644 > --- a/fs/xfs/xfs_mount.c > +++ b/fs/xfs/xfs_mount.c > @@ -252,6 +252,7 @@ __xfs_free_perag( > > ASSERT(atomic_read(&pag->pag_ref) == 0); .... right here, or even even before call_rcu() was called > kmem_free(pag); > + pag = NULL; > } As it is, I don't quite understand what setting a local variable to NULL is supposed to acheive here - the perag structure has been removed from the radix tree and zeroing a local variable doesn't do anything globally visible... What is more likely is that the log mount failure path is not flushing the buftarg correctly before tearing down the perag structures. Indeed, a log mount failure (and a perag data intialisation failure) will simply tear down the perag structures without waiting for cached buffers to be flushed in xfs_mountfs(), so that is going to the root cause of this problem. i.e. we need xfs_flush_buftarg() calls in the mountfs error handling path.... Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs