XFS / xfssyncd lock-ups on 2.6.38-8

We're running a dozen Amazon AWS instances (Ubuntu Natty Narwhal, kernel 2.6.38-8). We recently brought up several machines from earlier snapshots (EBS snapshots rather than LVM snapshots), and they have been locking up under load. The dmesg output is below. Does this issue look familiar, or has it perhaps been fixed in a later kernel? Or could it indicate data corruption introduced by the snapshot process?

The drive holds the Postgres write-ahead logs, so it is write-heavy and read-light. When the array (4 drives in RAID0) freezes, nothing short of a hard restart of the machine seems to fix it. We've tried stopping Postgres, issuing a 'drop cache' to the kernel, and killing the locked process, all to no avail.
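
For reference, a 'drop cache' request is normally made by writing 1, 2, or 3 to /proc/sys/vm/drop_caches as root. A minimal Python sketch of that step follows. The helper name drop_caches and the default value 3 (page cache plus dentries and inodes) are assumptions for illustration only, not a record of the exact command that was run here.

```python
import os


def drop_caches(mode: int = 3, path: str = "/proc/sys/vm/drop_caches") -> None:
    """Ask the kernel to drop clean caches (hypothetical helper).

    mode 1 = page cache, 2 = dentries and inodes, 3 = both.
    Writing to the real /proc file requires root.
    """
    if mode not in (1, 2, 3):
        raise ValueError("mode must be 1, 2, or 3")
    # Flush dirty data first; drop_caches only discards clean objects.
    os.sync()
    with open(path, "w") as f:
        f.write(f"{mode}\n")
```

Note that dropping caches only frees clean, reclaimable memory, so it would not be expected to release a task that is blocked waiting on log space.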

Would appreciate any thoughts/pointers to fixes or workarounds if this is a known issue.

Thanks,
Mike

====

(The /dev/md126 array that is locking up is an XFS RAID0 across 4 volumes)

The errors look like this:

[558307.361854] INFO: task xfssyncd/md126:1029 blocked for more than 120 seconds.
[558307.361867] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[558307.361874] xfssyncd/md126  D ffff881116f13b00     0  1029      2 0x00000000
[558307.361879]  ffff881088989d00 0000000000000246 ffff881088989fd8 ffff881088988000
[558307.361884]  0000000000013b00 ffff8810866c3120 ffff881088989fd8 0000000000013b00
[558307.361889]  ffff881089b84440 ffff8810866c2d80 ffffffff815dc13e ffff88108916e400
[558307.361894] Call Trace:
[558307.361904]  [<ffffffff815dc13e>] ? _raw_spin_unlock_irqrestore+0x1e/0x30
[558307.361932]  [<ffffffffa00e52d8>] xlog_grant_log_space+0x4a8/0x500 [xfs]
[558307.361937]  [<ffffffff8105f180>] ? default_wake_function+0x0/0x20
[558307.361951]  [<ffffffffa00e71ff>] xfs_log_reserve+0xff/0x140 [xfs]
[558307.361967]  [<ffffffffa00f31fc>] xfs_trans_reserve+0x9c/0x200 [xfs]
[558307.361980]  [<ffffffffa00d7383>] xfs_fs_log_dummy+0x43/0x90 [xfs]
[558307.361995]  [<ffffffffa010a3c1>] xfs_sync_worker+0x81/0x90 [xfs]
[558307.362009]  [<ffffffffa01090f3>] xfssyncd+0x183/0x230 [xfs]
[558307.362025]  [<ffffffffa0108f70>] ? xfssyncd+0x0/0x230 [xfs]
[558307.362030]  [<ffffffff81086ac6>] kthread+0x96/0xa0
[558307.362035]  [<ffffffff8100cde4>] kernel_thread_helper+0x4/0x10
[558307.362038]  [<ffffffff8100c1e3>] ? int_ret_from_sys_call+0x7/0x1b
[558307.362041]  [<ffffffff815dc621>] ? retint_restore_args+0x5/0x6
[558307.362045]  [<ffffffff8100cde0>] ? kernel_thread_helper+0x0/0x10
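
To pull reports like the one above out of a longer dmesg capture, something along these lines may help. It is only a sketch: the function name find_hung_tasks and the regular expression are illustrative assumptions based on the message format shown in this trace, and other kernels may word the line differently.

```python
import re
from typing import Dict, List

# Matches the hung-task header line as it appears in the trace above, e.g.
# "[558307.361854] INFO: task xfssyncd/md126:1029 blocked for more than 120 seconds."
HUNG_TASK_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<name>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(dmesg_text: str) -> List[Dict[str, object]]:
    """Return one dict per hung-task report found in the given dmesg text."""
    reports = []
    for line in dmesg_text.splitlines():
        m = HUNG_TASK_RE.search(line)
        if m:
            reports.append(
                {
                    "timestamp": float(m.group("ts")),  # seconds since boot
                    "name": m.group("name"),            # kernel thread / task name
                    "pid": int(m.group("pid")),
                    "seconds": int(m.group("secs")),    # hung_task_timeout_secs value
                }
            )
    return reports
```

Feeding it the saved output of dmesg (read from a file, for example) gives a list that can be grouped by task name to see whether it is always the xfssyncd thread that stalls.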

