I have a really weird problem with storage. The one thing that I *can* see is dmesg output with messages about hung tasks and XFS call tracebacks, so I'm posting here in the hope that someone can help me interpret them - even an answer as blunt as "this definitely means that your disks are not responding" would be very helpful.

First, a bit of background. There are three storage servers: each has 24 x 3TB SATA disks on LSI HBAs (one 16-port, one 8-port), plus a boot disk attached directly to the motherboard, and 8GB of RAM. They are running Ubuntu 11.10 server amd64, kernel 3.0.0-16-server #29-Ubuntu SMP. Apart from the boot disk, all filesystems are XFS mounted with inode64.

I had been testing them as gluster fileservers, and two of them locked up in operation. This gave the appearance of a hardware or BIOS fault - in particular, a power-cycle at the front panel wouldn't bring them back to life, but powering off using the physical switch on the PSU did. So after upgrading the BIOS and iKVM firmware on the motherboards, I decided to give them a soak test, running a few instances of bonnie++ on each machine talking directly to the disks (so no gluster or network I/O involved).

When I came back to check on them, they had all locked up in similar ways. On the two that I can still log in to, I see the XFS call traces in dmesg. The three servers have their disks configured slightly differently, and the failure modes are slightly different, so I'll describe each one below.

storage1
--------

Configured as two RAID6 groups of 12 disks each:

$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md127 : active raid6 sdy[11] sdx[10] sdw[9] sdv[8] sdu[7] sdt[6] sds[5] sdr[4] sdq[3] sdp[2] sdo[1] sdn[0]
      29302650880 blocks super 1.2 level 6, 1024k chunk, algorithm 2 [12/12] [UUUUUUUUUUUU]

md126 : active raid6 sdj[8] sdk[9] sdd[2] sde[3] sdi[7] sdm[11] sdg[5] sdc[1] sdb[0] sdl[10] sdh[6] sdf[4]
      29302650880 blocks super 1.2 level 6, 1024k chunk, algorithm 2 [12/12] [UUUUUUUUUUUU]

unused devices: <none>

$ df -h | grep md
/dev/md126      28T  105G   28T   1% /disk/data1
/dev/md127      28T   86G   28T   1% /disk/data2

My soak test was to run four instances of bonnie++ on /disk/data1/{test,testb} and /disk/data2/{test,testb}, using variants of this command line:

while [ 1 ]; do bonnie++ -d /disk/data1/test -s 16384k -n 98:800k:500k:1000; done

SYMPTOM: I can happily ssh into this box, but iostat shows that no disk activity is going on. The bonnie++ processes (which I was running in screen sessions) have apparently hung, without completing even a single cycle:

brian@storage1:~$ while [ 1 ]; do bonnie++ -d /disk/data1/test -s 16384k -n 98:800k:500k:1000; done
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...

Three of the bonnie++ processes stopped at this point. The fourth stopped at "Delete files in sequential order...".

dmesg is quiet right up until the call traces start; here's how they begin:

...
[48899.342127] disk 9, o:1, dev:sdw
[48899.342129] disk 10, o:1, dev:sdx
[48899.342131] disk 11, o:1, dev:sdy[532203.415122] INFO: task bonnie++:28294 blocked for more than 120 seconds.
[532203.415489] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[532203.424964] bonnie++ D ffffffff81805120 0 28294 28242 0x00000000 [532203.424973] ffff880212129cc8 0000000000000082 ffff880212129c68 ffffffff812ed3ee [532203.424984] ffff880212129fd8 ffff880212129fd8 ffff880212129fd8 0000000000012a40 [532203.424993] ffffffff81c0b020 ffff88020f070000 ffff88021fc0de40 7fffffffffffffff [532203.425003] Call Trace: [532203.425016] [<ffffffff812ed3ee>] ? radix_tree_lookup_slot+0xe/0x10 [532203.425025] [<ffffffff816058cf>] schedule+0x3f/0x60 [532203.425031] [<ffffffff81605f15>] schedule_timeout+0x2a5/0x320 [532203.425039] [<ffffffff8102c3d3>] ? x2apic_send_IPI_mask+0x13/0x20 [532203.425046] [<ffffffff81026558>] ? native_smp_send_reschedule+0x48/0x60 [532203.425054] [<ffffffff8104a5d2>] ? ttwu_queue+0x92/0xd0 [532203.425060] [<ffffffff81604f7f>] wait_for_common+0xdf/0x180 [532203.425068] [<ffffffff81194610>] ? __sync_filesystem+0x90/0x90 [532203.425074] [<ffffffff810574b0>] ? try_to_wake_up+0x200/0x200 [532203.425080] [<ffffffff81194610>] ? __sync_filesystem+0x90/0x90 [532203.425087] [<ffffffff816050fd>] wait_for_completion+0x1d/0x20 [532203.425093] [<ffffffff8118e969>] writeback_inodes_sb_nr+0x79/0xa0 [532203.425099] [<ffffffff8118ebc1>] writeback_inodes_sb+0x21/0x30 [532203.425104] [<ffffffff811945c9>] __sync_filesystem+0x49/0x90 [532203.425110] [<ffffffff8119462f>] sync_one_sb+0x1f/0x30 [532203.425117] [<ffffffff8116b525>] iterate_supers+0xa5/0x100 [532203.425123] [<ffffffff811946db>] sys_sync+0x2b/0x60 [532203.425131] [<ffffffff8160fb82>] system_call_fastpath+0x16/0x1b [532323.396630] INFO: task bonnie++:28239 blocked for more than 120 seconds. [532323.401489] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [532323.411026] bonnie++ D ffffffff81805120 0 28239 26958 0x00000000 [532323.411033] ffff88020fd85cc8 0000000000000086 ffff88020fd85c68 ffffffff812ed3ee [532323.411038] ffff88020fd85fd8 ffff88020fd85fd8 ffff88020fd85fd8 0000000000012a40 [532323.411042] ffff880212d2dc80 ffff880207400000 ffff88021fd0de40 7fffffffffffffff [532323.411050] Call Trace: [532323.411062] [<ffffffff812ed3ee>] ? radix_tree_lookup_slot+0xe/0x10 [532323.411072] [<ffffffff816058cf>] schedule+0x3f/0x60 [532323.411079] [<ffffffff81605f15>] schedule_timeout+0x2a5/0x320 [532323.411087] [<ffffffff8102c3d3>] ? x2apic_send_IPI_mask+0x13/0x20 [532323.411094] [<ffffffff81026558>] ? native_smp_send_reschedule+0x48/0x60 [532323.411102] [<ffffffff8104a5d2>] ? ttwu_queue+0x92/0xd0 [532323.411109] [<ffffffff81604f7f>] wait_for_common+0xdf/0x180 [532323.411117] [<ffffffff81194610>] ? __sync_filesystem+0x90/0x90 [532323.411123] [<ffffffff810574b0>] ? try_to_wake_up+0x200/0x200 [532323.411129] [<ffffffff81194610>] ? __sync_filesystem+0x90/0x90 [532323.411136] [<ffffffff816050fd>] wait_for_completion+0x1d/0x20 [532323.411142] [<ffffffff8118e969>] writeback_inodes_sb_nr+0x79/0xa0 [532323.411149] [<ffffffff8118ebc1>] writeback_inodes_sb+0x21/0x30 [532323.411155] [<ffffffff811945c9>] __sync_filesystem+0x49/0x90 [532323.411166] [<ffffffff8119462f>] sync_one_sb+0x1f/0x30 [532323.411170] [<ffffffff8116b525>] iterate_supers+0xa5/0x100 [532323.411173] [<ffffffff811946db>] sys_sync+0x2b/0x60 [532323.411177] [<ffffffff8160fb82>] system_call_fastpath+0x16/0x1b [535202.773094] INFO: task kswapd0:31 blocked for more than 120 seconds. [535202.778172] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[535202.788110] kswapd0 D 0000000000000003 0 31 2 0x00000000 [535202.788116] ffff880212f03980 0000000000000046 ffff8800891939b0 000000000000001a [535202.788121] ffff880212f03fd8 ffff880212f03fd8 ffff880212f03fd8 0000000000012a40 [535202.788126] ffff880211a0ae40 ffff88020fbfae40 ffff880212f03970 ffff880208215818 [535202.788136] Call Trace: [535202.788148] [<ffffffff816058cf>] schedule+0x3f/0x60 [535202.788178] [<ffffffffa01c2861>] _xfs_log_force_lsn+0x171/0x2e0 [xfs] [535202.788186] [<ffffffff810574b0>] ? try_to_wake_up+0x200/0x200 [535202.788210] [<ffffffffa01b8945>] ? xfs_iunpin_wait+0x35/0xb0 [xfs] [535202.788233] [<ffffffffa01c29e8>] xfs_log_force_lsn+0x18/0x40 [xfs] [535202.788252] [<ffffffffa01b68a7>] xfs_iunpin_nowait+0x47/0xb0 [xfs] [535202.788261] [<ffffffffa01b8945>] xfs_iunpin_wait+0x35/0xb0 [xfs] [535202.788265] [<ffffffff8104652b>] ? try_wait_for_completion+0x4b/0x60 [535202.788277] [<ffffffffa01e4352>] xfs_reclaim_inode+0xb2/0x2a0 [xfs] [535202.788281] [<ffffffff812ee45c>] ? radix_tree_gang_lookup_tag.part.7+0x6c/0xf0 [535202.788293] [<ffffffffa01e4fec>] xfs_reclaim_inodes_ag+0x20c/0x330 [xfs] [535202.788304] [<ffffffffa01e51b3>] xfs_reclaim_inode_shrink+0xa3/0xc0 [xfs] [535202.788309] [<ffffffff8111b1dd>] shrink_slab+0x11d/0x190 [535202.788312] [<ffffffff8111e42a>] balance_pgdat+0x52a/0x6f0 [535202.788315] [<ffffffff8111e6da>] kswapd+0xea/0x1f0 [535202.788318] [<ffffffff8111e5f0>] ? balance_pgdat+0x6f0/0x6f0 [535202.788321] [<ffffffff81080dac>] kthread+0x8c/0xa0 [535202.788325] [<ffffffff81610ca4>] kernel_thread_helper+0x4/0x10 [535202.788328] [<ffffffff81080d20>] ? flush_kthread_worker+0xa0/0xa0 [535202.788331] [<ffffffff81610ca0>] ? gs_change+0x13/0x13 [535202.788343] INFO: task bonnie++:28609 blocked for more than 120 seconds. [535202.793856] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [535202.805501] bonnie++ D ffffffff81805120 0 28609 28557 0x00000000 [535202.805506] ffff8800a95993d8 0000000000000086 ffff8802129c8000 0000000000000002 [535202.805510] ffff8800a9599fd8 ffff8800a9599fd8 ffff8800a9599fd8 0000000000012a40 [535202.805515] ffffffff81c0b020 ffff8802129c8000 ffff8800a95993c8 000000000000038d [535202.805519] Call Trace: [535202.805527] [<ffffffff816058cf>] schedule+0x3f/0x60 [535202.805552] [<ffffffffa01c40f7>] xlog_cil_force_lsn+0xf7/0x120 [xfs] [535202.805567] [<ffffffff810574b0>] ? try_to_wake_up+0x200/0x200 [535202.805585] [<ffffffffa01b8945>] ? xfs_iunpin_wait+0x35/0xb0 [xfs] [535202.805602] [<ffffffffa01c2747>] _xfs_log_force_lsn+0x57/0x2e0 [xfs] [535202.805609] [<ffffffff8160772e>] ? _raw_spin_lock+0xe/0x20 [535202.805615] [<ffffffff811510e8>] ? add_partial+0x58/0x90 [535202.805632] [<ffffffffa01b8945>] ? xfs_iunpin_wait+0x35/0xb0 [xfs] [535202.805648] [<ffffffffa01c29e8>] xfs_log_force_lsn+0x18/0x40 [xfs] [535202.805665] [<ffffffffa01b68a7>] xfs_iunpin_nowait+0x47/0xb0 [xfs] [535202.805681] [<ffffffffa01b8945>] xfs_iunpin_wait+0x35/0xb0 [xfs] [535202.805687] [<ffffffff8104652b>] ? try_wait_for_completion+0x4b/0x60 [535202.805707] [<ffffffffa01e4352>] xfs_reclaim_inode+0xb2/0x2a0 [xfs] [535202.805713] [<ffffffff812ee45c>] ? radix_tree_gang_lookup_tag.part.7+0x6c/0xf0 [535202.805732] [<ffffffffa01e4fec>] xfs_reclaim_inodes_ag+0x20c/0x330 [xfs] [535202.805751] [<ffffffffa01e51b3>] xfs_reclaim_inode_shrink+0xa3/0xc0 [xfs] [535202.805758] [<ffffffff8111b1dd>] shrink_slab+0x11d/0x190 [535202.805763] [<ffffffff8111d978>] do_try_to_free_pages+0x288/0x2e0 [535202.805782] [<ffffffffa01da59f>] ? 
xfs_buf_get+0x8f/0x1a0 [xfs] [535202.805787] [<ffffffff8111dcd4>] try_to_free_pages+0xb4/0x180 [535202.805793] [<ffffffff81113260>] __alloc_pages_nodemask+0x4e0/0x800 [535202.805799] [<ffffffff81149273>] alloc_pages_current+0xa3/0x110 [535202.805805] [<ffffffff81151f5d>] allocate_slab+0xfd/0x160 [535202.805810] [<ffffffff81151ff0>] new_slab+0x30/0x160 [535202.805816] [<ffffffff815f5299>] __slab_alloc+0x127/0x222 [535202.805822] [<ffffffff8117d447>] ? d_alloc+0x27/0x1d0 [535202.805827] [<ffffffff811534d2>] kmem_cache_alloc+0x112/0x120 [535202.805832] [<ffffffff8117d447>] d_alloc+0x27/0x1d0 [535202.805838] [<ffffffff81172c4d>] d_alloc_and_lookup+0x2d/0x90 [535202.805844] [<ffffffff81180485>] ? d_lookup+0x35/0x60 [535202.805850] [<ffffffff81173946>] __lookup_hash.part.15+0x96/0xb0 [535202.805869] [<ffffffffa01ce173>] ? xfs_trans_free_item_desc+0x33/0x40 [xfs] [535202.805874] [<ffffffff81174230>] lookup_hash+0x80/0xe0 [535202.805878] [<ffffffff8117625b>] do_last+0x26b/0x730 [535202.805883] [<ffffffff811778aa>] path_openat+0xca/0x3f0 [535202.805901] [<ffffffffa01d2630>] ? xfs_free_eofblocks+0x1d0/0x240 [xfs] [535202.805919] [<ffffffffa01d2630>] ? xfs_free_eofblocks+0x1d0/0x240 [xfs] [535202.805925] [<ffffffff81177cd2>] do_filp_open+0x42/0xa0 [535202.805931] [<ffffffff812f4cc1>] ? strncpy_from_user+0x31/0x40 [535202.805936] [<ffffffff8117307a>] ? do_getname+0x10a/0x180 [535202.805942] [<ffffffff8160772e>] ? _raw_spin_lock+0xe/0x20 [535202.805947] [<ffffffff81184f87>] ? alloc_fd+0xf7/0x150 [535202.805953] [<ffffffff8116795d>] do_sys_open+0xed/0x220 [535202.805958] [<ffffffff81169a65>] ? fput+0x25/0x30 [535202.805964] [<ffffffff81167ab0>] sys_open+0x20/0x30 [535202.805970] [<ffffffff8160fb82>] system_call_fastpath+0x16/0x1b [535202.805975] INFO: task bonnie++:28664 blocked for more than 120 seconds. ... snip rest, I have more if you want it ... storage2 -------- This originally had all 24 disks in a software RAID0 array. Two of them (sdr, sdy) started giving read errors, so I built a new 22-disk RAID0 without those two. Then I ran two instances of bonnie++, on /disk/scratch/test and /disk/scratch/testb This machine has locked up in such a way that I cannot ssh to it, it just drops the connection: $ telnet storage2 22 Trying storage2... Connected to 192.168.5.71. Escape character is '^]'. Connection closed by foreign host. At the console I see a number of messages about blocked tasks (kworker, bonnie++, sshd, cron, mv, kswapd0). There is a login prompt, but if I type my username I get no password prompt. ... [520326.168622] INFO: task kswapd0:30 blocked for more than 120 seconds. [520326.177738] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [520326.197088] INFO: task kworder/3:2:4834 blocked for more than 120 seconds. [520326.207151] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Ubuntu 11.10 storage2 tty1 storage2 login: brian << no response past this point >> storage3 -------- This has 24 disks in a software RAID0 array, and I was running two instances of bonnie++ against it. $ cat /proc/mdstat Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] md127 : active raid0 sdn[12] sdu[18] sdx[22] sdl[9] sdy[23] sdd[2] sdt[17] sdi[7] sdj[8] sdr[19] sdg[5] sdw[21] sdf[4] sdv[20] sds[16] sdk[10] sde[3] sdh[6] sdp[14] sdo[13] sdc[1] sdb[0] sdq[15] sdm[11] 70326362112 blocks super 1.2 1024k chunks unused devices: <none> SYMPTOM: Like storage1, I can ssh in but I find no I/O activity going on. 
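(To be concrete about "no I/O activity": I was watching extended iostat output, roughly like the following - exact flags from memory - and md127 and every one of the sd[b-y] member disks sat at essentially zero r/s, w/s and %util, cycle after cycle.)

$ iostat -x 5     # md127 and all of sd[b-y] show ~0 throughput and utilisation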
I also see call tracebacks in dmesg, although this time the first one is kworker not bonnie++ ... [ 38.659687] eth3: no IPv6 routers present [ 41.738989] eth2: no IPv6 routers present [531843.502509] INFO: task kworker/2:1:29 blocked for more than 120 seconds. [531843.502876] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [531843.512298] kworker/2:1 D 0000000000000002 0 29 2 0x00000000 [531843.512304] ffff88020f493b80 0000000000000046 ffff880017213af8 ffff8800c93384c8 [531843.512309] ffff88020f493fd8 ffff88020f493fd8 ffff88020f493fd8 0000000000012a40 [531843.512313] ffff88020d80ae40 ffff88020f498000 0000000000000000 ffff88020e9c11e8 [531843.512318] Call Trace: [531843.512327] [<ffffffff816058cf>] schedule+0x3f/0x60 [531843.512332] [<ffffffff816066c7>] __mutex_lock_slowpath+0xd7/0x150 [531843.512338] [<ffffffff81606282>] mutex_lock+0x22/0x40 [531843.512371] [<ffffffffa020806d>] xfs_reclaim_inodes_ag+0x28d/0x330 [xfs] [531843.512395] [<ffffffffa02081f0>] ? xfs_reclaim_inodes+0x20/0x20 [xfs] [531843.512416] [<ffffffffa02081ed>] xfs_reclaim_inodes+0x1d/0x20 [xfs] [531843.512436] [<ffffffffa0208212>] xfs_reclaim_worker+0x22/0x40 [xfs] [531843.512443] [<ffffffff8107bbba>] process_one_work+0x11a/0x480 [531843.512449] [<ffffffff8107c965>] worker_thread+0x165/0x370 [531843.512455] [<ffffffff8107c800>] ? manage_workers.isra.30+0x130/0x130 [531843.512461] [<ffffffff81080dac>] kthread+0x8c/0xa0 [531843.512468] [<ffffffff81610ca4>] kernel_thread_helper+0x4/0x10 [531843.512475] [<ffffffff81080d20>] ? flush_kthread_worker+0xa0/0xa0 [531843.512481] [<ffffffff81610ca0>] ? gs_change+0x13/0x13 [531843.512485] INFO: task kswapd0:31 blocked for more than 120 seconds. [531843.517282] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [531843.526697] kswapd0 D ffffffff81805120 0 31 2 0x00000000 [531843.526701] ffff880212f018f0 0000000000000046 0000000000000000 0000000000000000 [531843.526706] ffff880212f01fd8 ffff880212f01fd8 ffff880212f01fd8 0000000000012a40 [531843.526710] ffff880212d2dc80 ffff88020f49ae40 ffff880212f018e0 0000000000004978 [531843.526714] Call Trace: [531843.526719] [<ffffffff816058cf>] schedule+0x3f/0x60 [531843.526740] [<ffffffffa01e70f7>] xlog_cil_force_lsn+0xf7/0x120 [xfs] [531843.526744] [<ffffffff810574b0>] ? try_to_wake_up+0x200/0x200 [531843.526764] [<ffffffffa01db945>] ? xfs_iunpin_wait+0x35/0xb0 [xfs] [531843.526783] [<ffffffffa01e5747>] _xfs_log_force_lsn+0x57/0x2e0 [xfs] [531843.526802] [<ffffffffa01db945>] ? xfs_iunpin_wait+0x35/0xb0 [xfs] [531843.526819] [<ffffffffa01e59e8>] xfs_log_force_lsn+0x18/0x40 [xfs] [531843.526837] [<ffffffffa01d98a7>] xfs_iunpin_nowait+0x47/0xb0 [xfs] [531843.526854] [<ffffffffa01db945>] xfs_iunpin_wait+0x35/0xb0 [xfs] [531843.526858] [<ffffffff8104652b>] ? try_wait_for_completion+0x4b/0x60 [531843.526879] [<ffffffffa0207352>] xfs_reclaim_inode+0xb2/0x2a0 [xfs] [531843.526884] [<ffffffff812ee45c>] ? radix_tree_gang_lookup_tag.part.7+0x6c/0xf0 [531843.526904] [<ffffffffa0207fec>] xfs_reclaim_inodes_ag+0x20c/0x330 [xfs] [531843.526910] [<ffffffff8160772e>] ? _raw_spin_lock+0xe/0x20 [531843.526928] [<ffffffffa02081b3>] xfs_reclaim_inode_shrink+0xa3/0xc0 [xfs] [531843.526934] [<ffffffff8111b1dd>] shrink_slab+0x11d/0x190 [531843.526937] [<ffffffff8111e42a>] balance_pgdat+0x52a/0x6f0 [531843.526941] [<ffffffff8111e6da>] kswapd+0xea/0x1f0 [531843.526944] [<ffffffff8111e5f0>] ? 
balance_pgdat+0x6f0/0x6f0 [531843.526948] [<ffffffff81080dac>] kthread+0x8c/0xa0 [531843.526951] [<ffffffff81610ca4>] kernel_thread_helper+0x4/0x10 [531843.526956] [<ffffffff81080d20>] ? flush_kthread_worker+0xa0/0xa0 [531843.526959] [<ffffffff81610ca0>] ? gs_change+0x13/0x13 [531843.526963] INFO: task kworker/3:1:52 blocked for more than 120 seconds. [531843.532003] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [531843.541890] kworker/3:1 D 0000000000000003 0 52 2 0x00000000 [531843.541894] ffff88020f0b9450 0000000000000046 0000000000000000 0000000000000000 [531843.541899] ffff88020f0b9fd8 ffff88020f0b9fd8 ffff88020f0b9fd8 0000000000012a40 [531843.541903] ffff88020d808000 ffff88020f07ae40 ffff88020f0b9440 0000000000004978 [531843.541907] Call Trace: [531843.541912] [<ffffffff816058cf>] schedule+0x3f/0x60 [531843.541931] [<ffffffffa01e70f7>] xlog_cil_force_lsn+0xf7/0x120 [xfs] [531843.541938] [<ffffffff810574b0>] ? try_to_wake_up+0x200/0x200 [531843.541959] [<ffffffffa01db945>] ? xfs_iunpin_wait+0x35/0xb0 [xfs] [531843.541979] [<ffffffffa01e5747>] _xfs_log_force_lsn+0x57/0x2e0 [xfs] [531843.541999] [<ffffffffa01bbefc>] ? xfs_bmapi+0x2ac/0x1150 [xfs] [531843.542019] [<ffffffffa01db945>] ? xfs_iunpin_wait+0x35/0xb0 [xfs] [531843.542039] [<ffffffffa01e59e8>] xfs_log_force_lsn+0x18/0x40 [xfs] [531843.542059] [<ffffffffa01d98a7>] xfs_iunpin_nowait+0x47/0xb0 [xfs] [531843.542078] [<ffffffffa01db945>] xfs_iunpin_wait+0x35/0xb0 [xfs] [531843.542084] [<ffffffff8104652b>] ? try_wait_for_completion+0x4b/0x60 [531843.542107] [<ffffffffa0207352>] xfs_reclaim_inode+0xb2/0x2a0 [xfs] [531843.542114] [<ffffffff812ee45c>] ? radix_tree_gang_lookup_tag.part.7+0x6c/0xf0 [531843.542136] [<ffffffffa0207fec>] xfs_reclaim_inodes_ag+0x20c/0x330 [xfs] [531843.542158] [<ffffffffa02081b3>] xfs_reclaim_inode_shrink+0xa3/0xc0 [xfs] [531843.542166] [<ffffffff8111b1dd>] shrink_slab+0x11d/0x190 [531843.542171] [<ffffffff8111d978>] do_try_to_free_pages+0x288/0x2e0 [531843.542178] [<ffffffff8111dcd4>] try_to_free_pages+0xb4/0x180 [531843.542184] [<ffffffff81113260>] __alloc_pages_nodemask+0x4e0/0x800 [531843.542192] [<ffffffff8160772e>] ? _raw_spin_lock+0xe/0x20 [531843.542199] [<ffffffff81149273>] alloc_pages_current+0xa3/0x110 [531843.542205] [<ffffffff81151f5d>] allocate_slab+0xfd/0x160 [531843.542211] [<ffffffff81151ff0>] new_slab+0x30/0x160 [531843.542218] [<ffffffff815f5299>] __slab_alloc+0x127/0x222 [531843.542225] [<ffffffff8110c545>] ? mempool_alloc_slab+0x15/0x20 [531843.542232] [<ffffffff811534d2>] kmem_cache_alloc+0x112/0x120 [531843.542238] [<ffffffff8110c545>] mempool_alloc_slab+0x15/0x20 [531843.542245] [<ffffffff8110c863>] mempool_alloc+0x53/0x130 [531843.542251] [<ffffffff81081850>] ? add_wait_queue+0x60/0x60 [531843.542257] [<ffffffff8119bfbe>] bio_alloc_bioset+0x3e/0xf0 [531843.542263] [<ffffffff814a9350>] ? slot_store+0x2e0/0x2e0 [531843.542271] [<ffffffff814a51f7>] bio_alloc_mddev+0x27/0x50 [531843.542276] [<ffffffff814a93f8>] submit_flushes+0xa8/0x140[531843.542282] [<ffffffff8107bbba>] process_one_work+0x11a/0x480 [531843.542287] [<ffffffff8107c965>] worker_thread+0x165/0x370 [531843.542293] [<ffffffff8107c800>] ? manage_workers.isra.30+0x130/0x130 [531843.542299] [<ffffffff81080dac>] kthread+0x8c/0xa0 [531843.542305] [<ffffffff81610ca4>] kernel_thread_helper+0x4/0x10 [531843.542312] [<ffffffff81080d20>] ? flush_kthread_worker+0xa0/0xa0 [531843.542318] [<ffffffff81610ca0>] ? gs_change+0x13/0x13 [531843.542325] INFO: task xfsaild/md127:1146 blocked for more than 120 seconds. 
[531843.547840] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [531843.559458] xfsaild/md127 D 0000000000000003 0 1146 2 0x00000000 [531843.559461] ffff8802119bb700 0000000000000046 ffff8802119bb6f0 ffffffff81079c25 [531843.559463] ffff8802119bbfd8 ffff8802119bbfd8 ffff8802119bbfd8 0000000000012a40 [531843.559465] ffff88020f07ae40 ffff8802129d4560 ffff8802119bb6f0 ffff8802129d4560 [531843.559467] Call Trace: [531843.559469] [<ffffffff81079c25>] ? __queue_work+0xe5/0x320 [531843.559471] [<ffffffff816058cf>] schedule+0x3f/0x60 [531843.559473] [<ffffffff814a9606>] md_flush_request+0x86/0x140 [531843.559475] [<ffffffff810574b0>] ? try_to_wake_up+0x200/0x200 [531843.559481] [<ffffffffa0010419>] raid0_make_request+0x119/0x1e0 [raid0] [531843.559484] [<ffffffff812df850>] ? blk_throtl_bio+0x1c0/0x510 [531843.559486] [<ffffffff814a4886>] md_make_request+0xc6/0x200 [531843.559488] [<ffffffff8110c545>] ? mempool_alloc_slab+0x15/0x20 [531843.559490] [<ffffffff812ce48a>] generic_make_request.part.51+0x24a/0x510 [531843.559492] [<ffffffff812ce795>] generic_make_request+0x45/0x60 [531843.559494] [<ffffffff812ce837>] submit_bio+0x87/0x110 [531843.559504] [<ffffffffa01fda82>] _xfs_buf_ioapply+0x162/0x1e0 [xfs] [531843.559514] [<ffffffffa01e1dea>] ? xlog_bdstrat+0x2a/0x60 [xfs] [531843.559525] [<ffffffffa01fdc5e>] xfs_buf_iorequest+0x5e/0x100 [xfs] [531843.559535] [<ffffffffa01e1dea>] xlog_bdstrat+0x2a/0x60 [xfs] [531843.559545] [<ffffffffa01e3249>] xlog_sync+0x1c9/0x3c0 [xfs] [531843.559554] [<ffffffffa01e48a5>] xlog_state_release_iclog+0xa5/0xd0 [xfs] [531843.559563] [<ffffffffa01e4cf7>] xlog_write_copy_finish.isra.12+0x147/0x170 [xfs] [531843.559573] [<ffffffffa01e50f8>] xlog_write+0x268/0x400 [xfs] [531843.559583] [<ffffffffa01f9667>] ? kmem_zone_alloc+0x67/0xe0 [xfs] [531843.559593] [<ffffffffa01e67ec>] xlog_cil_push+0x20c/0x380 [xfs] [531843.559603] [<ffffffffa01e710e>] xlog_cil_force_lsn+0x10e/0x120 [xfs] [531843.559606] [<ffffffff81086bb6>] ? down_trylock+0x36/0x50 [531843.559617] [<ffffffffa01fc881>] ? xfs_buf_rele+0x51/0x130 [xfs] [531843.559627] [<ffffffffa01fcd06>] ? _xfs_buf_find+0x56/0x280 [xfs] [531843.559637] [<ffffffffa01e5478>] _xfs_log_force+0x68/0x2a0 [xfs] [531843.559647] [<ffffffffa01fcef9>] ? _xfs_buf_find+0x249/0x280 [xfs] [531843.559658] [<ffffffffa01d91d9>] ? xfs_iunlock+0xe9/0x150 [xfs] [531843.559668] [<ffffffffa01dec87>] ? xfs_inode_item_pushbuf+0x87/0xe0 [xfs] [531843.559678] [<ffffffffa01e56c8>] xfs_log_force+0x18/0x40 [xfs] [531843.559688] [<ffffffffa01f37b4>] xfsaild_push+0x394/0x3a0 [xfs] [531843.559699] [<ffffffffa01f3822>] xfsaild+0x62/0xc0 [xfs] [531843.559708] [<ffffffffa01f37c0>] ? xfsaild_push+0x3a0/0x3a0 [xfs] [531843.559711] [<ffffffff81080dac>] kthread+0x8c/0xa0 [531843.559714] [<ffffffff81610ca4>] kernel_thread_helper+0x4/0x10 [531843.559717] [<ffffffff81080d20>] ? flush_kthread_worker+0xa0/0xa0 [531843.559720] [<ffffffff81610ca0>] ? gs_change+0x13/0x13 [531843.559725] INFO: task bonnie++:15912 blocked for more than 120 seconds. ... etc Additional testing ------------------ All three machines are still stuck in this state, and I can leave them like this for the next few days if necessary if there are any tests or anything that you can suggest I do to the systems while they're like this. The only thing I could think of doing was a 'dd' to see if the underlying drives are working: root 23604 0.0 0.0 11064 1688 pts/4 D+ 19:04 0:00 dd if=/dev/md127 of=/dev/null bs=1024k count=1024 That process hung. 
So I tried again with smaller ones:

strace dd if=/dev/md127 of=/dev/null bs=1024k count=1   #ok
strace dd if=/dev/md127 of=/dev/null bs=1024k count=2   #ok
strace dd if=/dev/md127 of=/dev/null bs=1024k count=3   #ok
strace dd if=/dev/md127 of=/dev/null bs=1024k count=4   #ok
strace dd if=/dev/md127 of=/dev/null bs=1024k count=5   #hung

The strace of the last process was like this:

open("/dev/md127", O_RDONLY) = 3
dup2(3, 0) = 0
close(3) = 0
lseek(0, 0, SEEK_CUR) = 0
open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
dup2(3, 1) = 1
close(3) = 0
mmap(NULL, 1060864, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6d5a1c7000
read(0, "XFSB\0\0\20\0\0\0\0\4\27\361\310\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "XFSB\0\0\20\0\0\0\0\4\27\361\310\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "INA\355\2\1\0\0\0\0\0\0\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\0\0\0\0\4"..., 1048576) = 1048576
write(1, "INA\355\2\1\0\0\0\0\0\0\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\0\0\0\0\4"..., 1048576) = 1048576
read(0, "IN\201\244\2\2\0\0\0\0\3\350\0\0\3\350\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\16"..., 1048576) = 1048576
write(1, "IN\201\244\2\2\0\0\0\0\3\350\0\0\3\350\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\16"..., 1048576) = 1048576
read(0, "p1\17\343>\203)\321\272\1\357\0\0\0\272\252V\206\377~\300\203\376\351\3\366\35)Zh[\25"..., 1048576) = 1048576
write(1, "p1\17\343>\203)\321\272\1\357\0\0\0\272\252V\206\377~\300\203\376\351\3\366\35)Zh[\25"..., 1048576) = 1048576
read(0,

That is, it read four 1MB chunks OK and then hung reading the fifth.

This looks suspiciously like a drive locking up, but there are no dmesg errors for sd[a-z] to suggest disk problems, whereas when I had a genuinely failed disk on storage2 I got messages like this (captured previously):

[ 1217.995083] sd 5:0:0:0: [sdr] Unhandled sense code
[ 1217.995085] sd 5:0:0:0: [sdr] Result: hostbyte=invalid driverbyte=DRIVER_SENSE
[ 1217.995089] sd 5:0:0:0: [sdr] Sense Key : Medium Error [current]
[ 1217.995095] sd 5:0:0:0: [sdr] Add. Sense: Unrecovered read error
[ 1217.995099] sd 5:0:0:0: [sdr] CDB: Read(10): 28 00 30 00 05 80 00 00 10 00
[ 1217.995108] end_request: critical target error, dev sdr, sector 805307776

Furthermore, the following command, which makes a small transfer to every disk, runs fine:

# for i in sd{b..y}; do echo $i; dd if=/dev/$i of=/dev/null count=1; done

But this one gets stuck at sdk:

# for i in sd{b..y}; do echo $i; dd if=/dev/$i of=/dev/null bs=1024k count=1; done

Final point: the dd and bonnie++ processes can't be killed, even with -9.

So these machines are very sick, and in roughly the same way. The fact that I can get plain dd reads to lock up implies it's *not* anything to do with XFS, so my apologies if this is the wrong forum. Fortunately the problem appears fairly easy to reproduce.

The machines are identical hardware - same chassis, PSU, motherboard and BIOS version - and they run the same OS. I had already flashed the newest LSI firmware into the HBAs:

$ sudo ./sas2flash -listall
LSI Corporation SAS2 Flash Utility
Version 13.00.00.00 (2012.02.17)
Copyright (c) 2008-2012 LSI Corporation. All rights reserved

        Adapter Selected is a LSI SAS: SAS2116_1(B1)

Num   Ctlr            FW Ver        NVDATA        x86-BIOS     PCI Addr
----------------------------------------------------------------------------

0     SAS2116_1(B1)   13.00.01.00   0c.00.00.01   07.25.00.00  00:02:00:00
1     SAS2008(B2)     13.00.01.00   0c.00.00.05   07.25.00.00  00:03:00:00

Finished Processing Commands Successfully.
Exiting SAS2Flash.
(Maybe "newest" wasn't a good idea - it was released Feb 17 2012) I could try ubuntu 12.04 with its newer kernel - I'd be surprised if the LSI driver in 11.10 was as broken as this appears to be, but it's not impossible. LSI themselves only seem to provide driver packages for RH4/SLES9 and RH5/SLES10 at http://www.lsi.com/channel/support/products/Pages/LSISAS9201-16i.aspx (and I guess trying CentOS 5 is another option) Any other suggestions (and of course interpretation of the kernel call tracebacks) would be much appreciated. Thanks, Brian. _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs