md hang on updating bitmap

I've been failure testing initiator-side mirroring, and occasionally, if I hit things just right during a failure scenario, I can get md arrays stuck indefinitely. This seems to keep the SCSI controller (the target as seen on the initiator) from disconnecting cleanly, even though it is trying furiously and failing IO. I'm on the latest 3.10 kernel, using the srp_backport driver with the fast_io_fail and dev_loss_tmo updates. The SRP devs tell me the target seems to be failing IO and aborting as it should, but about 2 times out of 10 the md threads end up stuck in D state forever. I'm just throwing this out there in case anyone has suggestions or knows what to try. The trace from the stuck md3_raid1 thread is below.

[69923.701603] md3_raid1 D ffffffff818089a0 5248 8760 2 0x00000080
[69923.709434]  ffff88020d511b58 0000000000000046 ffff880213f59020 0000000000013d80
[69923.717381]  ffff88020d511fd8 ffff88020d510010 0000000000013d80 0000000000013d80
[69923.725338]  ffff88020d511fd8 0000000000013d80 ffff880213f59020 ffff8802148458b0
[69923.733352] Call Trace:
[69923.741291]  [<ffffffff81761834>] schedule+0x24/0x70
[69923.749303]  [<ffffffff815e6745>] md_super_wait+0x55/0x90
[69923.757294]  [<ffffffff81092eb0>] ? wake_up_bit+0x40/0x40
[69923.765226]  [<ffffffff815f3b92>] write_page+0x1b2/0x370
[69923.773100]  [<ffffffff815f38c9>] bitmap_update_sb+0x119/0x120
[69923.780994]  [<ffffffff815eca85>] md_update_sb+0x245/0x650
[69923.788890]  [<ffffffff815f1d8a>] md_check_recovery+0x24a/0x4c0
[69923.796793]  [<ffffffffa02f06a2>] raid1d+0x32/0xf10 [raid1]
[69923.804729]  [<ffffffff8107c226>] ? try_to_del_timer_sync+0x56/0x70
[69923.812717]  [<ffffffff8107c29a>] ? del_timer_sync+0x5a/0x70
[69923.820565]  [<ffffffff8175f785>] ? schedule_timeout+0x135/0x210
[69923.828327]  [<ffffffff81044293>] ? default_spin_lock_flags+0x13/0x20
[69923.836134]  [<ffffffff81044293>] ? default_spin_lock_flags+0x13/0x20
[69923.843788]  [<ffffffff815ea12f>] md_thread+0x11f/0x170
[69923.851288]  [<ffffffff81092eb0>] ? wake_up_bit+0x40/0x40
[69923.858767]  [<ffffffff815ea010>] ? md_rdev_init+0x110/0x110
[69923.866238]  [<ffffffff81092806>] kthread+0xc6/0xd0
[69923.873689]  [<ffffffff81092740>] ? kthread_freezable_should_stop+0x60/0x60
[69923.881236]  [<ffffffff8176b7fc>] ret_from_fork+0x7c/0xb0
[69923.888759]  [<ffffffff81092740>] ? kthread_freezable_should_stop+0x60/0x60
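
For anyone trying to reproduce this, here is a rough sketch of one way to spot tasks sitting in D state without waiting for a hung-task dump. It is not part of my test setup. The function names `parse_stat` and `d_state_tasks` are made up for illustration, and it assumes a Linux `/proc` where the third field of `/proc/<pid>/stat` is the task state.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lparen = stat_line.index("(")
    rparen = stat_line.rindex(")")
    pid = int(stat_line[:lparen].strip())
    comm = stat_line[lparen + 1:rparen]
    state = stat_line[rparen + 1:].split()[0]
    return pid, comm, state


def d_state_tasks(proc_root="/proc"):
    """List (pid, comm) for every process currently in uninterruptible sleep."""
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # task exited while we were scanning, or unparsable
        if state == "D":
            stuck.append((pid, comm))
    return stuck


if __name__ == "__main__":
    for pid, comm in d_state_tasks():
        print(pid, comm)
```

Running it in a loop during the failure test would show whether a thread such as md3_raid1 stays in D state across samples, which distinguishes a real hang from a task that is only briefly blocked on IO.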



