NeilBrown <neilb@xxxxxxx> writes:
> On Tue, 05 Mar 2013 09:44:54 +0100 Jes Sorensen <Jes.Sorensen@xxxxxxxxxx>
> wrote:
>> > Does this fix it?
>> >
>> > NeilBrown
>>
>> Unfortunately no, I still see these crashes with this one applied :(
>>
>
> Thanks - the symptom looked similar, but now that I look more closely I can
> see it is quite different.
>
> How about this then? I can't really see what is happening, but based on the
> patch that you identified it must be related to these flags.
> It seems that handle_stripe_clean_event() is being called too early, and it
> doesn't clear out the ->written bios because they are still locked or
> something. But it does clear R5_Discard on the parity block, so
> handle_stripe_clean_event doesn't get called again.
>
> This makes the handling of the various flags somewhat more uniform, which is
> probably a good thing.

Hi Neil,

With this one applied I end up with an OOPS instead. Note I had to modify
the last test/clear bit sequence to use &sh->dev[i].flags instead of
&dev->flags to avoid a compiler warning.

I am attaching the test script I am running too. It was written by Eryu
Guan.

Cheers,
Jes

[ 2623.554780] kernel BUG at drivers/md/raid5.c:2954!
[ 2623.560126] invalid opcode: 0000 [#1] SMP
[ 2623.564722] Modules linked in: raid456 async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx nls_utf8 lockd sunrpc bnep bluetooth rfkill sg dm_mirror dm_region_hash dm_log dm_mod raid1 coretemp kvm_intel kvm crc32c_intel iTCO_wdt ghash_clmulni_intel e1000e iTCO_vendor_support lpc_ich microcode mfd_core i2c_i801 video pcspkr uinput xfs mgag200 i2c_algo_bit drm_kms_helper ttm drm mpt2sas i2c_core raid_class scsi_transport_sas usb_storage [last unloaded: raid456]
[ 2623.612586] CPU 3
[ 2623.614639] Pid: 20177, comm: md42_raid5 Not tainted 3.7.0-rc1+ #17 Intel Corporation S1200BTL/S1200BTL
[ 2623.625329] RIP: 0010:[<ffffffffa0438dd7>] [<ffffffffa0438dd7>] handle_stripe+0x2297/0x2320 [raid456]
[ 2623.635732] RSP: 0018:ffff8801dd70db68 EFLAGS: 00010246
[ 2623.641660] RAX: ffff8801fc62cf18 RBX: ffff8801fc62cbf8 RCX: 0000000000000001
[ 2623.649623] RDX: 0000000000000000 RSI: 0000000000008d88 RDI: ffff8801edb63e00
[ 2623.657585] RBP: ffff8801dd70dcb8 R08: 0000000000000000 R09: ffff8801fc62cb10
[ 2623.665547] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
[ 2623.673509] R13: ffff8801fc62cbf8 R14: 0000000000000000 R15: 0000000000000001
[ 2623.681472] FS: 0000000000000000(0000) GS:ffff880236860000(0000) knlGS:0000000000000000
[ 2623.690503] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2623.696915] CR2: 00007fb484fcc950 CR3: 00000000018fd000 CR4: 00000000001407e0
[ 2623.704878] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2623.712841] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 2623.720804] Process md42_raid5 (pid: 20177, threadinfo ffff8801dd70c000, task ffff88022fadcbf0)
[ 2623.730512] Stack:
[ 2623.732757]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 2623.741067]  ffff880232900400 00000001002386d6 ffff880236874100 0000000000000003
[ 2623.749376]  ffff8801dd70dcb8 ffff8801fc62cc38 ffff8801edb63f78 ffff8801edb63f60
[ 2623.757686] Call Trace:
[ 2623.760419]  [<ffffffffa0439c1e>] handle_active_stripes+0x18e/0x2a0 [raid456]
[ 2623.768387]  [<ffffffffa043a79b>] raid5d+0x43b/0x5a0 [raid456]
[ 2623.774902]  [<ffffffff814a6acd>] md_thread+0x10d/0x140
[ 2623.780736]  [<ffffffff81084210>] ? wake_up_bit+0x40/0x40
[ 2623.786764]  [<ffffffff814a69c0>] ? md_rdev_init+0x140/0x140
[ 2623.793081]  [<ffffffff81083810>] kthread+0xc0/0xd0
[ 2623.798529]  [<ffffffff81083750>] ? kthread_create_on_node+0x120/0x120
[ 2623.805815]  [<ffffffff8161e6ac>] ret_from_fork+0x7c/0xb0
[ 2623.811842]  [<ffffffff81083750>] ? kthread_create_on_node+0x120/0x120
[ 2623.819126] Code: 83 be a4 00 00 00 00 74 0e e8 a6 39 07 e1 e9 21 de ff ff 0f 0b 0f 0b e8 58 ad ff ff 0f 1f 84 00 00 00 00 00 e9 0b de ff ff 0f 0b <0f> 0b 8b 43 58 44 8b 43 48 48 c7 c6 88 e1 43 a0 44 0f bf 4b 38
[ 2623.841056] RIP [<ffffffffa0438dd7>] handle_stripe+0x2297/0x2320 [raid456]
[ 2623.848840] RSP <ffff8801dd70db68>

Attachment: md-2.sh
Description: Bourne shell script