Re: iscsi-target OOPS

Hi Matyas,

Apologies for the delayed response.  Comments are inline below.

On Wed, 2014-07-16 at 16:18 +0200, Matyas Koszik wrote:
> Hi,
> 
> I'm running a redundant storage server pair with DRBD and LIO. On the
> secondary node I had some RAID card resets, which triggered DRBD to
> disconnect the node - but before that, requests were held for several
> seconds in the DRBD layer, which seems to be a problem for LIO (or maybe
> the subsequent initiator reconnects - I'm not sure).
> 
> My efforts to reproduce the bug in a test environment have failed so far,
> but in production it reproduces reliably (the RAID card in the secondary
> node stalled 3 times and the primary node crashed every time). I tried
> upgrading the kernel between crashes 1 and 2, but that didn't help.
> 

There have been a number of bugfixes in this area that have gone into the
>= v3.10.y stable kernels but, due to a number of issues in the older
code, have not yet been backported to v3.2.y.

At this point, it's going to be fairly involved to get these backported
to v3.2.y, so I'd strongly recommend using v3.10.50, v3.12.25, v3.14.14
or v3.15.7 code for your production setup.
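
If it helps to check a node against the versions above, here is a rough
Python sketch. The function names and the RECOMMENDED table are made up
for illustration; the minimums are just the four releases listed above.
Note that Debian's `uname -r` reports an ABI name such as 3.2.0-4-amd64
rather than the upstream point release (3.2.57 in your second trace), so
pass in the upstream version string.

```python
import re

# Minimum point release per stable series, copied from the list above.
# (Hypothetical table name; not part of any kernel tooling.)
RECOMMENDED = {(3, 10): 50, (3, 12): 25, (3, 14): 14, (3, 15): 7}


def parse_release(release):
    """Return (major, minor, patch) from a string like '3.10.50'."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    major, minor, patch = m.groups()
    return int(major), int(minor), int(patch or 0)


def meets_recommendation(release):
    """True/False for series this mail covers, None when it says nothing."""
    major, minor, patch = parse_release(release)
    if (major, minor) < (3, 10):
        return False  # older than v3.10.y: fixes not backported
    minimum = RECOMMENDED.get((major, minor))
    if minimum is None:
        return None  # series not listed above, so no claim is made
    return patch >= minimum
```

For example, meets_recommendation("3.2.57") is False and
meets_recommendation("3.12.25") is True.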

That doesn't mean these won't ever be addressed for v3.2.y code, but
it's certainly a lower priority at the moment.
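
One note for reading the quoted traces below: the dead000000100100 and
dead000000200200 values in the register dumps are the x86_64
LIST_POISON1 and LIST_POISON2 constants that list_del() writes into an
entry's next and prev pointers, and dead0000000ffee0 is LIST_POISON1
minus 0x220. That would be consistent with the code walking a list entry
that had already been removed, though that reading is an assumption and
has not been checked against the source here. A small sketch to spot
such values (classify() and max_offset are hypothetical names, nothing
from the kernel tree):

```python
# x86_64 list poison values: include/linux/poison.h with
# CONFIG_ILLEGAL_POINTER_VALUE = 0xdead000000000000.
LIST_POISON1 = 0xdead000000100100  # stored in ->next by list_del()
LIST_POISON2 = 0xdead000000200200  # stored in ->prev by list_del()


def classify(value, max_offset=0x1000):
    """Describe a register value relative to the list poison constants.

    Returns the constant's name, 'NAME - 0xOFFSET' when the value sits a
    small distance below it (as a container_of() result would), or None.
    """
    for name, poison in (("LIST_POISON1", LIST_POISON1),
                         ("LIST_POISON2", LIST_POISON2)):
        delta = poison - value
        if delta == 0:
            return name
        if 0 < delta <= max_offset:
            return "%s - 0x%x" % (name, delta)
    return None
```

Feeding it R12 from the first trace, classify(0xdead0000000ffee0), gives
"LIST_POISON1 - 0x220"; ordinary pointers such as ffff88030b89f000 give
None.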

Thanks,

--nab

> Regards,
> 
> Matyas
> 
> 
> 1st crash:
> 
> emerg kernel: [6995187.657152] general protection fault: 0000 [#3] SMP
> warning kernel: [6995187.657292] CPU 3
> warning kernel: [6995187.657333] Modules linked in: tcm_loop tcm_fc iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod nfnetlink_log nfnetlink dm_snapshot usb_storage parport_pc ppdev lp parport drbd lru_cache libfc scsi_transport_fc scsi_tgt configfs bnep rfcomm bluetooth rfkill uinput nfsd nfs nfs_acl auth_rpcgss fscache lockd sunrpc nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_tcpudp iptable_filter ip_tables x_tables loop snd_pcm snd_page_alloc snd_timer acpi_cpufreq iTCO_wdt iTCO_vendor_support joydev mperf i7core_edac processor snd soundcore psmouse i2c_i801 coretemp serio_raw ioatdma edac_core i2c_core pcspkr evdev thermal_sys button crc32c_intel ext4 crc16 jbd2 mbcache dm_mod raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 md_mod ses sg enclosure sd_mod crc_t10dif usbhid hid uhci_hcd megaraid_sas ahci libahci libata ehci_hcd scsi_mod usbcore usb_
> info kernel: common ixgbe mdio igb dca [last unloaded: target_core_mod]
> warning kernel: [6995187.661785]
> warning kernel: [6995187.661843] Pid: 19644, comm: iscsi_trx Tainted: G    B D      3.2.0-4-amd64 #1 Debian 3.2.51-1 Supermicro X8DT3/X8DT3
> warning kernel: [6995187.662081] RIP: 0010:[<ffffffffa04b69f1>]  [<ffffffffa04b69f1>] iscsit_find_cmd_from_ttt+0x32/0x8a [iscsi_target_mod]
> warning kernel: [6995187.662221] RSP: 0018:ffff880142d65de0  EFLAGS: 00010293
> warning kernel: [6995187.662287] RAX: ffff88030b89f368 RBX: ffff88030b89f000 RCX: 00000000207c4cea
> warning kernel: [6995187.662373] RDX: dead000000100100 RSI: 0000000000068e85 RDI: ffff88030b89f2e8
> warning kernel: [6995187.662459] RBP: 0000000000068e85 R08: 00000001686fd3a0 R09: ffff88032ff66380
> warning kernel: [6995187.662546] R10: ffff88032ff66380 R11: 0000000000013780 R12: dead0000000ffee0
> warning kernel: [6995187.662632] R13: ffff88030b89f2e8 R14: 0000000000000000 R15: 0000000000000000
> warning kernel: [6995187.662718] FS:  0000000000000000(0000) GS:ffff88033fc60000(0000) knlGS:0000000000000000
> warning kernel: [6995187.662807] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> warning kernel: [6995187.662873] CR2: 00007fddc3187130 CR3: 0000000001605000 CR4: 00000000000006e0
> warning kernel: [6995187.662960] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> warning kernel: [6995187.663045] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> warning kernel: [6995187.663131] Process iscsi_trx (pid: 19644, threadinfo ffff880142d64000, task ffff88032e1d0970)
> emerg kernel: [6995187.663220] Stack:
> warning kernel: [6995187.663278]  ffff88030b89f000 ffff88030b89f000 0000000000000000 ffff880142d65e88
> warning kernel: [6995187.663509]  0000000000000000 ffffffffa04bb1cf 0000000000000000 ffff88033fcb3780
> warning kernel: [6995187.663740]  ffff88032e1d0d30 0000000200000000 00000000814052a0 ffff880310e6b280
> emerg kernel: [6995187.663971] Call Trace:
> warning kernel: [6995187.664037]  [<ffffffffa04bb1cf>] ? iscsi_target_rx_thread+0x118d/0x1943 [iscsi_target_mod]
> warning kernel: [6995187.664131]  [<ffffffffa04ba042>] ? iscsit_thread_get_cpumask+0x88/0x88 [iscsi_target_mod]
> warning kernel: [6995187.664224]  [<ffffffff8105f631>] ? kthread+0x76/0x7e
> warning kernel: [6995187.664292]  [<ffffffff81356374>] ? kernel_thread_helper+0x4/0x10
> warning kernel: [6995187.664361]  [<ffffffff8105f5bb>] ? kthread_worker_fn+0x139/0x139
> warning kernel: [6995187.664430]  [<ffffffff81356370>] ? gs_change+0x13/0x13
> emerg kernel: [6995187.664495] Code: 00 00 41 54 55 89 f5 53 53 48 89 fb 4c 89 ef e8 93 89 e9 e0 4c 8b a3 68 03 00 00 48 8d 83 68 03 00 00 49 81 ec 20 02 00 00 eb 20 <41> 39 6c 24 24 75 0a 4c 89 ef e8 29 89 e9 e0 eb 3b 4d 8b a4 24
> alert kernel: [6995187.666958] RIP  [<ffffffffa04b69f1>] iscsit_find_cmd_from_ttt+0x32/0x8a [iscsi_target_mod]
> warning kernel: [6995187.667088]  RSP <ffff880142d65de0>
> warning kernel: [6995187.667217] ---[ end trace b9be73a1398857b9 ]---
> emerg kernel: [6995187.667282] Kernel panic - not syncing: Fatal exception in interrupt
> 
> 2nd crash:
> 
> warning kernel: [568490.274380] block drbd0: Remote failed to finish a request within ko-count * timeout
> info kernel: [568490.274390] block drbd0: peer( Secondary -> Unknown ) conn( Connected -> Timeout ) pdsk( UpToDate -> DUnknown )
> info kernel: [568490.274539] block drbd0: new current UUID BC772B18F6FE7FB7:9E9A652A9DC0D857:7116C7324E030315:7115C7324E030315
> info kernel: [568490.274842] block drbd0: asender terminated
> info kernel: [568490.274849] block drbd0: Terminating drbd0_asender
> info kernel: [568490.275598] block drbd0: Connection closed
> info kernel: [568490.275605] block drbd0: conn( Timeout -> Unconnected )
> info kernel: [568490.275610] block drbd0: receiver terminated
> info kernel: [568490.275613] block drbd0: Restarting drbd0_receiver
> info kernel: [568490.275616] block drbd0: receiver (re)started
> info kernel: [568490.275621] block drbd0: conn( Unconnected -> WFConnection )
> emerg kernel: [568492.473906] general protection fault: 0000 [#1] SMP
> warning kernel: [568492.474040] CPU 0
> warning kernel: [568492.474085] Modules linked in: tcm_loop tcm_fc iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod netconsole drbd lru_cache libfc scsi_transport_fc scsi_tgt configfs bnep rfcomm parport_pc bluetooth ppdev rfkill lp parport uinput nfsd nfs nfs_acl auth_rpcgss fscache lockd sunrpc nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_tcpudp iptable_filter ip_tables x_tables loop snd_pcm snd_page_alloc snd_timer ioatdma iTCO_wdt psmouse i7core_edac edac_core snd soundcore pcspkr iTCO_vendor_support coretemp joydev evdev serio_raw acpi_cpufreq mperf processor i2c_i801 thermal_sys crc32c_intel button ext4 crc16 jbd2 mbcache dm_mod raid1 md_mod sg ses sd_mod enclosure crc_t10dif usbhid hid usb_storage uhci_hcd ahci libahci libata megaraid_sas ehci_hcd usbcore scsi_mod usb_common ixgbe igb mdio i2c_algo_bit i2c_core dca [last unloaded: target_core_mod]
> warning kernel: [568492.478364]
> warning kernel: [568492.478426] Pid: 5056, comm: iscsi_trx Not tainted 3.2.0-4-amd64 #1 Debian 3.2.57-3+deb7u2 Supermicro X8DT3/X8DT3
> warning kernel: [568492.478710] RIP: 0010:[<ffffffffa054ce4d>]  [<ffffffffa054ce4d>] iscsit_close_connection+0x5b/0x4bb [iscsi_target_mod]
> warning kernel: [568492.478858] RSP: 0018:ffff8803149ebdb0  EFLAGS: 00010202
> warning kernel: [568492.478927] RAX: dead000000100100 RBX: ffff880313c4f000 RCX: 0002050a77f690fd
> warning kernel: [568492.479017] RDX: 0000000000000001 RSI: ffff880314a3da98 RDI: ffffffffa054ce5f
> warning kernel: [568492.479107] RBP: ffff88032f14a000 R08: ffff8803149ea000 R09: ffffffff81600000
> warning kernel: [568492.479197] R10: ffffffff81600000 R11: ffffffff81600000 R12: ffff880313c4f2e8
> warning kernel: [568492.479287] R13: dead0000000ffee0 R14: ffff880313c4f208 R15: 0000000000000000
> warning kernel: [568492.479377] FS:  0000000000000000(0000) GS:ffff88033fc00000(0000) knlGS:0000000000000000
> warning kernel: [568492.479469] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> warning kernel: [568492.479539] CR2: 00007f375d7d03e0 CR3: 0000000001605000 CR4: 00000000000006f0
> warning kernel: [568492.479629] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> warning kernel: [568492.479718] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> warning kernel: [568492.479808] Process iscsi_trx (pid: 5056, threadinfo ffff8803149ea000, task ffff88031247d040)
> emerg kernel: [568492.479901] Stack:
> warning kernel: [568492.479961]  0000000000000030 0000000000000001 ffff880313c4f368 ffffffff8134fd9b
> warning kernel: [568492.480204]  0000000000000246 ffff880313c4f000 ffff88032f592580 ffff8803149ebe88
> warning kernel: [568492.480447]  0000000000000000 ffff880313c4f2e8 0000000000000000 ffffffffa054c948
> emerg kernel: [568492.480689] Call Trace:
> warning kernel: [568492.480757]  [<ffffffff8134fd9b>] ? _raw_spin_lock_bh+0xe/0x1c
> warning kernel: [568492.480836]  [<ffffffffa054c948>] ? iscsi_target_rx_thread+0x18e4/0x1930 [iscsi_target_mod]
> warning kernel: [568492.480936]  [<ffffffffa054b064>] ? iscsit_thread_get_cpumask+0x88/0x88 [iscsi_target_mod]
> warning kernel: [568492.481033]  [<ffffffff8105f6d9>] ? kthread+0x76/0x7e
> warning kernel: [568492.481105]  [<ffffffff81356d74>] ? kernel_thread_helper+0x4/0x10
> warning kernel: [568492.481179]  [<ffffffff8105f663>] ? kthread_worker_fn+0x139/0x139
> warning kernel: [568492.481251]  [<ffffffff81356d70>] ? gs_change+0x13/0x13
> emerg kernel: [568492.481319] Code: 89 df e8 4b 21 ff ff 4c 89 e7 e8 5c 2f e0 e0 4c 8b ab 68 03 00 00 48 8d 83 68 03 00 00 48 89 44 24 10 49 81 ed 20 02 00 00 eb 20 <41> 83 bd a0 00 00 00 01 75 08 4c 89 ef e8 1f 59 ff ff 4d 8b ad
> alert kernel: [568492.484018] RIP  [<ffffffffa054ce4d>] iscsit_close_connection+0x5b/0x4bb [iscsi_target_mod]
> warning kernel: [568492.484156]  RSP <ffff8803149ebdb0>
> warning kernel: [568492.484262] ---[ end trace 46ecbbe37bed4a2e ]---
> emerg kernel: [568492.484331] Kernel panic - not syncing: Fatal exception in interrupt
> warning kernel: [568492.484407] Pid: 5056, comm: iscsi_trx Tainted: G      D      3.2.0-4-amd64 #1 Debian 3.2.57-3+deb7u2
> warning kernel: [568492.484504] Call Trace:
> warning kernel: [568492.484570]  [<ffffffff813492d8>] ? panic+0x95/0x1a2
> 
> 
> 3rd crash:
> 
> warning kernel: [1022062.839595] block drbd0: Remote failed to finish a request within ko-count * timeout
> info kernel: [1022062.839604] block drbd0: peer( Secondary -> Unknown ) conn( Connected -> Timeout ) pdsk( UpToDate -> DUnknown )
> info kernel: [1022062.839697] block drbd0: new current UUID 9E9A652A9DC0D857:7115C7324E030315:8355E6AB18AD03EE:8354E6AB18AD03EF
> info kernel: [1022062.839957] block drbd0: asender terminated
> info kernel: [1022062.839962] block drbd0: Terminating drbd0_asender
> info kernel: [1022062.840629] block drbd0: Connection closed
> info kernel: [1022062.840636] block drbd0: conn( Timeout -> Unconnected )
> info kernel: [1022062.840641] block drbd0: receiver terminated
> info kernel: [1022062.840644] block drbd0: Restarting drbd0_receiver
> info kernel: [1022062.840647] block drbd0: receiver (re)started
> info kernel: [1022062.840651] block drbd0: conn( Unconnected -> WFConnection )
> emerg kernel: [1022062.840840] general protection fault: 0000 [#1] SMP
> warning kernel: [1022062.840986] CPU 5
> warning kernel: [1022062.841028] Modules linked in: tcm_loop tcm_fc iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod netconsole parport_pc ppdev bnep rfcomm bluetooth rfkill lp parport drbd lru_cache libfc scsi_transport_fc scsi_tgt configfs uinput nfsd nfs nfs_acl auth_rpcgss fscache lockd sunrpc nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_tcpudp iptable_filter ip_tables x_tables loop snd_pcm snd_page_alloc snd_timer ioatdma acpi_cpufreq psmouse i7core_edac iTCO_wdt serio_raw mperf edac_core iTCO_vendor_support coretemp joydev evdev snd soundcore pcspkr processor i2c_i801 thermal_sys crc32c_intel button ext4 crc16 jbd2 mbcache dm_mod raid1 md_mod sg sd_mod ses enclosure crc_t10dif usbhid hid usb_storage uhci_hcd megaraid_sas ahci libahci libata scsi_mod ehci_hcd usbcore usb_common ixgbe mdio igb i2c_algo_bit i2c_core dca [last unloaded: target_core_mod]
> warning kernel: [1022062.845110]
> warning kernel: [1022062.845169] Pid: 5165, comm: iscsi_trx Not tainted 3.2.0-4-amd64 #1 Debian 3.2.57-3+deb7u2 Supermicro X8DT3/X8DT3
> warning kernel: [1022062.845373] RIP: 0010:[<ffffffffa0442696>]  [<ffffffffa0442696>] core_tmr_release_req+0x2c/0x66 [target_core_mod]
> warning kernel: [1022062.845514] RSP: 0018:ffff880314badd80  EFLAGS: 00010046
> warning kernel: [1022062.845581] RAX: 0000000000000286 RBX: ffff8802ce58d7c0 RCX: dead000000100100
> warning kernel: [1022062.845668] RDX: dead000000200200 RSI: 0000000000000286 RDI: ffff880329dc45d4
> warning kernel: [1022062.845754] RBP: ffff880329dc45d4 R08: ffff880314badd60 R09: ffff8803155fc040
> warning kernel: [1022062.845840] R10: ffff8803155fc040 R11: ffff8803155fc040 R12: ffff88032f365ae8
> warning kernel: [1022062.845927] R13: ffff8803121d0ac0 R14: ffff88032f365a08 R15: ffff8803121d0ac0
> warning kernel: [1022062.846014] FS:  0000000000000000(0000) GS:ffff88033fca0000(0000) knlGS:0000000000000000
> warning kernel: [1022062.846103] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> warning kernel: [1022062.846170] CR2: 00007f73283f89de CR3: 0000000001605000 CR4: 00000000000006e0
> warning kernel: [1022062.846257] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> warning kernel: [1022062.846344] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> warning kernel: [1022062.846430] Process iscsi_trx (pid: 5165, threadinfo ffff880314bac000, task ffff88032e951780)
> emerg kernel: [1022062.846519] Stack:
> warning kernel: [1022062.846577]  0000000000000282 ffff8803156517c0 ffff8803151ea800 ffffffffa0445b6e
> warning kernel: [1022062.846813]  ffff88032f365800 ffffffffa053af12 0000000000000030 0000000000000001
> warning kernel: [1022062.847046]  ffff88032f365b68 ffff8803151ea800 0000000000000246 ffff88032f365800
> emerg kernel: [1022062.847280] Call Trace:
> warning kernel: [1022062.847349]  [<ffffffffa0445b6e>] ? transport_release_cmd+0x21/0x71 [target_core_mod]
> warning kernel: [1022062.847442]  [<ffffffffa053af12>] ? iscsit_close_connection+0x120/0x4bb [iscsi_target_mod]
> warning kernel: [1022062.847536]  [<ffffffffa053a948>] ? iscsi_target_rx_thread+0x18e4/0x1930 [iscsi_target_mod]
> warning kernel: [1022062.847630]  [<ffffffffa0539064>] ? iscsit_thread_get_cpumask+0x88/0x88 [iscsi_target_mod]
> warning kernel: [1022062.847723]  [<ffffffff8105f6d9>] ? kthread+0x76/0x7e
> warning kernel: [1022062.847792]  [<ffffffff81356d74>] ? kernel_thread_helper+0x4/0x10
> warning kernel: [1022062.847862]  [<ffffffff8105f663>] ? kthread_worker_fn+0x139/0x139
> warning kernel: [1022062.847931]  [<ffffffff81356d70>] ? gs_change+0x13/0x13
> emerg kernel: [1022062.847997] Code: 53 48 89 fb 56 48 8b 6f 30 48 85 ed 74 45 48 81 c5 d4 01 00 00 48 89 ef e8 98 d5 f0 e0 48 8b 4b 40 48 8b 53 48 48 89 c6 48 89 ef <48> 89 51 08 48 89 0a 48 ba 00 01 10 00 00 00 ad de 48 b9 00 02
> alert kernel: [1022062.850596] RIP  [<ffffffffa0442696>] core_tmr_release_req+0x2c/0x66 [target_core_mod]
> warning kernel: [1022062.856021]  RSP <ffff880314badd80>
> warning kernel: [1022062.856083] ---[ end trace 953259b8f570f942 ]---
> 
> --
> To unsubscribe from this list: send the line "unsubscribe target-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

