Re: 3.12.5 Target Errors

On 5/15/2014 7:35 PM, Moussa Ba (moussaba) wrote:
We are deploying a test environment in house and have received complaints of failures while installing OSes on target LUNs.  Looking at the target machine, we observed the following errors in dmesg; they are repeated.  Is this a known issue? Has it been fixed?

Hey Moussa,

I can say that ib_isert has picked up some important stability fixes since kernel 3.12.5.

The list corruption below seems to originate in the TX coalescing work that was done by Nic.

Does your kernel have the commit below applied? (I don't know whether it went into the 3.12 stable kernels, though...)
commit ebbe442183b7b8192c963266f1c89048fefc63a5
Author: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx>
Date:   Sun Mar 2 14:51:12 2014 -0800

    iser-target: Fix command leak for tx_desc->comp_llnode_batch

    This patch addresses a number of active I/O shutdown issues
    related to isert_cmd descriptors being leaked that are part
    of a completion interrupt coalescing batch.

    This includes adding logic in isert_cq_tx_comp_err() to
    drain any associated tx_desc->comp_llnode_batch, as well
    as isert_cq_drain_comp_llist() to drain any associated
    isert_conn->conn_comp_llist.

    Also, set tx_desc->llnode_active in isert_init_send_wr()
    in order to determine when work requests need to be skipped
    in isert_cq_tx_work() exception path code.

    Finally, update isert_init_send_wr() to only allow interrupt
    coalescing when ISER_CONN_UP.

    Acked-by: Sagi Grimberg <sagig@xxxxxxxxxxxx>
    Cc: Or Gerlitz <ogerlitz@xxxxxxxxxxxx>
    Cc: <stable@xxxxxxxxxxxxxxx> #3.13+
    Signed-off-by: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx>
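
One way to answer that question for a given kernel source tree is sketched below. It assumes the kernel was built from a git checkout and that git is on the PATH; the helper names (ancestor_cmd, backport_cmd, commit_applied) are made up for illustration. It first asks whether the upstream commit is an ancestor of the checked-out ref, and otherwise searches the log for the upstream hash, on the assumption that a backport quotes that hash in its commit message.

```python
import subprocess

# Upstream commit hash quoted in this thread.
UPSTREAM_COMMIT = "ebbe442183b7b8192c963266f1c89048fefc63a5"


def ancestor_cmd(commit, ref="HEAD"):
    """Command that exits 0 when `commit` is an ancestor of `ref`."""
    return ["git", "merge-base", "--is-ancestor", commit, ref]


def backport_cmd(commit, ref="HEAD"):
    """Command that prints one log line if a commit message on `ref`
    mentions `commit` (assumption: backports quote the upstream hash)."""
    return ["git", "log", "--oneline", "-n", "1", "--grep=" + commit, ref]


def commit_applied(repo, commit=UPSTREAM_COMMIT, ref="HEAD"):
    """Return True if `commit`, or a backport that mentions it, is on `ref`.

    `repo` is the path to a kernel git checkout (hypothetical input).
    """
    direct = subprocess.run(
        ancestor_cmd(commit, ref),
        cwd=repo,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if direct.returncode == 0:
        return True
    # A non-zero exit also covers the case where the object is not in
    # this repository at all, so fall back to searching commit messages.
    log = subprocess.run(
        backport_cmd(commit, ref),
        cwd=repo,
        capture_output=True,
        text=True,
    )
    return bool(log.stdout.strip())
```

Calling commit_applied("/path/to/linux") on the tree the target kernel was built from would then return True or False; the path is a placeholder.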

A more detailed description of the scenario would also help...

Cheers,
Sagi.

Moussa

[190660.362733] Unknown RDMA CMA event: 15
[190668.946341] Unknown RDMA CMA event: 15
[190676.147304] iSCSI Login timeout on Network Portal 192.168.252.101:3269
[190676.147365] iSCSI Login negotiation failed.
[190676.160321] Unknown RDMA CMA event: 8
[190691.172622] iSCSI Login timeout on Network Portal 192.168.252.101:3269
[190691.172685] iSCSI Login negotiation failed.
[190711.622169] ------------[ cut here ]------------
[190711.622190] WARNING: CPU: 0 PID: 6780 at lib/list_debug.c:53 __list_del_entry+0x63/0xd0()
[190711.622192] list_del corruption, ffff880444de29d0->next is LIST_POISON1 (dead000000100100)
[190711.622192] Modules linked in: ib_srpt tcm_qla2xxx qla2xxx tcm_loop tcm_fc libfc scsi_transport_fc scsi_tgt ib_isert rdma_cm iw_cm ib_addr iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod ebtable_nat ebtables configfs ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables bridge stp llc autofs4 sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 ib_ipoib ib_cm ib_uverbs ib_umad mlx4_en mlx4_ib ib_sa ib_mad ib_core mlx4_core dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan vhost tun kvm_intel kvm uinput iTCO_wdt iTCO_vendor_support microcode serio_raw pcspkr sb_edac edac_core i2c_i801 sg lpc_ich mfd_core igb i2c_algo_bit i2c_core ptp pps_core mtip32xx(OF) ioatdma dca wmi ext3(F) jbd(F) mbcache(F) sd_mod(F) crc_t10dif(F) crct10dif_common(F) ahci(F) libahci(F) isci(F) libsas(F) scsi_transport_sas(F) [last unloaded: speedstep_lib]
[190711.622252] CPU: 0 PID: 6780 Comm: kworker/u33:5 Tainted: GF          O 3.12.5+ #1
[190711.622254] Hardware name: Supermicro X9DRX+-F/X9DRX+-F, BIOS 3.00 07/09/2013
[190711.622258] Workqueue: isert_comp_wq isert_cq_tx_work [ib_isert]
[190711.622260]  0000000000000035 ffff88083afb5c28 ffffffff81552ef7 0000000000000035
[190711.622263]  ffff88083afb5c78 ffff88083afb5c68 ffffffff8104d1ac ffff88083afb5c78
[190711.622265]  ffff880444de29d0 ffff880444de2c90 ffff88045f608000 ffff8804581773e0
[190711.622267] Call Trace:
[190711.622272]  [<ffffffff81552ef7>] dump_stack+0x49/0x62
[190711.622275]  [<ffffffff8104d1ac>] warn_slowpath_common+0x8c/0xc0
[190711.622277]  [<ffffffff8104d296>] warn_slowpath_fmt+0x46/0x50
[190711.622279]  [<ffffffff8127fe13>] __list_del_entry+0x63/0xd0
[190711.622281]  [<ffffffff8127fe91>] list_del+0x11/0x40
[190711.622284]  [<ffffffffa076bcf4>] isert_put_cmd+0xb4/0x1e0 [ib_isert]
[190711.622287]  [<ffffffffa076eede>] isert_completion_put+0x6e/0xe0 [ib_isert]
[190711.622290]  [<ffffffffa076f11c>] isert_cq_comp_err+0x2c/0xd0 [ib_isert]
[190711.622292]  [<ffffffffa076f5c9>] isert_cq_tx_work+0x89/0x110 [ib_isert]
[190711.622295]  [<ffffffff81068553>] process_one_work+0x183/0x490
[190711.622297]  [<ffffffff81069a2f>] worker_thread+0x11f/0x3a0
[190711.622299]  [<ffffffff81069910>] ? manage_workers+0x160/0x160
[190711.622302]  [<ffffffff8106f94e>] kthread+0xce/0xe0
[190711.622304]  [<ffffffff8106f880>] ? kthread_freezable_should_stop+0x70/0x70
[190711.622312]  [<ffffffff8155f66c>] ret_from_fork+0x7c/0xb0
[190711.622314]  [<ffffffff8106f880>] ? kthread_freezable_should_stop+0x70/0x70
[190711.622315] ---[ end trace b0c0a2e3b5c1820b ]---
[190711.622318] ------------[ cut here ]------------
[190711.622320] WARNING: CPU: 0 PID: 6780 at lib/list_debug.c:53 __list_del_entry+0x63/0xd0()
[190711.622321] list_del corruption, ffff880444ddca70->next is LIST_POISON1 (dead000000100100)
[190711.622321] Modules linked in: ib_srpt tcm_qla2xxx qla2xxx tcm_loop tcm_fc libfc scsi_transport_fc scsi_tgt ib_isert rdma_cm iw_cm ib_addr iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod ebtable_nat ebtables configfs ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables bridge stp llc autofs4 sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 ib_ipoib ib_cm ib_uverbs ib_umad mlx4_en mlx4_ib ib_sa ib_mad ib_core mlx4_core dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan vhost tun kvm_intel kvm uinput iTCO_wdt iTCO_vendor_support microcode serio_raw pcspkr sb_edac edac_core i2c_i801 sg lpc_ich mfd_core igb i2c_algo_bit i2c_core ptp pps_core mtip32xx(OF) ioatdma dca wmi ext3(F) jbd(F) mbcache(F) sd_mod(F) crc_t10dif(F) crct10dif_common(F) ahci(F) libahci(F) isci(F) libsas(F) scsi_transport_sas(F) [last unloaded: speedstep_lib]
[190711.622353] CPU: 0 PID: 6780 Comm: kworker/u33:5 Tainted: GF       W  O 3.12.5+ #1
[190711.622354] Hardware name: Supermicro X9DRX+-F/X9DRX+-F, BIOS 3.00 07/09/2013
[190711.622356] Workqueue: isert_comp_wq isert_cq_tx_work [ib_isert]
[190711.622357]  0000000000000035 ffff88083afb5c28 ffffffff81552ef7 0000000000000035
[190711.622359]  ffff88083afb5c78 ffff88083afb5c68 ffffffff8104d1ac ffff88083afb5c78
[190711.622361]  ffff880444ddca70 ffff880444ddcd30 ffff88045f608000 ffff8804581773e0
[190711.622363] Call Trace:
[190711.622365]  [<ffffffff81552ef7>] dump_stack+0x49/0x62
[190711.622367]  [<ffffffff8104d1ac>] warn_slowpath_common+0x8c/0xc0
[190711.622368]  [<ffffffff8104d296>] warn_slowpath_fmt+0x46/0x50
[190711.622371]  [<ffffffff8127fe13>] __list_del_entry+0x63/0xd0
[190711.622373]  [<ffffffff8127fe91>] list_del+0x11/0x40
[190711.622375]  [<ffffffffa076bcf4>] isert_put_cmd+0xb4/0x1e0 [ib_isert]
[190711.622377]  [<ffffffffa076eede>] isert_completion_put+0x6e/0xe0 [ib_isert]
[190711.622380]  [<ffffffffa076f11c>] isert_cq_comp_err+0x2c/0xd0 [ib_isert]
[190711.622382]  [<ffffffffa076f5c9>] isert_cq_tx_work+0x89/0x110 [ib_isert]
[190711.622384]  [<ffffffff81068553>] process_one_work+0x183/0x490
[190711.622386]  [<ffffffff81069a2f>] worker_thread+0x11f/0x3a0
[190711.622387]  [<ffffffff81069910>] ? manage_workers+0x160/0x160
[190711.622389]  [<ffffffff8106f94e>] kthread+0xce/0xe0
[190711.622391]  [<ffffffff8106f880>] ? kthread_freezable_should_stop+0x70/0x70
[190711.622393]  [<ffffffff8155f66c>] ret_from_fork+0x7c/0xb0
[190711.622395]  [<ffffffff8106f880>] ? kthread_freezable_should_stop+0x70/0x70
[190711.622396] ---[ end trace b0c0a2e3b5c1820c ]---
[190711.622398] ------------[ cut here ]------------
[190711.622400] WARNING: CPU: 0 PID: 6780 at lib/list_debug.c:53 __list_del_entry+0x63/0xd0()
[190711.622401] list_del corruption, ffff880444dc5a90->next is LIST_POISON1 (dead000000100100)
[190711.622402] Modules linked in: ib_srpt tcm_qla2xxx qla2xxx tcm_loop tcm_fc libfc scsi_transport_fc scsi_tgt ib_isert rdma_cm iw_cm ib_addr iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod ebtable_nat ebtables configfs ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables bridge stp llc autofs4 sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 ib_ipoib ib_cm ib_uverbs ib_umad mlx4_en mlx4_ib ib_sa ib_mad ib_core mlx4_core dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan vhost tun kvm_intel kvm uinput iTCO_wdt iTCO_vendor_support microcode serio_raw pcspkr sb_edac edac_core i2c_i801 sg lpc_ich mfd_core igb i2c_algo_bit i2c_core ptp pps_core mtip32xx(OF) ioatdma dca wmi ext3(F) jbd(F) mbcache(F) sd_mod(F) crc_t10dif(F) crct10dif_common(F) ahci(F) libahci(F) isci(F) libsas(F) scsi_transport_sas(F) [last unloaded: speedstep_lib]
[190711.622433] CPU: 0 PID: 6780 Comm: kworker/u33:5 Tainted: GF       W  O 3.12.5+ #1
[190711.622433] Hardware name: Supermicro X9DRX+-F/X9DRX+-F, BIOS 3.00 07/09/2013
[190711.622435] Workqueue: isert_comp_wq isert_cq_tx_work [ib_isert]
[190711.622436]  0000000000000035 ffff88083afb5c28 ffffffff81552ef7 0000000000000035
[190711.622438]  ffff88083afb5c78 ffff88083afb5c68 ffffffff8104d1ac ffff88083afb5c78
[190711.622440]  ffff880444dc5a90 ffff880444dc5d50 ffff88045f608000 ffff8804581773e0
[190711.622442] Call Trace:
[190711.622444]  [<ffffffff81552ef7>] dump_stack+0x49/0x62
[190711.622446]  [<ffffffff8104d1ac>] warn_slowpath_common+0x8c/0xc0
[190711.622448]  [<ffffffff8104d296>] warn_slowpath_fmt+0x46/0x50
[190711.622450]  [<ffffffff8127fe13>] __list_del_entry+0x63/0xd0
[190711.622452]  [<ffffffff8127fe91>] list_del+0x11/0x40
[190711.622454]  [<ffffffffa076bcf4>] isert_put_cmd+0xb4/0x1e0 [ib_isert]
[190711.622456]  [<ffffffffa076eede>] isert_completion_put+0x6e/0xe0 [ib_isert]
[190711.622459]  [<ffffffffa076f11c>] isert_cq_comp_err+0x2c/0xd0 [ib_isert]
[190711.622461]  [<ffffffffa076f5c9>] isert_cq_tx_work+0x89/0x110 [ib_isert]
[190711.622463]  [<ffffffff81068553>] process_one_work+0x183/0x490
[190711.622465]  [<ffffffff81069a2f>] worker_thread+0x11f/0x3a0
[190711.622466]  [<ffffffff81069910>] ? manage_workers+0x160/0x160
[190711.622468]  [<ffffffff8106f94e>] kthread+0xce/0xe0
[190711.622470]  [<ffffffff8106f880>] ? kthread_freezable_should_stop+0x70/0x70
[190711.622472]  [<ffffffff8155f66c>] ret_from_fork+0x7c/0xb0
[190711.622474]  [<ffffffff8106f880>] ? kthread_freezable_should_stop+0x70/0x70
[190711.622475] ---[ end trace b0c0a2e3b5c1820d ]---
[190711.622477] ------------[ cut here ]------------
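
Since the same warning repeats many times in output like the above, a short script can condense it into something easier to attach to a report. The following is only a sketch: the function name summarize and the embedded SAMPLE text are illustrative, and the regular expressions assume the exact message layout quoted in this thread (other kernel versions may print it differently). It collects the list entry addresses reported as corrupted and counts the call-trace frames that belong to a given module.

```python
import re
from collections import Counter

# Warning text as it appears in the quoted dmesg output.
CORRUPTION_RE = re.compile(
    r"list_del corruption, ([0-9a-f]+)->next is LIST_POISON1"
)
# Call-trace frames that carry a module tag, for example:
# "[<ffffffffa076bcf4>] isert_put_cmd+0xb4/0x1e0 [ib_isert]"
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] (?:\? )?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]"
)

# A few lines copied from the log in this thread, for demonstration.
SAMPLE = """\
[190711.622192] list_del corruption, ffff880444de29d0->next is LIST_POISON1 (dead000000100100)
[190711.622284]  [<ffffffffa076bcf4>] isert_put_cmd+0xb4/0x1e0 [ib_isert]
[190711.622295]  [<ffffffff81068553>] process_one_work+0x183/0x490
[190711.622321] list_del corruption, ffff880444ddca70->next is LIST_POISON1 (dead000000100100)
"""


def summarize(dmesg_text, module="ib_isert"):
    """Return (addresses, frame_counts) for list_del corruption warnings.

    addresses    -- list entry addresses reported as corrupted, in order
    frame_counts -- Counter of call-trace function names tagged with `module`
    """
    addresses = CORRUPTION_RE.findall(dmesg_text)
    frame_counts = Counter(
        func for func, mod in FRAME_RE.findall(dmesg_text) if mod == module
    )
    return addresses, frame_counts


addresses, frame_counts = summarize(SAMPLE)
print(len(addresses), "corrupted entries:", ", ".join(addresses))
for func, count in frame_counts.most_common():
    print(count, func)
```

On a real system the input would be the saved dmesg text rather than SAMPLE, for instance the contents of a file read with open(...).read().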
--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
