Re: Update on crash with kernel 3.19

I think I can reproduce the same problem with the following steps:

1. Run IO from the initiator.
2. Force the backing store device to halt IO.
3. Let the initiator's command timeout fire and its error handler run.
4. Wait until the initiator tries to log in again, and let "iSCSI Login
timeout" fire.
5. Let the backing store device execute IO again.

I am not certain whether #5 is required. I think it may have oopsed
after #4, but I cannot remember. I have not had time to debug it yet.
Maybe someone else has seen it.
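
One way to check the ordering against the log quoted below is to compare
the kernel timestamps of the "iSCSI Login timeout" message and the soft
lockup report. This is only a rough sketch: the function names are made
up for illustration, it assumes each log message has been joined back
onto a single line (the copy below is wrapped by the mail client), and
it takes the watchdog's "stuck for Ns" figure at face value, which is
only an approximation.

```python
import re

# Kernel timestamp such as "[74513.167913]" followed by the message text.
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s*(.*)")
# Soft lockup report, e.g. "soft lockup - CPU#9 stuck for 22s!".
STUCK_RE = re.compile(r"soft lockup - CPU#(\d+) stuck for (\d+)s")


def parse(lines):
    """Yield (timestamp_seconds, message) for lines carrying a kernel timestamp."""
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            yield float(m.group(1)), m.group(2)


def lockup_vs_login_timeout(lines):
    """Return (login_timeout_ts, lockup_report_ts, estimated_stuck_since).

    Only the first occurrence of each event is used. Any value is None
    if the corresponding message is not present.
    """
    login_ts = lockup_ts = stuck_since = None
    for ts, msg in parse(lines):
        if login_ts is None and "iSCSI Login timeout" in msg:
            login_ts = ts
        m = STUCK_RE.search(msg)
        if m and lockup_ts is None:
            lockup_ts = ts
            # Assumption: the CPU has been stuck for the reported number of seconds.
            stuck_since = ts - int(m.group(2))
    return login_ts, lockup_ts, stuck_since


# Two messages from the quoted log, joined onto one line each.
sample = [
    "Apr 19 03:30:17 roc-4r-scd212 kernel: [74500.983349] iSCSI Login timeout "
    "on Network Portal 10.70.2.211:3260",
    "Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.167913] NMI watchdog: BUG: "
    "soft lockup - CPU#9 stuck for 22s! [iscsi_trx:13963]",
]

login_ts, lockup_ts, stuck_since = lockup_vs_login_timeout(sample)
print(f"login timeout logged at {login_ts:.3f}")
print(f"soft lockup reported at {lockup_ts:.3f}")
print(f"CPU estimated stuck since about {stuck_since:.3f}")
```

For the two lines above, the lockup is reported about 12 seconds after
the login timeout, while the "22s" figure would put the start of the
stall about 10 seconds before it. I would not read too much into that
estimate without a proper trace.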


On 04/22/2015 04:12 PM, Robert Wood wrote:
> I stress tested the system (16 parallel streams, each
> to its own LUN, from two different VM hosts) and I cannot break it with
> default_cmdsn_depth values of 16, 32, and 256.  I believe the log below
> is from when the crash began last Sunday, after some normal LUN probing
> by VMware.  It is possible, and likely, that the timeout came from the
> Ceph back end, but the target still crashed.
> 
> Thank you,
> Alex
>>
>> Apr 19 03:30:00 roc-4r-scd212 kernel: [74483.295274] ABORT_TASK: Sending
>> TMR_FUNCTION_COMPLETE for ref_tag: 257197
>> Apr 19 03:30:00 roc-4r-scd212 kernel: [74483.295279] Unable to locate
>> ITT: 0x0003ecad on CID: 0
>> Apr 19 03:30:00 roc-4r-scd212 kernel: [74483.295279] ABORT_TASK: Sending
>> TMR_TASK_DOES_NOT_EXIST for ref_tag: 257197
>> Apr 19 03:30:00 roc-4r-scd212 kernel: [74483.295308] Unable to locate
>> RefTaskTag: 0x0003ecad on CID: 0.
>> Apr 19 03:30:02 roc-4r-scd212 kernel: [74486.177458] ABORT_TASK: Found
>> referenced iSCSI task_tag: 9072
>> Apr 19 03:30:02 roc-4r-scd212 kernel: [74486.177460] Unexpected ret: -32
>> send data 48
>> Apr 19 03:30:02 roc-4r-scd212 kernel: [74486.177488] ABORT_TASK: Sending
>> TMR_FUNCTION_COMPLETE for ref_tag: 9072
>> Apr 19 03:30:10 roc-4r-scd212 kernel: [74493.812492] Unexpected ret: -32
>> send data 48
>> Apr 19 03:30:12 roc-4r-scd212 kernel: [74495.813309] TARGET_CORE[iSCSI]:
>> Detected NON_EXISTENT_LUN Access for 0x00000015
>> Apr 19 03:30:12 roc-4r-scd212 kernel: [74495.987564] libceph: osd25
>> 10.80.3.30:6809 socket closed (con state OPEN)
>> Apr 19 03:30:15 roc-4r-scd212 kernel: [74498.326272] Unexpected ret: -32
>> send data 48
>> Apr 19 03:30:17 roc-4r-scd212 kernel: [74500.983349] iSCSI Login timeout
>> on Network Portal 10.70.2.211:3260
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.167913] NMI watchdog: BUG:
>> soft lockup - CPU#9 stuck for 22s! [iscsi_trx:13963]
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.167959] Modules linked in:
>> target_core_user uio rbd libceph libcrc32c iscsi_target_mod
>> target_core_file target_core_pscsi target_core_iblock target_core_mod
>> configfs xt_multiport iptable_filter ip_tables x_tables
>> enhanceio_rand(OE) enhanceio_lru(OE) enhanceio_fifo(OE) enhanceio(OE)
>> ipmi_devintf ipmi_ssif x86_pkg_temp_thermal intel_powerclamp coretemp
>> kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
>> aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd sb_edac joydev
>> edac_core mei_me mei lpc_ich ioatdma ipmi_si ipmi_msghandler 8250_fintek
>> wmi mac_hid 8021q garp mrp stp llc bonding lp parport nfsd auth_rpcgss
>> nfs_acl nfs lockd grace sunrpc fscache mlx4_en vxlan ip6_udp_tunnel
>> udp_tunnel hid_generic igb usbhid mpt2sas i2c_algo_bit ahci hid dca
>> mlx4_core raid_class ptp libahci scsi_transport_sas pps_core
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168026] CPU: 9 PID: 13963
>> Comm: iscsi_trx Tainted: G         C OE  3.19.4-031904-generic #201504131440
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168028] Hardware name:
>> Supermicro X9DRD-7LN4F(-JBOD)/X9DRD-EF/X9DRD-7LN4F, BIOS 3.0a 12/05/2013
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168030] task:
>> ffff8808591a1d70 ti: ffff880859620000 task.ti: ffff880859620000
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168032] RIP:
>> 0010:[<ffffffff8107ae70>]  [<ffffffff8107ae70>]
>> __local_bh_enable_ip+0x0/0x90
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168041] RSP:
>> 0018:ffff880859623d70  EFLAGS: 00000216
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168043] RAX:
>> ffff88104ae4de80 RBX: ffff8810582a5000 RCX: ffff88104ae4de80
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168044] RDX:
>> ffff88104ae4de80 RSI: 0000000000000200 RDI: ffffffffc07cb25b
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168046] RBP:
>> ffff880859623d78 R08: ffff88104ae4df50 R09: 0000000000000101
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168048] R10:
>> 0000000000000001 R11: dead000000200200 R12: ffffffffc07c0f68
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168049] R13:
>> ffff880859623d08 R14: ffffffff817d2e30 R15: ffff880859623cd8
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168052] FS:
>>  0000000000000000(0000) GS:ffff88107fc60000(0000) knlGS:0000000000000000
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168054] CS:  0010 DS: 0000
>> ES: 0000 CR0: 0000000080050033
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168055] CR2:
>> 00007f8e74ed7e60 CR3: 0000000001c15000 CR4: 00000000001407e0
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168057] Stack:
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168059]  ffffffff817d2e30
>> ffff880859623dd8 ffffffffc07cb25b 0000000000000000
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168062]  ffff880842942000
>> ffff8810582a53e0 ffff8810582a5440 ffff8808591a1d70
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168065]  ffff8810582a5000
>> ffff8810582a53f4 ffff880859623e4c ffff8808591a1d70
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168069] Call Trace:
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168076]
>>  [<ffffffff817d2e30>] ? _raw_spin_unlock_bh+0x20/0x50
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168094]
>>  [<ffffffffc07cb25b>] iscsit_close_connection+0x3ab/0x660 [iscsi_target_mod]
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168102]
>>  [<ffffffffc07b69b3>] iscsit_take_action_for_connection_exit+0x83/0x110
>> [iscsi_target_mod]
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168111]
>>  [<ffffffffc07ca33e>] iscsi_target_rx_thread+0x22e/0x320 [iscsi_target_mod]
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168119]
>>  [<ffffffffc07ca110>] ? iscsi_target_tx_thread+0x220/0x220
>> [iscsi_target_mod]
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168124]
>>  [<ffffffff81095e29>] kthread+0xc9/0xe0
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168128]
>>  [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168132]
>>  [<ffffffff817d3718>] ret_from_fork+0x58/0x90
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168135]
>>  [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
>> Apr 19 03:30:29 roc-4r-scd212 kernel: [74513.168136] Code: 1f 44 00 00
>> 65 8b 05 70 64 f9 7e 85 c0 75 14 48 89 df 57 9d 0f 1f 44 00 00 48 83 c4
>> 08 5b 5d c3 0f 1f 00 e8 b3 a6 75 00 eb e5 90 <0f> 1f 44 00 00 55 48 89
>> e5 53 89 f3 48 83 ec 08 65 8b 05 19 0a
>> Apr 19 03:30:35 roc-4r-scd212 kernel: [74519.038238] ABORT_TASK: Found
>> referenced iSCSI task_tag: 9823
>> Apr 19 03:30:35 roc-4r-scd212 kernel: [74519.038241] ABORT_TASK:
>> ref_tag: 9823 already complete, skipping
>> Apr 19 03:30:35 roc-4r-scd212 kernel: [74519.038243] ABORT_TASK: Sending
>> TMR_TASK_DOES_NOT_EXIST for ref_tag: 9823
>> Apr 19 03:30:35 roc-4r-scd212 kernel: [74519.038249] ABORT_TASK: Found
>> referenced iSCSI task_tag: 9821
>> Apr 19 03:30:35 roc-4r-scd212 kernel: [74519.038256] ABORT_TASK: Sending
>> TMR_FUNCTION_COMPLETE for ref_tag: 9821
>> Apr 19 03:30:35 roc-4r-scd212 kernel: [74519.038258] ABORT_TASK: Found
>> referenced iSCSI task_tag: 9822
>> Apr 19 03:30:35 roc-4r-scd212 kernel: [74519.038259] ABORT_TASK:
>> ref_tag: 9822 already complete, skipping
>> Apr 19 03:30:35 roc-4r-scd212 kernel: [74519.038261] ABORT_TASK: Sending
>> TMR_TASK_DOES_NOT_EXIST for ref_tag: 9822
>> Apr 19 03:30:52 roc-4r-scd212 kernel: [74535.812704] ABORT_TASK: Found
>> referenced iSCSI task_tag: 195485
>> Apr 19 03:30:52 roc-4r-scd212 kernel: [74535.812708] ABORT_TASK:
>> ref_tag: 195485 already complete, skipping
>> Apr 19 03:30:52 roc-4r-scd212 kernel: [74535.812709] ABORT_TASK: Sending
>> TMR_TASK_DOES_NOT_EXIST for ref_tag: 195485
>> Apr 19 03:30:52 roc-4r-scd212 kernel: [74535.812711] ABORT_TASK: Found
>> referenced iSCSI task_tag: 195483
>> Apr 19 03:30:52 roc-4r-scd212 kernel: [74535.812713] ABORT_TASK:
>> ref_tag: 195483 already complete, skipping
>> Apr 19 03:30:52 roc-4r-scd212 kernel: [74535.812714] ABORT_TASK: Sending
>> TMR_TASK_DOES_NOT_EXIST for ref_tag: 195483
>> Apr 19 03:30:52 roc-4r-scd212 kernel: [74535.812717] ABORT_TASK: Found
>> referenced iSCSI task_tag: 195484
>> Apr 19 03:30:52 roc-4r-scd212 kernel: [74535.812720] ABORT_TASK:
>> ref_tag: 195484 already complete, skipping
>> Apr 19 03:30:52 roc-4r-scd212 kernel: [74535.812721] ABORT_TASK: Sending
>> TMR_TASK_DOES_NOT_EXIST for ref_tag: 195484
>> Apr 19 03:30:52 roc-4r-scd212 kernel: [74535.812729] ABORT_TASK: Found
>> referenced iSCSI task_tag: 195483
>> Apr 19 03:30:52 roc-4r-scd212 kernel: [74535.812731] ABORT_TASK:
>> ref_tag: 195483 already complete, skipping
>> Apr 19 03:30:52 roc-4r-scd212 kernel: [74535.812732] ABORT_TASK: Sending
>> TMR_TASK_DOES_NOT_EXIST for ref_tag: 195483
>> Apr 19 03:30:52 roc-4r-scd212 kernel: [74535.812733] ABORT_TASK: Found
>> referenced iSCSI task_tag: 195482
>> Apr 19 03:30:52 roc-4r-scd212 kernel: [74535.812736] ABORT_TASK: Found
>> referenced iSCSI task_tag: 195485
>> Apr 19 03:30:52 roc-4r-scd212 kernel: [74535.812737] ABORT_TASK:
>> ref_tag: 195485 already complete, skipping
>>
> --
> To unsubscribe from this list: send the line "unsubscribe target-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 