Re: Update on crash with kernel 3.19

We now think this may be related to a faulty network switch. Here is
the latest dump from LIO. Does it give any insight into why the target
would crash rather than keep retrying?

May  7 23:26:16 roc-4r-scd212 kernel: [129694.169005] NMI watchdog:
BUG: soft lockup - CPU#4 stuck for 23s! [iscsi_ttx:5831]
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169053] Modules linked
in: rbd libceph libcrc32c iscsi_target_mod target_core_file
target_core_pscsi target_core_iblock target_core_mod configfs
xt_multiport iptable_filter ip_tables x_tables enhanceio_rand(OE)
enhanceio_lru(OE) enhanceio_fifo(OE) enhanceio(OE) ipmi_devintf
ipmi_ssif x86_pkg_temp_thermal intel_powerclamp coretemp kvm
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd sb_edac joydev
edac_core mei_me nfsd mei lpc_ich auth_rpcgss ioatdma nfs_acl ipmi_si
nfs ipmi_msghandler 8250_fintek lockd 8021q grace garp mrp stp sunrpc
llc wmi bonding fscache mac_hid lp parport mlx4_en vxlan
ip6_udp_tunnel udp_tunnel hid_generic igb ahci mpt2sas i2c_algo_bit
usbhid libahci dca hid ptp raid_class mlx4_core scsi_transport_sas
pps_core
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169116] CPU: 4 PID: 5831
Comm: iscsi_ttx Tainted: G         C OEL  4.1.0-040100rc2-generic
#201505032335
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169118] Hardware name:
Supermicro X9DRD-7LN4F(-JBOD)/X9DRD-EF/X9DRD-7LN4F, BIOS 3.0a
12/05/2013
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169120] task:
ffff8810590b2840 ti: ffff88085a988000 task.ti: ffff88085a988000
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169122] RIP:
0010:[<ffffffff8180374b>]  [<ffffffff8180374b>]
_raw_spin_unlock_irqrestore+0x1b/0x50
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169131] RSP:
0018:ffff88085a98bd00  EFLAGS: 00000286
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169133] RAX:
0000000000000091 RBX: ffff8808491af440 RCX: ffff8808491af440
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169135] RDX:
0000000000008614 RSI: 0000000000000286 RDI: 0000000000000286
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169136] RBP:
ffff88085a98bd08 R08: ffff8808491af510 R09: 0000000000000101
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169138] R10:
0000000000000004 R11: dead000000200200 R12: ffff8808491af510
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169140] R13:
0000000000000101 R14: 0000000000000004 R15: dead000000200200
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169143] FS:
0000000000000000(0000) GS:ffff88085fb00000(0000)
knlGS:0000000000000000
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169145] CS:  0010 DS:
0000 ES: 0000 CR0: 0000000080050033
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169147] CR2:
00007f528c2ebe60 CR3: 0000000001e0f000 CR4: 00000000001407e0
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169148] Stack:
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169150]
ffff8808491af450 ffff88085a98bd48 ffffffffc0565e0b ffff88085a98bd48
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169153]
ffffffffc05bf418 ffff8808491af240 ffff8808491af450 0000000000000001
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169156]
0000000000000001 ffff88085a98bd78 ffffffffc0567fd5 ffff8808491af440
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169160] Call Trace:
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169183]
[<ffffffffc0565e0b>] transport_wait_for_tasks+0xbb/0x150
[target_core_mod]
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169199]
[<ffffffffc05bf418>] ?
iscsit_remove_cmd_from_response_queue+0xe8/0x120 [iscsi_target_mod]
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169213]
[<ffffffffc0567fd5>] transport_generic_free_cmd+0xc5/0xe0
[target_core_mod]
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169223]
[<ffffffffc05c0716>] iscsit_free_cmd+0x96/0x160 [iscsi_target_mod]
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169233]
[<ffffffffc05c98dc>] iscsit_close_connection+0x47c/0x770
[iscsi_target_mod]
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169242]
[<ffffffffc05b4c83>] iscsit_take_action_for_connection_exit+0x83/0x110
[iscsi_target_mod]
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169251]
[<ffffffffc05c8690>] iscsi_target_tx_thread+0x120/0x1d0
[iscsi_target_mod]
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169257]
[<ffffffff810c0630>] ? prepare_to_wait_event+0x100/0x100
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169266]
[<ffffffffc05c8570>] ? iscsit_thread_get_cpumask+0xc0/0xc0
[iscsi_target_mod]
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169270]
[<ffffffff8109cdc9>] kthread+0xc9/0xe0
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169274]
[<ffffffff8109cd00>] ? flush_kthread_worker+0x90/0x90
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169277]
[<ffffffff81803fe2>] ret_from_fork+0x42/0x70
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169280]
[<ffffffff8109cd00>] ? flush_kthread_worker+0x90/0x90
May  7 23:26:16 roc-4r-scd212 kernel: [129694.169281] Code: 1f 80 00
00 00 00 5d c3 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53
48 89 f3 0f 1f 44 00 00 66 83 07 02 48 89 df 57 9d <0f> 1f 44 00 00 5b
5d c3 0f 1f 44 00 00 b8 02 00 00 00 f0 66 0f
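
Because the mail client wrapped the log, each kernel record above is
split over several lines. A minimal Python sketch for rejoining the
records and listing the call-trace frames follows. The function names
(rejoin, call_trace) and the regular expressions are my own and are
only assumed to fit the syslog prefix seen in this dump; other syslog
formats would need different patterns.

```python
import re

# Assumed prefix: "May  7 23:26:16 host kernel: [129694.169183] "
RECORD_START = re.compile(
    r"^[A-Z][a-z]{2}\s+\d+\s+[\d:]+\s+\S+\s+kernel:\s+\[\s*\d+\.\d+\]\s*"
)
# Frame: "[<addr>] [? ]symbol+0xoff/0xsize [module]"
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(\w+)\])?"
)


def rejoin(lines):
    """Merge wrapped continuation lines into one string per kernel record."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        m = RECORD_START.match(line)
        if m:
            # New record: keep only the text after the syslog prefix.
            records.append(line[m.end():])
        elif records and line.strip():
            # Continuation of the previous record.
            records[-1] = (records[-1] + " " + line.strip()).strip()
    return records


def call_trace(records):
    """Return (symbol, module, reliable) for frames after "Call Trace:".

    reliable is False for frames the kernel printed with a "?" marker.
    Collection stops at the "Code:" record.
    """
    frames = []
    in_trace = False
    for rec in records:
        if rec.startswith("Call Trace:"):
            in_trace = True
            continue
        if rec.startswith("Code:"):
            in_trace = False
        if in_trace:
            m = FRAME.search(rec)
            if m:
                frames.append((m.group(2), m.group(3), m.group(1) is None))
    return frames
```

Typical use would be call_trace(rejoin(open("kern.log"))), where
kern.log is a hypothetical file holding the dump as it appears here.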

On Wed, May 6, 2015 at 12:21 PM, Robert Wood <rwood@xxxxxxxxxxxxxxxxxxx> wrote:
> One update: it appears that VMware ESXi 5.5 U2 is sometimes
> incorrectly detecting the LIO-ORG device as an SSD.  I wonder if
> that is causing issues with the commands being sent?  I am removing
> the SSD tag from all LIO-ORG devices to see whether the problem recurs.
>
>
>
> On Wed, May 6, 2015 at 11:17 AM, Robert Wood <rwood@xxxxxxxxxxxxxxxxxxx> wrote:
>> Good morning, we are continuing to receive:
>>
>> May  6 11:08:26 roc-4r-scd212 kernel: [71898.566185] ABORT_TASK:
>> ref_tag: 3847728 already complete, skipping
>> May  6 11:08:26 roc-4r-scd212 kernel: [71898.566187] ABORT_TASK:
>> Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 3847728
>> May  6 11:08:26 roc-4r-scd212 kernel: [71898.566191] Unable to locate
>> ITT: 0x003ab632 on CID: 0
>> May  6 11:08:26 roc-4r-scd212 kernel: [71898.566191] Unable to locate
>> RefTaskTag: 0x003ab632 on CID: 0.
>> May  6 11:08:26 roc-4r-scd212 kernel: [71898.566254] Unexpected ret:
>> -32 send data 48
>> May  6 11:08:30 roc-4r-scd212 kernel: [71902.397053] libceph: osd20
>> 10.80.3.25:6812 socket closed (con state OPEN)
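
A note on the "Unexpected ret: -32" line in the quoted log: if that
value is a negated Linux errno, as kernel return codes usually are,
then 32 is EPIPE (broken pipe), which would fit a connection dropping
underneath the target. A small Python sketch to decode such values,
assuming it is run on a Linux host (Python's errno table reflects the
platform it runs on); describe_kernel_ret is a name I made up:

```python
import errno
import os


def describe_kernel_ret(ret):
    """Map a negative kernel-style return value to (errno name, message).

    Returns None for non-negative values, which are not error codes.
    """
    if ret >= 0:
        return None
    code = -ret
    # errno.errorcode maps numeric codes to names on the host platform.
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


if __name__ == "__main__":
    print(describe_kernel_ret(-32))
```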