Re: Connectivity problems with ISCSI target and ESXi server(s)

On Wed, 2014-06-11 at 18:34 -0700, Nicholas A. Bellinger wrote:
> Hi Harry,
> 
> Re-adding the linux-scsi CC, minus your log link.
> 
> On Thu, 2014-06-12 at 01:29 +0300, Charalampos Pournaris wrote:
> > Hi Nicholas,
> > 
> > I will apply the patches as soon as possible; thank you for your prompt 
> > response. It's a production environment, so unfortunately I cannot 
> > recompile the kernel on demand.
> > In the meantime, I've hit another kernel panic in one of the iSCSI VMs 
> > and have collected the logs from that machine. Would you mind taking a 
> > look if you find some time? It'd be interesting to find out whether it's 
> > a problem with a driver or whether it derives from the iSCSI code. 
> > Again, the kernel didn't have the patches provided in your last reply at 
> > the time of the failure.
> > 
> > Here's the link:
> 
> <SNIP>
> 
> > sha1sum: 557f45b17316b4d4651dd010ea85b00a6c034e64
> > 
> 
> So the same thing is happening as before:
> 
> Jun 11 16:06:35 sof-24381-iscsi-vm kernel: [795215.637642] ABORT_TASK: Found referenced iSCSI task_tag: 19042955
> Jun 11 16:06:35 sof-24381-iscsi-vm kernel: [795215.637647] wait_for_tasks: Stopping ffff8801ed4e8350 ITT: 0x0122928b i_state: 6, t_state: 5, CMD_T_STOP
> 
> An ABORT_TASK is received for an I/O that's in TRANSPORT_PROCESSING state.
> 
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.212926] INFO: task kworker/u4:0:3539 blocked for more than 120 seconds.
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.213862]       Not tainted 3.14.4-ddebug-on #1
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.214580] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.215882] kworker/u4:0    D ffff880236d46468     0  3539      2 0x00000000
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.215931] Workqueue: tmr-iblock target_tmr_work [target_core_mod]
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.215934]  ffff880236d46050 0000000000000046 0000000000014380 0000000000014380
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.215936]  ffff8800bafa1fd8 ffff880236d46050 ffff8801ed4e8480 ffff8800bafa1d40
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.215938]  ffff8801ed4e8478 ffff880236d46050 0000000000000005 000000000122928b
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.215940] Call Trace:
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.215968]  [<ffffffff814c60c9>] ? schedule_timeout+0x209/0x2a0
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.215980]  [<ffffffff812a34c9>] ? __dynamic_pr_debug+0x79/0x80
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.215984]  [<ffffffff814c7650>] ? wait_for_completion+0xa0/0x110
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.215999]  [<ffffffff8108f680>] ? wake_up_state+0x10/0x10
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.216004]  [<ffffffffa03d2950>] ? transport_wait_for_tasks+0x60/0x130 [target_core_mod]
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.216009]  [<ffffffffa03cfe4e>] ? core_tmr_abort_task+0x14e/0x220 [target_core_mod]
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.216013]  [<ffffffffa03d2af7>] ? target_tmr_work+0xa7/0xc0 [target_core_mod]
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.216024]  [<ffffffff81079c2d>] ? process_one_work+0x16d/0x420
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.216027]  [<ffffffff8107a7e6>] ? worker_thread+0x116/0x3b0
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.216029]  [<ffffffff8107a6d0>] ? rescuer_thread+0x330/0x330
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.216032]  [<ffffffff81080a71>] ? kthread+0xc1/0xe0
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.216034]  [<ffffffff810809b0>] ? kthread_create_on_node+0x180/0x180
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.216039]  [<ffffffff814d1d8c>] ? ret_from_fork+0x7c/0xb0
> Jun 11 16:09:21 sof-24381-iscsi-vm kernel: [795382.216041]  [<ffffffff810809b0>] ? kthread_create_on_node+0x180/0x180
> 
> And the ABORT_TASK waits for this I/O to come back, but according to the
> log it never completes from the backend + underlying LLD, which
> eventually causes the hung task logic to kick in.  The last time-stamp in
> the log is 16:18:17, which means that between the ABORT_TASK at 16:06:35
> and 16:18:17, ~12 minutes elapsed with no response from the underlying
> backend storage for the I/O in question.
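
As an aside, the blocked path in the trace above is just the standard kernel
completion wait.  A stripped-down, hypothetical sketch of that pattern
(simplified, not the actual target_core_mod code; the demo_* names are made
up for illustration):

#include <linux/completion.h>

struct demo_cmd {
	/*
	 * Completed by the backend when the outstanding I/O finally returns.
	 * init_completion(&cmd->done) is assumed to have run at command setup.
	 */
	struct completion done;
};

/* Runs from a workqueue, like target_tmr_work() in the trace above. */
static void demo_abort_task(struct demo_cmd *cmd)
{
	/*
	 * Uninterruptible (D state) sleep.  If the backend / LLD never
	 * calls demo_backend_complete(), this never returns, and after
	 * 120 seconds the hung task detector prints the warning above.
	 */
	wait_for_completion(&cmd->done);
}

/* Completion path the backend is expected to call for the aborted I/O. */
static void demo_backend_complete(struct demo_cmd *cmd)
{
	complete(&cmd->done);
}

The only way out of that wait_for_completion() is for the backend to return
the I/O, which is exactly what never happens in your log.
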
> 
> The three patches mentioned previously will address the case where an
> ABORT_TASK hangs indefinitely when a session reset also occurs, but
> AFAICT no session reset is occurring in your logs, so you're not hitting
> this specific bug.
> 
> So based on the evidence, it still appears that your backend is not
> (ever..?) returning outstanding I/Os to IBLOCK, which means there is
> some type of bug in the backend storage outside of the target layer.
> 
> At some point a real OOPs does occur, and not just the hung_task
> warnings above.  This OOPs is not present in the kern.log output, most
> likely because it involved the underlying storage driver.
> 
> That said, I'd strongly recommend attaching a virtual serial console to
> the VMware guest and figuring out where the real OOPs is occurring by
> redirecting console output with console=ttyS0.
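
For the serial console bit, something along these lines usually does it
(the exact paths and .vmx option names can differ by distro and VMware
version, so treat this as a sketch and the datastore path as a placeholder):

  # Inside the guest: keep the VGA console and add a serial console,
  # e.g. in /etc/default/grub on a Debian/Ubuntu guest:
  GRUB_CMDLINE_LINUX="console=tty0 console=ttyS0,115200"
  # regenerate grub.cfg and reboot:
  update-grub

  # In the guest's .vmx (with the VM powered off), back the first
  # virtual serial port with a file on the datastore:
  serial0.present  = "TRUE"
  serial0.fileType = "file"
  serial0.fileName = "/vmfs/volumes/<datastore>/<vm>/serial.log"

With that in place the OOPs (including anything from the storage driver)
ends up in serial.log even if the guest dies before it can write kern.log.
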
> 
> My guess at this point is that it's something related to the VMware
> PVSCSI driver.
> 

And yes, I'd still recommend applying the three patches regardless.

;)

--nab




