Hey guys,

We're seeing exactly the same behaviour with ESXi + LIO + DRBD, using Pacemaker/Corosync to control the cluster. Under periods of heavy load (typically during backups), we occasionally see warnings in the logs exactly as you've mentioned:

> [ 3052.065353] ABORT_TASK: Found referenced iSCSI task_tag: 801219
> [ 3052.066370] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 801219
> [ 3082.714529] ABORT_TASK: Found referenced iSCSI task_tag: 801223
> [ 3082.714532] ABORT_TASK: ref_tag: 801223 already complete, skipping
> [ 3082.714533] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 801223
> [ 3082.714536] ABORT_TASK: Found referenced iSCSI task_tag: 801222
> [ 3082.714540] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 801222

We set up monitoring scripts that watch for these sorts of entries, followed by the inevitable LUN rescan that ESXi performs when it can't talk to one of its disks (a rough sketch of that watcher is at the bottom of this mail):

[261204.802785] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x00000007
[261204.805443] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x00000008
[261204.806166] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x00000009
[261204.809172] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x0000000a

...and so on for the next 200 or so lines.

The only way we've found to deal with this is to migrate our primary storage to the second host in the cluster, unceremoniously killing the iSCSI stack on the initial host and starting it on the second host. All this really accomplishes is resetting the connections and letting ESXi reconnect.

We're fairly heavily invested in this setup, and my question is: is there a flag to set somewhere, or a setting to tweak in the code, that would allow LIO to violate the strict rules of the SCSI spec so that this setup works? I'm going to HAVE to find a way around this very shortly, and I'd really rather the answer not be "replace the in-kernel iSCSI stack with TGT or SCST" just because they allow that sort of thing.

...Steve...

-----Original Message-----
From: target-devel-owner@xxxxxxxxxxxxxxx [mailto:target-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Alex Gorbachev
Sent: August 16, 2015 7:01 PM
To: Martin Svec <martin.svec@xxxxxxxx>
Cc: target-devel@xxxxxxxxxxxxxxx
Subject: Re: ESXi + LIO + Ceph RBD problem

Hi Martin,

We tested and ran into similar scenarios. Based on your description, you are running the RBD client on an OSD node, which is not recommended; nor is it recommended to run OSDs and MONs on the same nodes.

The ABORT_TASK errors have always been related to Ceph timeouts. Basically, my understanding is that an RBD request times out, and after a brief grace period LIO and ESXi enter into a hailstorm of retry/reset commands, at which point I/O pretty much won't resume. LIO stays strict to the SCSI spec, where the one data session is to be maintained, whereas my understanding is that other solutions such as TGT and SCST allow another session to start and so bypass this issue.

We have good results with three OSD nodes, three MONs, and two LIO nodes with failover via Pacemaker, on kernel 4.1.

Best regards,
Alex

On Fri, Aug 14, 2015 at 11:28 AM, Martin Svec <martin.svec@xxxxxxxx> wrote:
> Hello,
>
> I'm testing the LIO iSCSI target on top of Ceph RBD as an iSCSI datastore
> for VMware vSphere. When one of the Ceph OSD nodes is terminated
> during heavy I/O (Storage vMotion to RBD), both the initiator and the target
> side report ABORT_TASK-related errors and all I/O stops. It's necessary
> to drop the iSCSI connections and let ESXi reconnect to continue.
>
> ESXi warnings:
>
> WARNING: iscsi_vmk: iscsivmk_TaskMgmtIssue: vmhba35:CH:0 T:11 L:1 : Task mgmt "Abort Task" with itt=0x944d1 (refITT=0x944cd) timed out.
> VMW_SATP_ALUA: satp_alua_issueCommandOnPath:651: Path "vmhba35:C0:T11:L1" (UP) command 0xa3 failed with status Timeout. H:0x3 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
> WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60014057056b5748fdbb7c16c3a0bd46" state in doubt; requested fast path state update...
> WARNING: iscsi_vmk: iscsivmk_TaskMgmtAbortCommands: vmhba35:CH:0 T:11 L:1 : Abort task response indicates task with itt=0x944c7 has been completed on the target but the task response has not arrived
> ...and similar ones
>
> LIO warnings:
>
> [ 3052.065353] ABORT_TASK: Found referenced iSCSI task_tag: 801219
> [ 3052.066370] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 801219
> [ 3082.714529] ABORT_TASK: Found referenced iSCSI task_tag: 801223
> [ 3082.714532] ABORT_TASK: ref_tag: 801223 already complete, skipping
> [ 3082.714533] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 801223
> [ 3082.714536] ABORT_TASK: Found referenced iSCSI task_tag: 801222
> [ 3082.714540] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 801222
>
> I guess the errors are related to the hardcoded 5000 ms iSCSI timeout
> in ESXi, while the RBD driver needs a longer time to recover when one of
> the OSDs is lost. Is that possible? Does anybody have similar experience
> with ESXi + LIO iSCSI + Ceph? I tried to tweak a few Ceph heartbeat options,
> but I'm still at the beginning of the learning curve...
>
> My Ceph setup is very basic for now: 3 virtual machines with Debian Jessie
> and Ceph 0.80.7, one OSD and one MON on each VM. The iSCSI LUN is
> published from one of the nodes via a dedicated network adapter to the
> underlying vSphere infrastructure.
>
> Thank you.
>
> Martin
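Here is the rough sketch of the watcher/failover glue mentioned above. This is not our production script, just an illustration of the approach: it assumes journalctl is available for following the kernel log (tailing /var/log/kern.log works the same way), that failover is driven through crmsh ("crm resource migrate"), and the resource/node names "iscsi-group" and "node2" are placeholders for whatever your Pacemaker configuration actually uses.

#!/usr/bin/env python
# Watch the kernel log for the ABORT_TASK / NON_EXISTENT_LUN pattern described
# above and trigger a Pacemaker failover once the LUN-rescan storm starts.
# Assumptions (adjust for your cluster): journalctl for the kernel log,
# crmsh for Pacemaker, and placeholder resource/node names.
import re
import subprocess
import time

ABORT_RE = re.compile(r"ABORT_TASK: Found referenced iSCSI task_tag")
RESCAN_RE = re.compile(r"Detected NON_EXISTENT_LUN Access")

RESCAN_THRESHOLD = 50   # the real storm is ~200 lines; trip well before that
WINDOW_SECONDS = 60     # only count rescan messages seen within this window

def migrate_storage():
    # Placeholder failover: move the iSCSI resource group to the standby node.
    subprocess.call(["crm", "resource", "migrate", "iscsi-group", "node2"])

def main():
    proc = subprocess.Popen(["journalctl", "-kf", "-o", "cat"],
                            stdout=subprocess.PIPE, universal_newlines=True)
    rescan_times = []
    for line in proc.stdout:
        if ABORT_RE.search(line):
            print("warning: %s" % line.strip())
        elif RESCAN_RE.search(line):
            now = time.time()
            rescan_times = [t for t in rescan_times if now - t < WINDOW_SECONDS]
            rescan_times.append(now)
            if len(rescan_times) >= RESCAN_THRESHOLD:
                print("NON_EXISTENT_LUN storm detected, failing over")
                migrate_storage()
                rescan_times = []

if __name__ == "__main__":
    main()

The thresholds are arbitrary; the only goal is to react to the rescan storm automatically instead of waiting for someone to notice the datastore has gone away.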
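On the Ceph heartbeat options Martin mentions: those are normally set in ceph.conf. A minimal sketch, assuming the standard option names from the Ceph configuration reference; the values shown are purely illustrative, not recommendations (shortening the grace gets a dead OSD marked down sooner, which is what matters against ESXi's short abort timeout, at the cost of more false positives on a busy network):

[global]
    ; Illustrative values only; the Firefly-era defaults were roughly a 6 s
    ; heartbeat interval and a 20 s grace before an OSD is reported down.
    osd heartbeat interval = 6
    osd heartbeat grace = 20

[mon]
    ; How long a down OSD stays "in" before the cluster starts rebalancing.
    mon osd down out interval = 300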