Re: slow using ISCSI - Help-me

Hi

On Sun, Feb 9, 2020 at 18:27, Mike Christie <mchristi@xxxxxxxxxx> wrote:

> On 02/08/2020 11:34 PM, Gesiel Galvão Bernardes wrote:
> > Hi,
> >
> > On Thu, Feb 6, 2020 at 18:56, Mike Christie <mchristi@xxxxxxxxxx> wrote:
> >
> >     On 02/05/2020 07:03 AM, Gesiel Galvão Bernardes wrote:
> >     > On Sun, Feb 2, 2020 at 00:37, Gesiel Galvão Bernardes
> >     > <gesiel.bernardes@xxxxxxxxx> wrote:
> >     >
> >     >     Hi,
> >     >
> >     >     I was only now able to continue with this. Below is the
> >     >     requested information. Thanks in advance.
> >
> >
> >     Hey, sorry for the late reply. I just got back from PTO.
> >
> >     >
> >     >     esxcli storage nmp device list -d naa.6001405ba48e0b99e4c418ca13506c8e
> >     >     naa.6001405ba48e0b99e4c418ca13506c8e
> >     >        Device Display Name: LIO-ORG iSCSI Disk (naa.6001405ba48e0b99e4c418ca13506c8e)
> >     >        Storage Array Type: VMW_SATP_ALUA
> >     >        Storage Array Type Device Config: {implicit_support=on; explicit_support=off; explicit_allow=on; alua_followover=on; action_OnRetryErrors=on; {TPG_id=1,TPG_state=ANO}}
> >     >        Path Selection Policy: VMW_PSP_MRU
> >     >        Path Selection Policy Device Config: Current Path=vmhba68:C0:T0:L0
> >     >        Path Selection Policy Device Custom Config:
> >     >        Working Paths: vmhba68:C0:T0:L0
> >     >        Is USB: false
> >
> >     ........
> >
> >     >     Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x2 0x4 0xa. Act:FAILOVER
> >
> >
> >     Are you sure you are using tcmu-runner 1.4? Is that the actual daemon
> >     version running? Did you by any chance install the 1.4 rpm, but
> >     you/it did not restart the daemon? The error code above is returned
> >     in 1.3 and earlier.
> >
> >     You are probably hitting a combo of 2 issues.
> >
> >     We had only listed ESX 6.5 in the docs you probably saw, and in 6.7
> >     the value of action_OnRetryErrors defaults to on instead of off. You
> >     should set this back to off.
> >
> >     You should also upgrade to the current version of tcmu-runner, 1.5.x.
> >     It should fix the issue you are hitting: non-IO commands like INQUIRY,
> >     RTPG, etc. are executed while failing over/back, so you would not hit
> >     the problem where path initialization and path-testing IO fails,
> >     causing the path to be marked as failed.
> >
> >
> > I updated tcmu-runner to 1.5.2 and changed action_OnRetryErrors to off,
> > but the problem continues 😭
> >
> > Attached is vmkernel.log.
> >
>
>
> When you stopped the iscsi gw at around 2020-02-09T01:51:25.820Z, how
> many paths did your device have? Did:
>
> esxcli storage nmp path list -d your_device
>
> report only one path? Did
>
> esxcli iscsi session connection list
>
> show an iSCSI connection to each gw?
>
Hmm, I believe the problem may be here. I checked and found that only one
GW was listed for each path. So I ran a "rescan HBA" in VMware on both ESX
hosts. Now one of them lists all 3 gateways (I added one more), but the
other ESX host, with the same configuration, still lists only one gateway.
Compare the two outputs:

 [root@tcnvh7:~] esxcli iscsi session connection list
vmhba68,iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw,00023d000001,0
   Adapter: vmhba68
   Target: iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
   ISID: 00023d000001
   CID: 0
   DataDigest: NONE
   HeaderDigest: NONE
   IFMarker: false
   IFMarkerInterval: 0
   MaxRecvDataSegmentLength: 131072
   MaxTransmitDataSegmentLength: 262144
   OFMarker: false
   OFMarkerInterval: 0
   ConnectionAddress: 192.168.201.1
   RemoteAddress: 192.168.201.1
   LocalAddress: 192.168.201.107
   SessionCreateTime: 01/19/20 00:11:25
   ConnectionCreateTime: 01/19/20 00:11:25
   ConnectionStartTime: 02/13/20 23:03:10
   State: logged_in

vmhba68,iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw,00023d000002,0
   Adapter: vmhba68
   Target: iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
   ISID: 00023d000002
   CID: 0
   DataDigest: NONE
   HeaderDigest: NONE
   IFMarker: false
   IFMarkerInterval: 0
   MaxRecvDataSegmentLength: 131072
   MaxTransmitDataSegmentLength: 262144
   OFMarker: false
   OFMarkerInterval: 0
   ConnectionAddress: 192.168.201.2
   RemoteAddress: 192.168.201.2
   LocalAddress: 192.168.201.107
   SessionCreateTime: 02/13/20 23:09:16
   ConnectionCreateTime: 02/13/20 23:09:16
   ConnectionStartTime: 02/13/20 23:09:16
   State: logged_in

vmhba68,iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw,00023d000003,0
   Adapter: vmhba68
   Target: iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
   ISID: 00023d000003
   CID: 0
   DataDigest: NONE
   HeaderDigest: NONE
   IFMarker: false
   IFMarkerInterval: 0
   MaxRecvDataSegmentLength: 131072
   MaxTransmitDataSegmentLength: 262144
   OFMarker: false
   OFMarkerInterval: 0
   ConnectionAddress: 192.168.201.3
   RemoteAddress: 192.168.201.3
   LocalAddress: 192.168.201.107
   SessionCreateTime: 02/13/20 23:09:16
   ConnectionCreateTime: 02/13/20 23:09:16
   ConnectionStartTime: 02/13/20 23:09:16
   State: logged_in

=====
[root@tcnvh8:~] esxcli iscsi session connection list
vmhba68,iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw,00023d000001,0
   Adapter: vmhba68
   Target: iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
   ISID: 00023d000001
   CID: 0
   DataDigest: NONE
   HeaderDigest: NONE
   IFMarker: false
   IFMarkerInterval: 0
   MaxRecvDataSegmentLength: 131072
   MaxTransmitDataSegmentLength: 262144
   OFMarker: false
   OFMarkerInterval: 0
   ConnectionAddress: 192.168.201.1
   RemoteAddress: 192.168.201.1
   LocalAddress: 192.168.201.108
   SessionCreateTime: 01/12/20 02:53:53
   ConnectionCreateTime: 01/12/20 02:53:53
   ConnectionStartTime: 02/13/20 23:06:40
   State: logged_in

Is that the problem? Any ideas on how to proceed from here?
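
To make the difference between the two hosts explicit, the saved text of `esxcli iscsi session connection list` can be checked against the list of gateway portals each host is expected to reach. The following is only a minimal sketch: the function names, the `EXPECTED` list (taken from the tcnvh7 output above), and the shortened sample text are illustrative assumptions, not part of any VMware or Ceph tooling.

```python
import re

def remote_addresses(esxcli_output):
    """Collect every RemoteAddress value found in the saved esxcli output."""
    return set(re.findall(r"^\s*RemoteAddress:\s*(\S+)\s*$",
                          esxcli_output, flags=re.MULTILINE))

def missing_gateways(esxcli_output, expected):
    """Return the expected gateway IPs that have no logged-in connection listed."""
    return sorted(set(expected) - remote_addresses(esxcli_output))

# Assumption: the three gateway portals seen on tcnvh7.
EXPECTED = ["192.168.201.1", "192.168.201.2", "192.168.201.3"]

# Shortened copy of the tcnvh8 output (only the fields used here).
TCNVH8_SAMPLE = """\
vmhba68,iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw,00023d000001,0
   Adapter: vmhba68
   ConnectionAddress: 192.168.201.1
   RemoteAddress: 192.168.201.1
   LocalAddress: 192.168.201.108
   State: logged_in
"""

if __name__ == "__main__":
    # Prints ['192.168.201.2', '192.168.201.3'] for the tcnvh8 sample.
    print(missing_gateways(TCNVH8_SAMPLE, EXPECTED))
```

Run against the full tcnvh7 output, the same check would return an empty list, since all three gateways appear there.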

> The logs look like when you brought the gw down, we lost the only path
> we had. We then went into all paths down, so IO could not execute. It
> looks like the gw was brought back up at the end of the log and the path
> seems to have been added back.
>
>
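
For reference, the three values after `Valid sense data:` in the vmkernel line quoted earlier in this thread are the SCSI sense key, ASC, and ASCQ. The triple 0x2 0x4 0xa is NOT READY with "logical unit not accessible, asymmetric access state transition", which is consistent with an ALUA failover being in progress. Below is a small sketch for decoding such lines; the helper name is hypothetical and the lookup tables deliberately cover only a few codes rather than the full T10 list.

```python
import re

# Partial tables only; extend from the T10 ASC/ASCQ list as needed.
SENSE_KEYS = {0x2: "NOT READY", 0x5: "ILLEGAL REQUEST", 0x6: "UNIT ATTENTION"}
ASC_ASCQ = {
    (0x04, 0x0A): "LOGICAL UNIT NOT ACCESSIBLE, ASYMMETRIC ACCESS STATE TRANSITION",
    (0x04, 0x0B): "LOGICAL UNIT NOT ACCESSIBLE, TARGET PORT IN STANDBY STATE",
}

_SENSE_RE = re.compile(
    r"Valid sense data:\s*(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)"
)

def decode_sense(line):
    """Return (sense key text, ASC/ASCQ text) for a vmkernel log line, or None."""
    match = _SENSE_RE.search(line)
    if match is None:
        return None
    key, asc, ascq = (int(group, 16) for group in match.groups())
    key_text = SENSE_KEYS.get(key, hex(key))
    # Fall back to the raw numbers when the pair is not in the partial table.
    detail = ASC_ASCQ.get((asc, ascq), f"ASC/ASCQ {asc:#04x}/{ascq:#04x}")
    return key_text, detail

if __name__ == "__main__":
    line = "Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x2 0x4 0xa. Act:FAILOVER"
    print(decode_sense(line))
```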
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



