Re: slow using ISCSI - Help-me

On 02/13/2020 08:52 PM, Gesiel Galvão Bernardes wrote:
> Hi
> 
> On Sun, Feb 9, 2020 at 18:27, Mike Christie <mchristi@xxxxxxxxxx> wrote:
> 
>     On 02/08/2020 11:34 PM, Gesiel Galvão Bernardes wrote:
>     > Hi,
>     >
>     > On Thu, Feb 6, 2020 at 18:56, Mike Christie <mchristi@xxxxxxxxxx> wrote:
>     >
>     >     On 02/05/2020 07:03 AM, Gesiel Galvão Bernardes wrote:
>     >     > On Sun, Feb 2, 2020 at 00:37, Gesiel Galvão Bernardes
>     >     > <gesiel.bernardes@xxxxxxxxx> wrote:
>     >     >
>     >     >     Hi,
>     >     >
>     >     >     I was only able to get back to this now. Below is the
>     >     >     information requested. Thanks in advance.
>     >
>     >
>     >     Hey, sorry for the late reply. I just got back from PTO.
>     >
>     >     >
>     >     >     esxcli storage nmp device list -d naa.6001405ba48e0b99e4c418ca13506c8e
>     >     >     naa.6001405ba48e0b99e4c418ca13506c8e
>     >     >        Device Display Name: LIO-ORG iSCSI Disk (naa.6001405ba48e0b99e4c418ca13506c8e)
>     >     >        Storage Array Type: VMW_SATP_ALUA
>     >     >        Storage Array Type Device Config: {implicit_support=on; explicit_support=off; explicit_allow=on; alua_followover=on; action_OnRetryErrors=on; {TPG_id=1,TPG_state=ANO}}
>     >     >        Path Selection Policy: VMW_PSP_MRU
>     >     >        Path Selection Policy Device Config: Current Path=vmhba68:C0:T0:L0
>     >     >        Path Selection Policy Device Custom Config:
>     >     >        Working Paths: vmhba68:C0:T0:L0
>     >     >        Is USB: false
>     >
>     >     ........
>     >
>     >     >     Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x2 0x4 0xa. Act:FAILOVER
>     >
>     >
>     >     Are you sure you are using tcmu-runner 1.4? Is that the version of the
>     >     daemon that is actually running? Did you by any chance install the 1.4
>     >     rpm, but neither you nor the package restarted the daemon? The error
>     >     code above is returned in 1.3 and earlier.
>     >
>     >     You are probably hitting a combo of 2 issues.
>     >
>     >     We had only listed ESX 6.5 in the docs you probably saw, and in 6.7 the
>     >     value of action_OnRetryErrors defaulted to on instead of off. You should
>     >     set this back to off.
>     >
>     >     You should also upgrade to the current version of tcmu-runner, 1.5.x. It
>     >     should fix the issue you are hitting: non-IO commands like inquiry, RTPG,
>     >     etc. are executed while failing over/back, so you would not hit the
>     >     problem where path initialization and path testing IO is failed, causing
>     >     the path to be marked as failed.
>     >
>     >
>     > I updated tcmu-runner to 1.5.2 and changed action_OnRetryErrors to off,
>     > but the problem continues 😭
>     > 
>     > Attached is vmkernel.log.
>     >
> 
> 
>     When you stopped the iscsi gw at around 2020-02-09T01:51:25.820Z, how
>     many paths did your device have? Did:
> 
>     esxcli storage nmp path list -d your_device
> 
>     report only one path? Did
> 
>     esxcli iscsi session connection list
> 
>     show an iSCSI connection to each gw?
> 
> Hmmm, I believe the problem may be here. I verified that only one GW was
> listed for each path. So I ran a "rescan HBA" in VMware on both ESX hosts.
> Now one of them lists the 3 gateways (I added one more), but the other ESX
> host, with the same configuration, continues to list only one gateway. See
> the different outputs:
> 
>  [root@tcnvh7:~] esxcli iscsi session connection list
> vmhba68,iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw,00023d000001,0
>    Adapter: vmhba68
>    Target: iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
>    ISID: 00023d000001
>    CID: 0
>    DataDigest: NONE
>    HeaderDigest: NONE
>    IFMarker: false
>    IFMarkerInterval: 0
>    MaxRecvDataSegmentLength: 131072
>    MaxTransmitDataSegmentLength: 262144
>    OFMarker: false
>    OFMarkerInterval: 0
>    ConnectionAddress: 192.168.201.1
>    RemoteAddress: 192.168.201.1
>    LocalAddress: 192.168.201.107
>    SessionCreateTime: 01/19/20 00:11:25
>    ConnectionCreateTime: 01/19/20 00:11:25
>    ConnectionStartTime: 02/13/20 23:03:10
>    State: logged_in
> 
> vmhba68,iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw,00023d000002,0
>    Adapter: vmhba68
>    Target: iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
>    ISID: 00023d000002
>    CID: 0
>    DataDigest: NONE
>    HeaderDigest: NONE
>    IFMarker: false
>    IFMarkerInterval: 0
>    MaxRecvDataSegmentLength: 131072
>    MaxTransmitDataSegmentLength: 262144
>    OFMarker: false
>    OFMarkerInterval: 0
>    ConnectionAddress: 192.168.201.2
>    RemoteAddress: 192.168.201.2
>    LocalAddress: 192.168.201.107
>    SessionCreateTime: 02/13/20 23:09:16
>    ConnectionCreateTime: 02/13/20 23:09:16
>    ConnectionStartTime: 02/13/20 23:09:16
>    State: logged_in
> 
> vmhba68,iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw,00023d000003,0
>    Adapter: vmhba68
>    Target: iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
>    ISID: 00023d000003
>    CID: 0
>    DataDigest: NONE
>    HeaderDigest: NONE
>    IFMarker: false
>    IFMarkerInterval: 0
>    MaxRecvDataSegmentLength: 131072
>    MaxTransmitDataSegmentLength: 262144
>    OFMarker: false
>    OFMarkerInterval: 0
>    ConnectionAddress: 192.168.201.3
>    RemoteAddress: 192.168.201.3
>    LocalAddress: 192.168.201.107
>    SessionCreateTime: 02/13/20 23:09:16
>    ConnectionCreateTime: 02/13/20 23:09:16
>    ConnectionStartTime: 02/13/20 23:09:16
>    State: logged_in
> 
> =====
> [root@tcnvh8:~] esxcli iscsi session connection list
> vmhba68,iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw,00023d000001,0
>    Adapter: vmhba68
>    Target: iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
>    ISID: 00023d000001
>    CID: 0
>    DataDigest: NONE
>    HeaderDigest: NONE
>    IFMarker: false
>    IFMarkerInterval: 0
>    MaxRecvDataSegmentLength: 131072
>    MaxTransmitDataSegmentLength: 262144
>    OFMarker: false
>    OFMarkerInterval: 0
>    ConnectionAddress: 192.168.201.1
>    RemoteAddress: 192.168.201.1
>    LocalAddress: 192.168.201.108
>    SessionCreateTime: 01/12/20 02:53:53
>    ConnectionCreateTime: 01/12/20 02:53:53
>    ConnectionStartTime: 02/13/20 23:06:40
>    State: logged_in
> 
> Is that the problem? Any ideas on how to proceed from here?
> 

Yes. Normally, you would have the connections already created, and when
one path/gateway goes down, the multipath layer switches to another path.
When the path/gateway comes back up, the initiator side's iSCSI layer
reconnects automatically and the multipath layer sets the path structure
up again, so it can fail back if it's a higher-priority path or fail over
later if other paths go down.
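
To compare hosts quickly, the output of "esxcli iscsi session connection
list" can be checked against the list of gateways you expect. This is only a
rough sketch: the function names and the GATEWAYS list are made up for
illustration, the sample text is trimmed from the tcnvh8 output you pasted,
and it assumes each connection block has a "RemoteAddress:" line followed by
a "State:" line, as in that output.

```python
# Hypothetical helper: find gateways that a host has no logged-in
# iSCSI connection to, based on pasted esxcli output.

# Assumed gateway portal addresses, taken from the tcnvh7 output in this thread.
GATEWAYS = ["192.168.201.1", "192.168.201.2", "192.168.201.3"]

# Trimmed copy of the tcnvh8 output from this thread.
TCNVH8_SAMPLE = """\
vmhba68,iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw,00023d000001,0
   Adapter: vmhba68
   Target: iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
   RemoteAddress: 192.168.201.1
   LocalAddress: 192.168.201.108
   State: logged_in
"""


def remote_addresses(esxcli_output):
    """Return the set of RemoteAddress values whose State is logged_in."""
    addresses = set()
    current = None
    for raw in esxcli_output.splitlines():
        line = raw.strip()
        if line.startswith("RemoteAddress:"):
            current = line.split(":", 1)[1].strip()
        elif line.startswith("State:"):
            state = line.split(":", 1)[1].strip()
            if current and state == "logged_in":
                addresses.add(current)
            current = None  # the State line closes one connection block
    return addresses


def missing_gateways(esxcli_output, expected):
    """Return the expected gateways with no logged-in connection, sorted."""
    return sorted(set(expected) - remote_addresses(esxcli_output))


if __name__ == "__main__":
    for gw in missing_gateways(TCNVH8_SAMPLE, GATEWAYS):
        print("no logged-in connection to", gw)
```

On a host that is set up correctly, nothing should be printed. For the
tcnvh8 sample it reports the two gateways that are missing.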

Something happened with the automatic path connection process on that
node. We know it works for that one gateway you brought up/down. For the
other gateways I would check:

1. Check that all target portals are being discovered. In the GUI screen
where you entered the discovery address, you should also see a list of all
target portals that were found in the static section. Do you only see 1
portal?

See here:

https://docs.vmware.com/en/VMware-vSphere/6.7/com.vmware.vsphere.storage.doc/GUID-66215AF3-2D81-4D1F-92D4-B9623FC1CB0E.html

2. If you see all the portals, then when you hit the rescan HBA button,
do you see any errors on the target side in /var/log/messages? Maybe
something about CHAP/login/auth errors?

What about in /var/log/vmkernel.log on the initiator side? Any iSCSI
errors?
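
When going through vmkernel.log, status lines like the one quoted earlier in
this thread (H:0x0 D:0x2 P:0x0 Valid sense data: 0x2 0x4 0xa) can be decoded
with a small script. This is a sketch only: the function name is made up, the
regex assumes the line format seen in this thread, and the lookup tables are
deliberately partial (they cover only a few standard SCSI codes, and anything
else is reported as "unknown").

```python
import re

# Matches the host/device/plugin status and sense triple as it appears
# in the log line quoted in this thread (format assumed from that line).
SENSE_RE = re.compile(
    r"H:(0x[0-9a-fA-F]+) D:(0x[0-9a-fA-F]+) P:(0x[0-9a-fA-F]+) "
    r"Valid sense data: (0x[0-9a-fA-F]+) (0x[0-9a-fA-F]+) (0x[0-9a-fA-F]+)"
)

# Partial tables of standard SCSI values; extend as needed.
DEVICE_STATUS = {0x0: "GOOD", 0x2: "CHECK CONDITION", 0x8: "BUSY"}
SENSE_KEYS = {0x2: "NOT READY", 0x5: "ILLEGAL REQUEST", 0x6: "UNIT ATTENTION"}
ASC_ASCQ = {
    (0x04, 0x0A): "logical unit not accessible, asymmetric access state transition",
    (0x04, 0x0B): "logical unit not accessible, target port in standby state",
}


def decode_sense(line):
    """Decode one log line; return None if it has no sense data."""
    m = SENSE_RE.search(line)
    if m is None:
        return None
    host, device, plugin, key, asc, ascq = (int(g, 16) for g in m.groups())
    return {
        "host_status": host,
        "plugin_status": plugin,
        "device_status": DEVICE_STATUS.get(device, "unknown"),
        "sense_key": SENSE_KEYS.get(key, "unknown"),
        "additional_sense": ASC_ASCQ.get((asc, ascq), "unknown"),
    }


if __name__ == "__main__":
    sample = "Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x2 0x4 0xa. Act:FAILOVER"
    print(decode_sense(sample))
```

For the line from this thread, that gives CHECK CONDITION with NOT READY and
an ALUA state transition, i.e. the target reported that the path was in the
middle of a failover when the command arrived.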
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



