Re: slow using ISCSI - Help-me




Hi,

Only now was I able to continue with this. Below is the requested
information. Thanks in advance,

Gesiel

On Mon, Jan 20, 2020 at 3:06 PM, Mike Christie <mchristi@xxxxxxxxxx>
wrote:

> On 01/20/2020 10:29 AM, Gesiel Galvão Bernardes wrote:
> > Hi,
> >
> > Only now have I been able to act on this problem. My environment is
> > relatively simple: I have two ESXi 6.7 hosts connected to two iSCSI
> > gateways, using two RBD images.
> >
> > When this thread started, the workaround was to keep only one iSCSI
> > gateway connected; that way it works normally. After the reply I
> > received here suggesting the problem could be in the VMware
> > configuration, I reviewed the configuration of both gateways (they
> > match the Ceph documentation exactly) and rebooted both.
> >
>
> Can you give me some basic info?
>
> The output of:
> # "gwcli ls"
>


 o- /
.........................................................................................................
[...]
  o- cluster
.........................................................................................
[Clusters: 1]
  | o- ceph
............................................................................................
[HEALTH_OK]
  |   o- pools
..........................................................................................
[Pools: 7]
  |   | o- data_ecpool ................................. [(x3), Commit:
0.00Y/25836962M (0%), Used: 25011763551408b]
  |   | o- pool1 ....................................... [(x2), Commit:
5.0T/38755444M (13%), Used: 16162480333070b]
  |   | o- pool2 ....................................... [(x3), Commit:
6.0T/25836962M (24%), Used: 24261305669809b]
  |   | o- pool3 ........................................ [(4+0), Commit:
0.00Y/58133164M (0%), Used: 719866819603b]
  |   | o- pool_cache ...................................... [(x3), Commit:
0.00Y/706164M (0%), Used: 422549509342b]
  |   | o- poolfs ................................................ [(x3),
Commit: 0.00Y/25836962M (0%), Used: 0.00Y]
  |   | o- rbd ................................................... [(x3),
Commit: 0.00Y/25836962M (0%), Used: 5737b]
  |   o- topology
...............................................................................
[OSDs: 61,MONs: 2]
  o- disks
.......................................................................................
[11.0T, Disks: 3]
  | o- pool1
........................................................................................
[pool1 (5.0T)]
  | | o- vmware_iscsi1
................................................................
[pool1/vmware_iscsi1 (5.0T)]
  | o- pool2
........................................................................................
[pool2 (6.0T)]
  |   o- iscsi-test
......................................................................
[pool2/iscsi-test (1.0T)]
  |   o- vmware_iscsi2
................................................................
[pool2/vmware_iscsi2 (5.0T)]
  o- iscsi-targets
...............................................................
[DiscoveryAuth: None, Targets: 1]
    o- iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
......................................... [Auth: None, Gateways: 2]
      o- disks
..........................................................................................
[Disks: 3]
      | o- pool1/vmware_iscsi1
................................................................ [Owner:
ceph-iscsi2]
      | o- pool2/iscsi-test
................................................................... [Owner:
ceph-iscsi2]
      | o- pool2/vmware_iscsi2
................................................................ [Owner:
ceph-iscsi1]
      o- gateways
............................................................................
[Up: 2/2, Portals: 2]
      | o- ceph-iscsi1
........................................................................
[192.168.201.1 (UP)]
      | o- ceph-iscsi2
........................................................................
[192.168.201.2 (UP)]
      o- host-groups
..................................................................................
[Groups : 0]
      o- hosts
.......................................................................
[Auth: ACL_ENABLED, Hosts: 2]
        o- iqn.1994-05.com.redhat:rh7-client ..............................
[LOGGED-IN, Auth: CHAP, Disks: 2(10.0T)]
        | o- lun 0 .................................................
[pool1/vmware_iscsi1(5.0T), Owner: ceph-iscsi2]
        | o- lun 1 .................................................
[pool2/vmware_iscsi2(5.0T), Owner: ceph-iscsi1]
        o- iqn.1994-05.com.redhat:tcnvh8 ..................................
[LOGGED-IN, Auth: CHAP, Disks: 2(10.0T)]
          o- lun 0 .................................................
[pool1/vmware_iscsi1(5.0T), Owner: ceph-iscsi2]
          o- lun 1 .................................................
[pool2/vmware_iscsi2(5.0T), Owner: ceph-iscsi1]
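As a side note, the owner split above can be eyeballed quickly from saved
`gwcli ls` output. This is just a grep sketch; the heredoc only holds
trimmed "Owner:" lines from the listing above, and the /tmp path is
arbitrary:

```shell
# Count disks per owning gateway from saved `gwcli ls` output.
# Sample lines trimmed from the listing above; file path is arbitrary.
cat > /tmp/gwcli-owners.txt <<'EOF'
o- pool1/vmware_iscsi1 [Owner: ceph-iscsi2]
o- pool2/iscsi-test [Owner: ceph-iscsi2]
o- pool2/vmware_iscsi2 [Owner: ceph-iscsi1]
EOF
# Extract the owner of each disk and count per gateway.
grep -o 'Owner: [^]]*' /tmp/gwcli-owners.txt | sort | uniq -c
```

Here two of the three disks are owned by ceph-iscsi2 and one by
ceph-iscsi1, which matches the listing.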
>
> from one of the iscsi nodes, and give me the output of:
>
> # targetcli ls
>

ceph-iscsi1 ~]# targetcli ls
Warning: Could not load preferences file /root/.targetcli/prefs.bin.
o- /
.........................................................................................................
[...]
  o- backstores
..............................................................................................
[...]
  | o- block
..................................................................................
[Storage Objects: 0]
  | o- fileio
.................................................................................
[Storage Objects: 0]
  | o- pscsi
..................................................................................
[Storage Objects: 0]
  | o- ramdisk
................................................................................
[Storage Objects: 0]
  | o- user:glfs
..............................................................................
[Storage Objects: 0]
  | o- user:qcow
..............................................................................
[Storage Objects: 0]
  | o- user:rbd
...............................................................................
[Storage Objects: 3]
  | | o- pool1.vmware_iscsi1 ............................
[pool1/vmware_iscsi1;osd_op_timeout=30 (5.0TiB) activated]
  | | | o- alua
...................................................................................
[ALUA Groups: 3]
  | | |   o- ano1
............................................................... [ALUA
state: Active/non-optimized]
  | | |   o- ao
..................................................................... [ALUA
state: Active/optimized]
  | | |   o- default_tg_pt_gp
....................................................... [ALUA state:
Active/optimized]
  | | o- pool2.iscsi-test ..................................
[pool2/iscsi-test;osd_op_timeout=30 (1.0TiB) activated]
  | | | o- alua
...................................................................................
[ALUA Groups: 3]
  | | |   o- ano1
............................................................... [ALUA
state: Active/non-optimized]
  | | |   o- ao
..................................................................... [ALUA
state: Active/optimized]
  | | |   o- default_tg_pt_gp
....................................................... [ALUA state:
Active/optimized]
  | | o- pool2.vmware_iscsi2 ............................
[pool2/vmware_iscsi2;osd_op_timeout=30 (5.0TiB) activated]
  | |   o- alua
...................................................................................
[ALUA Groups: 3]
  | |     o- ano2
............................................................... [ALUA
state: Active/non-optimized]
  | |     o- ao
..................................................................... [ALUA
state: Active/optimized]
  | |     o- default_tg_pt_gp
....................................................... [ALUA state:
Active/optimized]
  | o- user:zbc
...............................................................................
[Storage Objects: 0]
  o- iscsi
............................................................................................
[Targets: 1]
  | o- iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
......................................................... [TPGs: 2]
  |   o- tpg1
..........................................................................
[no-gen-acls, auth per-acl]
  |   | o- acls
..........................................................................................
[ACLs: 2]
  |   | | o- iqn.1994-05.com.redhat:rh7-client
........................................ [1-way auth, Mapped LUNs: 2]
  |   | | | o- mapped_lun0
.................................................... [lun0
user/pool1.vmware_iscsi1 (rw)]
  |   | | | o- mapped_lun1
.................................................... [lun2
user/pool2.vmware_iscsi2 (rw)]
  |   | | o- iqn.1994-05.com.redhat:tcnvh8
............................................ [1-way auth, Mapped LUNs: 2]
  |   | |   o- mapped_lun0
.................................................... [lun0
user/pool1.vmware_iscsi1 (rw)]
  |   | |   o- mapped_lun1
.................................................... [lun2
user/pool2.vmware_iscsi2 (rw)]
  |   | o- luns
..........................................................................................
[LUNs: 3]
  |   | | o- lun0
................................................................
[user/pool1.vmware_iscsi1 (ano1)]
  |   | | o- lun1
...................................................................
[user/pool2.iscsi-test (ano1)]
  |   | | o- lun2
..................................................................
[user/pool2.vmware_iscsi2 (ao)]
  |   | o- portals
....................................................................................
[Portals: 1]
  |   |   o- 192.168.201.1:3260
...............................................................................
[OK]
  |   o- tpg2
...........................................................................................
[disabled]
  |     o- acls
..........................................................................................
[ACLs: 0]
  |     o- luns
..........................................................................................
[LUNs: 3]
  |     | o- lun0
..................................................................
[user/pool1.vmware_iscsi1 (ao)]
  |     | o- lun1
.....................................................................
[user/pool2.iscsi-test (ao)]
  |     | o- lun2
................................................................
[user/pool2.vmware_iscsi2 (ano2)]
  |     o- portals
....................................................................................
[Portals: 1]
  |       o- 192.168.201.2:3260
...............................................................................
[OK]
  o- loopback
.........................................................................................
[Targets: 0]


ceph-iscsi2 ~]# targetcli ls
Warning: Could not load preferences file /root/.targetcli/prefs.bin.
o- /
.........................................................................................................
[...]
  o- backstores
..............................................................................................
[...]
  | o- block
..................................................................................
[Storage Objects: 0]
  | o- fileio
.................................................................................
[Storage Objects: 0]
  | o- pscsi
..................................................................................
[Storage Objects: 0]
  | o- ramdisk
................................................................................
[Storage Objects: 0]
  | o- user:glfs
..............................................................................
[Storage Objects: 0]
  | o- user:qcow
..............................................................................
[Storage Objects: 0]
  | o- user:rbd
...............................................................................
[Storage Objects: 3]
  | | o- pool1.vmware_iscsi1 ............................
[pool1/vmware_iscsi1;osd_op_timeout=30 (5.0TiB) activated]
  | | | o- alua
...................................................................................
[ALUA Groups: 3]
  | | |   o- ano1
............................................................... [ALUA
state: Active/non-optimized]
  | | |   o- ao
..................................................................... [ALUA
state: Active/optimized]
  | | |   o- default_tg_pt_gp
....................................................... [ALUA state:
Active/optimized]
  | | o- pool2.iscsi-test ..................................
[pool2/iscsi-test;osd_op_timeout=30 (1.0TiB) activated]
  | | | o- alua
...................................................................................
[ALUA Groups: 3]
  | | |   o- ano1
............................................................... [ALUA
state: Active/non-optimized]
  | | |   o- ao
..................................................................... [ALUA
state: Active/optimized]
  | | |   o- default_tg_pt_gp
....................................................... [ALUA state:
Active/optimized]
  | | o- pool2.vmware_iscsi2 ............................
[pool2/vmware_iscsi2;osd_op_timeout=30 (5.0TiB) activated]
  | |   o- alua
...................................................................................
[ALUA Groups: 3]
  | |     o- ano2
............................................................... [ALUA
state: Active/non-optimized]
  | |     o- ao
..................................................................... [ALUA
state: Active/optimized]
  | |     o- default_tg_pt_gp
....................................................... [ALUA state:
Active/optimized]
  | o- user:zbc
...............................................................................
[Storage Objects: 0]
  o- iscsi
............................................................................................
[Targets: 1]
  | o- iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
......................................................... [TPGs: 2]
  |   o- tpg1
...........................................................................................
[disabled]
  |   | o- acls
..........................................................................................
[ACLs: 0]
  |   | o- luns
..........................................................................................
[LUNs: 3]
  |   | | o- lun0
................................................................
[user/pool1.vmware_iscsi1 (ano1)]
  |   | | o- lun1
...................................................................
[user/pool2.iscsi-test (ano1)]
  |   | | o- lun2
..................................................................
[user/pool2.vmware_iscsi2 (ao)]
  |   | o- portals
....................................................................................
[Portals: 1]
  |   |   o- 192.168.201.1:3260
...............................................................................
[OK]
  |   o- tpg2
..........................................................................
[no-gen-acls, auth per-acl]
  |     o- acls
..........................................................................................
[ACLs: 2]
  |     | o- iqn.1994-05.com.redhat:rh7-client
........................................ [1-way auth, Mapped LUNs: 2]
  |     | | o- mapped_lun0
.................................................... [lun0
user/pool1.vmware_iscsi1 (rw)]
  |     | | o- mapped_lun1
.................................................... [lun2
user/pool2.vmware_iscsi2 (rw)]
  |     | o- iqn.1994-05.com.redhat:tcnvh8
............................................ [1-way auth, Mapped LUNs: 2]
  |     |   o- mapped_lun0
.................................................... [lun0
user/pool1.vmware_iscsi1 (rw)]
  |     |   o- mapped_lun1
.................................................... [lun2
user/pool2.vmware_iscsi2 (rw)]
  |     o- luns
..........................................................................................
[LUNs: 3]
  |     | o- lun0
..................................................................
[user/pool1.vmware_iscsi1 (ao)]
  |     | o- lun1
.....................................................................
[user/pool2.iscsi-test (ao)]
  |     | o- lun2
................................................................
[user/pool2.vmware_iscsi2 (ano2)]
  |     o- portals
....................................................................................
[Portals: 1]
  |       o- 192.168.201.2:3260
...............................................................................
[OK]
  o- loopback
.........................................................................................
[Targets: 0]
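For what it's worth, in this explicit-ALUA setup each enabled TPG should
export exactly one of the vmware images as Active/Optimized (ao). A quick
grep over saved `targetcli ls` lines makes that easy to check; this is
only a sketch, with the heredoc holding trimmed LUN lines from the tpg1
section above and an arbitrary /tmp path:

```shell
# Show which LUNs a TPG exports as Active/Optimized (ao).
# Trimmed LUN lines from the tpg1 listing above; path is arbitrary.
cat > /tmp/tpg1-luns.txt <<'EOF'
o- lun0 [user/pool1.vmware_iscsi1 (ano1)]
o- lun1 [user/pool2.iscsi-test (ano1)]
o- lun2 [user/pool2.vmware_iscsi2 (ao)]
EOF
grep '(ao)' /tmp/tpg1-luns.txt    # only lun2 is AO on this portal
```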



> from both iscsi nodes.
>
> The ceph-iscsi and tcmu-runner versions, and did you build them yourself,
> get them from the ceph repos, or get them from a distro repo?
>

On both gateways I use ceph-iscsi-3.3-1.el7.noarch from the ceph-iscsi repo
(http://download.ceph.com/ceph-iscsi/3/rpm/el7/noarch) and
tcmu-runner-1.4.0-0.1.51.geef5115.el7.x86_64 from
https://3.chacra.ceph.com/r/tcmu-runner/master/eef511565078fb4e2ed52caaff16e6c7e75ed6c3/centos/7/flavors/default/x86_64/tcmu-runner-1.4.0-0.1.51.geef5115.el7.x86_64.rpm
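Since gateway behavior can diverge when package builds differ, a quick
check that both nodes run identical builds can help. This is a minimal
sketch: the version strings are hard-coded from the packages listed above,
but on real gateways you would capture them with
`rpm -q ceph-iscsi tcmu-runner`:

```shell
# Compare package builds across the two gateways (sketch).
# On each node: rpm -q ceph-iscsi tcmu-runner
gw1="ceph-iscsi-3.3-1.el7.noarch tcmu-runner-1.4.0-0.1.51.geef5115.el7.x86_64"
gw2="ceph-iscsi-3.3-1.el7.noarch tcmu-runner-1.4.0-0.1.51.geef5115.el7.x86_64"
if [ "$gw1" = "$gw2" ]; then
    echo "gateway package builds match"
else
    echo "version skew: gw1=$gw1 gw2=$gw2"
fi
```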


>
> > It turns out that now the gateway "ceph-iscsi1" is working. When I turn
> > on "ceph-iscsi2", it appears as "Active/Non-optimized" for both RBD
> > images (before, there was one "Active/Optimized" path per image), and if
> > I turn off "ceph-iscsi1", then "ceph-iscsi2" remains
> > "Active/Non-optimized" and the images become unavailable.
>
> On the ESX side can you give me the output of:
>
> esxcli storage nmp path list -d disk_id
>
 esxcli storage nmp path list -d naa.6001405ba48e0b99e4c418ca13506c8e
iqn.1994-05.com.redhat:rh7-client-00023d000001,iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw,t,1-naa.6001405ba48e0b99e4c418ca13506c8e
   Runtime Name: vmhba68:C0:T0:L0
   Device: naa.6001405ba48e0b99e4c418ca13506c8e
   Device Display Name: LIO-ORG iSCSI Disk
(naa.6001405ba48e0b99e4c418ca13506c8e)
   Group State: active unoptimized
   Array Priority: 0
   Storage Array Type Path Config:
{TPG_id=1,TPG_state=ANO,RTP_id=1,RTP_health=UP}
   Path Selection Policy Path Config: {current path; rank: 0}



> esxcli storage core device list -d disk_id
>
:~] esxcli storage core device list -d naa.6001405ba48e0b99e4c418ca13506c8e
naa.6001405ba48e0b99e4c418ca13506c8e
   Display Name: LIO-ORG iSCSI Disk (naa.6001405ba48e0b99e4c418ca13506c8e)
   Has Settable Display Name: true
   Size: 5242880
   Device Type: Direct-Access
   Multipath Plugin: NMP
   Devfs Path: /vmfs/devices/disks/naa.6001405ba48e0b99e4c418ca13506c8e
   Vendor: LIO-ORG
   Model: TCMU device
   Revision: 0002
   SCSI Level: 5
   Is Pseudo: false
   Status: degraded
   Is RDM Capable: true
   Is Local: false
   Is Removable: false
   Is SSD: false
   Is VVOL PE: false
   Is Offline: false
   Is Perennially Reserved: false
   Queue Full Sample Size: 0
   Queue Full Threshold: 0
   Thin Provisioning Status: yes
   Attached Filters:
   VAAI Status: supported
   Other UIDs: vml.02000000006001405ba48e0b99e4c418ca13506c8e54434d552064
   Is Shared Clusterwide: true
   Is SAS: false
   Is USB: false
   Is Boot Device: false
   Device Max Queue Depth: 128
   No of outstanding IOs with competing worlds: 32
   Drive Type: unknown
   RAID Level: unknown
   Number of Physical Drives: unknown
   Protection Enabled: false
   PI Activated: false
   PI Type: 0
   PI Protection Mask: NO PROTECTION
   Supported Guard Types: NO GUARD SUPPORT
   DIX Enabled: false
   DIX Guard Type: NO GUARD SUPPORT
   Emulated DIX/DIF Enabled: false


> esxcli storage nmp device list -d disk_id
>
esxcli storage nmp device list -d naa.6001405ba48e0b99e4c418ca13506c8e
naa.6001405ba48e0b99e4c418ca13506c8e
   Device Display Name: LIO-ORG iSCSI Disk
(naa.6001405ba48e0b99e4c418ca13506c8e)
   Storage Array Type: VMW_SATP_ALUA
   Storage Array Type Device Config: {implicit_support=on;
explicit_support=off; explicit_allow=on; alua_followover=on;
action_OnRetryErrors=on; {TPG_id=1,TPG_state=ANO}}
   Path Selection Policy: VMW_PSP_MRU
   Path Selection Policy Device Config: Current Path=vmhba68:C0:T0:L0
   Path Selection Policy Device Custom Config:
   Working Paths: vmhba68:C0:T0:L0
   Is USB: false


> esxcli storage nmp satp list
>
esxcli storage nmp satp list
Name                 Default PSP    Description
-------------------  -------------  --------------------------------------------------------------------------------
VMW_SATP_ALUA        VMW_PSP_MRU    Supports non-specific arrays that use the ALUA protocol
VMW_SATP_MSA         VMW_PSP_MRU    Placeholder (plugin not loaded)
VMW_SATP_DEFAULT_AP  VMW_PSP_MRU    Placeholder (plugin not loaded)
VMW_SATP_SVC         VMW_PSP_FIXED  Placeholder (plugin not loaded)
VMW_SATP_EQL         VMW_PSP_FIXED  Placeholder (plugin not loaded)
VMW_SATP_INV         VMW_PSP_FIXED  Placeholder (plugin not loaded)
VMW_SATP_EVA         VMW_PSP_FIXED  Placeholder (plugin not loaded)
VMW_SATP_ALUA_CX     VMW_PSP_RR     Placeholder (plugin not loaded)
VMW_SATP_SYMM        VMW_PSP_RR     Placeholder (plugin not loaded)
VMW_SATP_CX          VMW_PSP_MRU    Placeholder (plugin not loaded)
VMW_SATP_LSI         VMW_PSP_MRU    Placeholder (plugin not loaded)
VMW_SATP_DEFAULT_AA  VMW_PSP_FIXED  Supports non-specific active/active arrays
VMW_SATP_LOCAL       VMW_PSP_FIXED  Supports direct attached devices



> and the /var/log/vmkernel.log for when you stop a node and the image
> goes to the unavailable state.
>
>
2020-02-02T00:14:59.094Z cpu2:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:640: Retry world failover device
"naa.6001405ba48e0b99e4c418ca13506c8e" - issuing command 0x459a9abd2380
2020-02-02T00:14:59.095Z cpu2:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:715: Retry world failover device
"naa.6001405ba48e0b99e4c418ca13506c8e" - failed to issue command due to Not
found (APD), try again...
2020-02-02T00:14:59.095Z cpu2:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:765: Logical device
"naa.6001405ba48e0b99e4c418ca13506c8e": awaiting fast path state update...
2020-02-02T00:15:01.095Z cpu2:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:640: Retry world failover device
"naa.6001405cb71b882378c4138826c2da30" - issuing command 0x45a283482940
2020-02-02T00:15:01.095Z cpu2:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:715: Retry world failover device
"naa.6001405cb71b882378c4138826c2da30" - failed to issue command due to Not
found (APD), try again...
2020-02-02T00:15:01.095Z cpu2:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:765: Logical device
"naa.6001405cb71b882378c4138826c2da30": awaiting fast path state update...
2020-02-02T00:15:06.979Z cpu25:2124996)ScsiDeviceIO: 3449:
Cmd(0x459a9abd2380) 0x9e, CmdSN 0x1dc0cf from world 0 to dev
"naa.6001405ba48e0b99e4c418ca13506c8e" failed H:0x5 D:0x0 P:0x0 Invalid
sense data: 0x46 0x80 0x41.
2020-02-02T00:15:06.979Z cpu25:2124996)WARNING: NMP:
nmp_DeviceStartLoop:729: NMP Device "naa.6001405ba48e0b99e4c418ca13506c8e"
is blocked. Not starting I/O from device.
2020-02-02T00:15:07.094Z cpu2:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:603: Retry world restore device
"naa.6001405ba48e0b99e4c418ca13506c8e" - no more commands to retry
2020-02-02T00:15:07.095Z cpu33:2098078)WARNING: NMP:
nmp_IssueCommandToDevice:5726: I/O could not be issued to device
"naa.6001405ba48e0b99e4c418ca13506c8e" due to Not found
2020-02-02T00:15:07.095Z cpu33:2098078)WARNING: NMP:
nmp_DeviceRetryCommand:133: Device "naa.6001405ba48e0b99e4c418ca13506c8e":
awaiting fast path state update for failover with I/O blocked. No prior
reservation exists on the device.
2020-02-02T00:15:07.095Z cpu33:2098078)WARNING: NMP:
nmp_DeviceStartLoop:729: NMP Device "naa.6001405ba48e0b99e4c418ca13506c8e"
is blocked. Not starting I/O from device.
2020-02-02T00:15:08.095Z cpu2:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:640: Retry world failover device
"naa.6001405ba48e0b99e4c418ca13506c8e" - issuing command 0x459a9abd2380
2020-02-02T00:15:08.095Z cpu2:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:715: Retry world failover device
"naa.6001405ba48e0b99e4c418ca13506c8e" - failed to issue command due to Not
found (APD), try again...
2020-02-02T00:15:08.095Z cpu2:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:765: Logical device
"naa.6001405ba48e0b99e4c418ca13506c8e": awaiting fast path state update...
2020-02-02T00:15:08.154Z cpu25:2124996)ScsiDeviceIO: 3449:
Cmd(0x45a283482940) 0x12, CmdSN 0x1dc10c from world 0 to dev
"naa.6001405cb71b882378c4138826c2da30" failed H:0x5 D:0x0 P:0x0 Invalid
sense data: 0x46 0x80 0x41.
2020-02-02T00:15:08.154Z cpu25:2124996)WARNING: NMP:
nmp_DeviceStartLoop:729: NMP Device "naa.6001405cb71b882378c4138826c2da30"
is blocked. Not starting I/O from device.
2020-02-02T00:15:09.095Z cpu2:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:640: Retry world failover device
"naa.6001405ba48e0b99e4c418ca13506c8e" - issuing command 0x459a9abd2380
2020-02-02T00:15:09.095Z cpu2:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:715: Retry world failover device
"naa.6001405ba48e0b99e4c418ca13506c8e" - failed to issue command due to Not
found (APD), try again...
2020-02-02T00:15:09.095Z cpu2:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:765: Logical device
"naa.6001405ba48e0b99e4c418ca13506c8e": awaiting fast path state update...
2020-02-02T00:15:09.095Z cpu2:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:603: Retry world restore device
"naa.6001405cb71b882378c4138826c2da30" - no more commands to retry
2020-02-02T00:15:14.094Z cpu22:2098078)NMP:
nmp_ResetDeviceLogThrottling:3575: last error status from device
naa.6001405cb71b882378c4138826c2da30 repeated 35 times
2020-02-02T00:15:14.095Z cpu22:2098078)NMP:
nmp_ResetDeviceLogThrottling:3575: last error status from device
naa.6001405ba48e0b99e4c418ca13506c8e repeated 8 times
2020-02-02T00:15:39.094Z cpu27:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:640: Retry world failover device
"naa.6001405ba48e0b99e4c418ca13506c8e" - issuing command 0x459a9abd2380
2020-02-02T00:15:39.095Z cpu27:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:715: Retry world failover device
"naa.6001405ba48e0b99e4c418ca13506c8e" - failed to issue command due to Not
found (APD), try again...
2020-02-02T00:15:39.095Z cpu27:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:765: Logical device
"naa.6001405ba48e0b99e4c418ca13506c8e": awaiting fast path state update...
2020-02-02T00:15:46.981Z cpu25:2124996)ScsiDeviceIO: 3449:
Cmd(0x459a9abd2380) 0x25, CmdSN 0x1dc10d from world 0 to dev
"naa.6001405ba48e0b99e4c418ca13506c8e" failed H:0x5 D:0x0 P:0x0 Invalid
sense data: 0x0 0x0 0x0.
2020-02-02T00:15:46.981Z cpu25:2124996)WARNING: NMP:
nmp_DeviceStartLoop:729: NMP Device "naa.6001405ba48e0b99e4c418ca13506c8e"
is blocked. Not starting I/O from device.
2020-02-02T00:15:47.094Z cpu20:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:603: Retry world restore device
"naa.6001405ba48e0b99e4c418ca13506c8e" - no more commands to retry
2020-02-02T00:15:47.095Z cpu22:2098078)WARNING: NMP:
nmp_IssueCommandToDevice:5726: I/O could not be issued to device
"naa.6001405ba48e0b99e4c418ca13506c8e" due to Not found
2020-02-02T00:15:47.095Z cpu22:2098078)WARNING: NMP:
nmp_DeviceRetryCommand:133: Device "naa.6001405ba48e0b99e4c418ca13506c8e":
awaiting fast path state update for failover with I/O blocked. No prior
reservation exists on the device.
2020-02-02T00:15:47.095Z cpu22:2098078)WARNING: NMP:
nmp_DeviceStartLoop:729: NMP Device "naa.6001405ba48e0b99e4c418ca13506c8e"
is blocked. Not starting I/O from device.
2020-02-02T00:15:48.094Z cpu20:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:640: Retry world failover device
"naa.6001405ba48e0b99e4c418ca13506c8e" - issuing command 0x459a9ba62b40
2020-02-02T00:15:48.095Z cpu20:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:715: Retry world failover device
"naa.6001405ba48e0b99e4c418ca13506c8e" - failed to issue command due to Not
found (APD), try again...
2020-02-02T00:15:48.095Z cpu20:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:765: Logical device
"naa.6001405ba48e0b99e4c418ca13506c8e": awaiting fast path state update...
2020-02-02T00:15:49.095Z cpu20:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:640: Retry world failover device
"naa.6001405ba48e0b99e4c418ca13506c8e" - issuing command 0x459a9ba62b40
2020-02-02T00:15:49.095Z cpu20:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:715: Retry world failover device
"naa.6001405ba48e0b99e4c418ca13506c8e" - failed to issue command due to Not
found (APD), try again...
2020-02-02T00:15:49.095Z cpu20:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:765: Logical device
"naa.6001405ba48e0b99e4c418ca13506c8e": awaiting fast path state update...
2020-02-02T00:15:52.967Z cpu0:2098255)iscsi_vmk:
iscsivmk_ConnNetRegister:2170: socket 0x430e49d44a90 network resource pool
netsched.pools.persist.iscsi associated
2020-02-02T00:15:52.967Z cpu0:2098255)iscsi_vmk:
iscsivmk_ConnNetRegister:2198: socket 0x430e49d44a90 network tracker id
193945594 tracker.iSCSI.192.168.201.1 associated
2020-02-02T00:15:54.712Z cpu0:2098255)WARNING: iscsi_vmk:
iscsivmk_StartConnection:880: vmhba68:CH:0 T:0 CN:0: iSCSI connection is
being marked "ONLINE"
2020-02-02T00:15:54.712Z cpu0:2098255)WARNING: iscsi_vmk:
iscsivmk_StartConnection:881: Sess [ISID: 00023d000001 TARGET:
iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw TPGT: 1 TSIH: 0]
2020-02-02T00:15:54.712Z cpu0:2098255)WARNING: iscsi_vmk:
iscsivmk_StartConnection:882: Conn [CID: 0 L: 192.168.201.107:35264 R:
192.168.201.1:3260]
2020-02-02T00:15:54.712Z cpu30:2097595)ScsiDevice: 4481: Handle REPORTED
LUNS CHANGED DATA unit attention
2020-02-02T00:15:54.712Z cpu30:2097595)ScsiDevice: 4512: Handle INQUIRY
PARAMETERS CHANGED unit attention
2020-02-02T00:15:54.716Z cpu2:2097312)ScsiDevice: 6001: Setting Device
naa.6001405ba48e0b99e4c418ca13506c8e state back to 0x2
2020-02-02T00:15:54.716Z cpu2:2097312)ScsiDevice: 8708: No Handlers
registered! (naa.6001405ba48e0b99e4c418ca13506c8e)!
2020-02-02T00:15:54.716Z cpu2:2097312)ScsiDevice: 6022: Device
naa.6001405ba48e0b99e4c418ca13506c8e is Out of APD; token num:1
2020-02-02T00:15:54.716Z cpu2:2097312)StorageApdHandler: 1315: APD exit for
0x4305de04a9d0 [naa.6001405ba48e0b99e4c418ca13506c8e]
2020-02-02T00:15:54.716Z cpu0:2097602)StorageApdHandler: 507: APD exit
event for 0x4305de04a9d0 [naa.6001405ba48e0b99e4c418ca13506c8e]
2020-02-02T00:15:54.716Z cpu0:2097602)StorageApdHandlerEv: 117: Device or
filesystem with identifier [naa.6001405ba48e0b99e4c418ca13506c8e] has
exited the All Paths Down state.
2020-02-02T00:15:54.717Z cpu11:2097618)ScsiDevice: 6001: Setting Device
naa.6001405cb71b882378c4138826c2da30 state back to 0x2
2020-02-02T00:15:54.717Z cpu11:2097618)ScsiDevice: 8708: No Handlers
registered! (naa.6001405cb71b882378c4138826c2da30)!
2020-02-02T00:15:54.717Z cpu11:2097618)ScsiDevice: 6022: Device
naa.6001405cb71b882378c4138826c2da30 is Out of APD; token num:1
2020-02-02T00:15:54.717Z cpu11:2097618)StorageApdHandler: 1315: APD exit
for 0x4305de041660 [naa.6001405cb71b882378c4138826c2da30]
2020-02-02T00:15:54.717Z cpu0:2097602)StorageApdHandler: 507: APD exit
event for 0x4305de041660 [naa.6001405cb71b882378c4138826c2da30]
2020-02-02T00:15:54.717Z cpu0:2097602)StorageApdHandlerEv: 117: Device or
filesystem with identifier [naa.6001405cb71b882378c4138826c2da30] has
exited the All Paths Down state.
2020-02-02T00:15:55.096Z cpu0:2098243)NMP: nmpCompleteRetryForPath:327:
Retry world recovered device "naa.6001405ba48e0b99e4c418ca13506c8e"
2020-02-02T00:15:55.096Z cpu8:2099418 opID=17fe6cc5)World: 11943: VC opID
sps-Main-52618-962-fe-1-62ed maps to vmkernel opID 17fe6cc5
2020-02-02T00:15:55.096Z cpu8:2099418 opID=17fe6cc5)WARNING: ScsiDeviceIO:
10750: READ CAPACITY on device "naa.6001405ba48e0b99e4c418ca13506c8e" from
Plugin "NMP" failed. I/O error
2020-02-02T00:15:55.097Z cpu3:2098243)NMP: nmp_ThrottleLogForDevice:3788:
Cmd 0x28 (0x459a9abd2380, 0) to dev "naa.6001405ba48e0b99e4c418ca13506c8e"
on path "vmhba68:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x2
0x4 0xa. Act:FAILOVER
2020-02-02T00:15:55.097Z cpu3:2098243)WARNING: NMP:
nmp_DeviceRetryCommand:133: Device "naa.6001405ba48e0b99e4c418ca13506c8e":
awaiting fast path state update for failover with I/O blocked. No prior
reservation exists on the device.
2020-02-02T00:15:56.095Z cpu27:2098080)WARNING: NMP:
nmpDeviceAttemptFailover:640: Retry world failover device
"naa.6001405ba48e0b99e4c418ca13506c8e" - issuing command 0x459a9abd2380
2020-02-02T00:15:56.099Z cpu3:2098243)NMP: nmpCompleteRetryForPath:327:
Retry world recovered device "naa.6001405ba48e0b99e4c418ca13506c8e"
2020-02-02T00:16:02.095Z cpu23:2097316)qfle3:
qfle3_queue_alloc_with_attr:517: [vmnic0]
QueueOps.qfle3_queue_alloc_with_attr num_attr :1 attrs: 0x451a4521bcd0
2020-02-02T00:16:02.095Z cpu23:2097316)qfle3:
qfle3_queue_alloc_with_attr:545: [vmnic0] Feature LRO requested.
2020-02-02T00:16:02.095Z cpu23:2097316)qfle3: qfle3_rq_alloc:282: [vmnic0]
allocating  RX queue at 1
2020-02-02T00:16:02.096Z cpu23:2097316)qfle3: qfle3_rq_alloc:316: [vmnic0]
Marking RX queue 1 IN_USE
2020-02-02T00:16:02.096Z cpu23:2097316)qfle3: qfle3_queue_start:1529:
[vmnic0] QueueOps.queueStart
2020-02-02T00:16:02.096Z cpu23:2097316)qfle3: qfle3_queue_start:1537:
[vmnic0] RxQ, QueueIDVal:1
2020-02-02T00:16:02.096Z cpu23:2097316)qfle3: qfle3_rx_queue_start:1452:
[vmnic0] qfle3_rx_queue_start, QueueIDVal:1
2020-02-02T00:16:02.096Z cpu23:2097316)qfle3: qfle3_rq_start:1287: [vmnic0]
qfle3_rq_start 1
2020-02-02T00:16:02.096Z cpu23:2097316)qfle3: qfle3_init_eth_fp:68:
[vmnic0] qfle3_init_eth_fp for fp 1
2020-02-02T00:16:02.096Z cpu23:2097316)qfle3: qfle3_rq_start:1334: [vmnic0]
RX fp[1]: wrote prods bd_prod=4078 cqe_prod=4030 sge_prod=1024
2020-02-02T00:16:02.096Z cpu23:2097316)qfle3: qfle3_rq_start:1351: [vmnic0]
enabled netpoll for q_index 1
2020-02-02T00:16:02.096Z cpu23:2097316)qfle3: qfle3_rq_start:1357: [vmnic0]
Enabling interrupt on vector # 3
2020-02-02T00:16:02.097Z cpu23:2097316)qfle3: qfle3_rq_start:1377: [vmnic0]
RX queue setup_queue successful for 1
2020-02-02T00:16:02.097Z cpu23:2097316)qfle3: qfle3_rq_start:1411: [vmnic0]
active Rx queue Count 2
2020-02-02T00:16:02.097Z cpu23:2097316)qfle3: qfle3_rq_start:1412: [vmnic0]
RX queue 1 successfully started
2020-02-02T00:16:02.098Z cpu23:2097316)qfle3:
qfle3_queue_remove_filter:2063: [vmnic0] QueueOps.queueRemoveFilter
2020-02-02T00:16:02.099Z cpu23:2097316)qfle3:
qfle3_remove_queue_filter:2012: [vmnic0] NetQ removed RX filter: queue:0
mac: 00:50:56:6f:59:4f filter id:3
2020-02-02T00:16:02.099Z cpu23:2097316)qfle3:
qfle3_queue_apply_filter:1923: [vmnic0] QueueOps.queueApplyFilter 1
2020-02-02T00:16:02.101Z cpu23:2097316)qfle3:
qfle3_apply_queue_mac_filter:1798: [vmnic0] NetQ set RX filter: queue:1
mac: 00:50:56:6f:59:4f filter id:0
2020-02-02T00:16:04.924Z cpu0:2099418 opID=cff7ff5)World: 11943: VC opID
sps-Main-52618-962-d0-d6-6321 maps to vmkernel opID cff7ff5
2020-02-02T00:16:04.924Z cpu0:2099418 opID=cff7ff5)Unmap6: 7133: [Unmap]
'datastore1':device(0x43078fc9a1f0)does not support unmap
2020-02-02T00:16:07.164Z cpu9:2271849)J6: 2651: 'Storage_Ceph_pool1':
Exiting async journal replay manager world
2020-02-02T00:16:09.665Z cpu12:2271850)J6: 2651: 'Storage_Ceph_pool2':
Exiting async journal replay manager world
2020-02-02T00:16:10.930Z cpu4:2271852)J6: 2651: 'datastore1': Exiting async
journal replay manager world
2020-02-02T00:16:11.094Z cpu22:2271853)J6: 2651: 'SSD1': Exiting async
journal replay manager world
2020-02-02T00:16:22.094Z cpu26:2097316)qfle3:
qfle3_queue_remove_filter:2063: [vmnic0] QueueOps.queueRemoveFilter
2020-02-02T00:16:22.097Z cpu26:2097316)qfle3:
qfle3_remove_queue_filter:2012: [vmnic0] NetQ removed RX filter: queue:1
mac: 00:50:56:6f:59:4f filter id:0
2020-02-02T00:16:22.098Z cpu26:2097316)qfle3:
qfle3_queue_apply_filter:1923: [vmnic0] QueueOps.queueApplyFilter 0
2020-02-02T00:16:22.099Z cpu26:2097316)qfle3:
qfle3_apply_queue_mac_filter:1798: [vmnic0] NetQ set RX filter: queue:0
mac: 00:50:56:6f:59:4f filter id:3
2020-02-02T00:16:22.099Z cpu26:2097316)qfle3: qfle3_queue_quiesce:1061:
[vmnic0] QueueOps.queueQuiesce
2020-02-02T00:16:22.099Z cpu26:2097316)qfle3: qfle3_queue_quiesce:1069:
[vmnic0] RxQ, QueueIDVal:1
2020-02-02T00:16:22.099Z cpu26:2097316)qfle3: qfle3_rx_queue_stop:1558:
[vmnic0] qfle3_rx_queue_stop, QueueIDVal:1
2020-02-02T00:16:22.099Z cpu26:2097316)qfle3: qfle3_rq_stop:740: [vmnic0]
qfle3_rq_stop 1
2020-02-02T00:16:22.099Z cpu26:2097316)qfle3: qfle3_rq_stop:811: [vmnic0]
Stopping queue 0
2020-02-02T00:16:22.102Z cpu26:2097316)qfle3: qfle3_rq_stop:831: [vmnic0]
disable netpoll for q_index 1
2020-02-02T00:16:22.102Z cpu26:2097316)qfle3: qfle3_rq_stop:842: [vmnic0]
Disabling interrupt on vector # 3
2020-02-02T00:16:22.102Z cpu26:2097316)qfle3: qfle3_rq_stop:867: [vmnic0]
active Rx queue Count 1
2020-02-02T00:16:22.102Z cpu26:2097316)qfle3: qfle3_queue_free:690:
[vmnic0] QueueOps.queueFree
2020-02-02T00:16:22.102Z cpu26:2097316)qfle3: qfle3_queue_free:697:
[vmnic0] RxQ, QueueIDVal:1
2020-02-02T00:16:22.102Z cpu26:2097316)qfle3: qfle3_rq_free:618: [vmnic0]
Loop through 1 RSS queues 1
2020-02-02T00:16:22.102Z cpu26:2097316)qfle3: qfle3_cmd_remove_q:19191:
[vmnic0] Releasing Q idx 1


2020-02-02T00:17:05.925Z cpu4:2100126 opID=4a500647)World: 11943: VC opID
sps-Main-52618-962-3a-dd-633d maps to vmkernel opID 4a500647
2020-02-02T00:17:05.925Z cpu4:2100126 opID=4a500647)Unmap6: 7133: [Unmap]
'datastore1':device(0x43078fc99b10)does not support unmap
2020-02-02T00:17:11.930Z cpu15:2271855)J6: 2651: 'datastore1': Exiting
async journal replay manager world
2020-02-02T00:17:12.100Z cpu34:2271856)J6: 2651: 'SSD1': Exiting async
journal replay manager world
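A quick way to quantify the churn in a vmkernel excerpt like the one above is to count APD exits and NMP failover retries per device. A minimal sketch (a hypothetical helper, not part of any ESXi or Ceph tooling; it assumes the log lines are unwrapped, i.e. each message on one line):

```python
import re

# Patterns for two of the messages seen above: StorageApdHandler's
# "APD exit for <addr> [naa.…]" and NMP's failover retry line.
APD_EXIT = re.compile(r"APD exit for \S+ \[(naa\.[0-9a-f]+)\]")
FAILOVER = re.compile(r'Retry world failover device\s+"(naa\.[0-9a-f]+)"')

def churn_summary(log_text):
    """Return {device: {"apd_exits": n, "failovers": n}} for a vmkernel excerpt."""
    stats = {}
    for line in log_text.splitlines():
        for pattern, key in ((APD_EXIT, "apd_exits"), (FAILOVER, "failovers")):
            m = pattern.search(line)
            if m:
                dev = stats.setdefault(m.group(1),
                                       {"apd_exits": 0, "failovers": 0})
                dev[key] += 1
    return stats
```

If one device dominates both counts, that is the image whose active/optimized path keeps bouncing between the two gateways.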





>
> >
> > Nothing is recorded in the logs. Can you help me?
> >
> > On Thu, Dec 26, 2019, at 17:44, Mike Christie <mchristi@xxxxxxxxxx
> > <mailto:mchristi@xxxxxxxxxx>> wrote:
> >
> >     On 12/24/2019 06:40 AM, Gesiel Galvão Bernardes wrote:
> >     > In addition: I turned off one of the gateways, and with just one it
> >     > works fine. When both are up, one of the images keeps changing its
> >     > "active/optimized" path all the time (which generates the logs above)
> >     > and everything is extremely slow.
> >
> >     Your multipathing in ESX is probably misconfigured and you have set it
> >     up for active/active, or one host can't see all the iSCSI paths, either
> >     because it's not logged into all the sessions or because the network is
> >     not up on one of the paths.
> >
> >
> >     >
> >     > I'm using:
> >     > tcmu-runner-1.4
> >     > ceph-iscsi-3.3
> >     > ceph 13.2.7
> >     >
> >     > Regards,
> >     > Gesiel
> >     >
> >     > On Tue, Dec 24, 2019, at 09:09, Gesiel Galvão Bernardes
> >     > <gesiel.bernardes@xxxxxxxxx <mailto:gesiel.bernardes@xxxxxxxxx>> wrote:
> >     >
> >     >     Hi,
> >     >
> >     >     I am having an unusual slowdown using VMware with the iSCSI
> >     >     gateways. I have two iSCSI gateways with two RBD images. I have
> >     >     found the following in the logs:
> >     >
> >     >     Dec 24 09:00:26 ceph-iscsi2 tcmu-runner: 2019-12-24
> >     09:00:26.040 969
> >     >     [INFO] alua_implicit_transition:562 rbd/pool1.vmware_iscsi1:
> >     >     Starting lock acquisition operation.2019-12-24 09:00:26.040 969
> >     >     [INFO] alua_implicit_transition:557 rbd/pool1.vmware_iscsi1:
> Lock
> >     >     acquisition operation is already in process.2019-12-24
> >     09:00:26.973
> >     >     969 [WARN] tcmu_rbd_lock:744 rbd/pool1.vmware_iscsi1: Acquired
> >     >     exclusive lock.
> >     >     Dec 24 09:00:26 ceph-iscsi2 tcmu-runner: tcmu_rbd_lock:744
> >     >     rbd/pool1.vmware_iscsi1: Acquired exclusive lock.
> >     >     Dec 24 09:00:28 ceph-iscsi2 tcmu-runner: 2019-12-24
> >     09:00:28.099 969
> >     >     [WARN] tcmu_notify_lock_lost:201 rbd/pool1.vmware_iscsi1:
> >     Async lock
> >     >     drop. Old state 1
> >     >     Dec 24 09:00:28 ceph-iscsi2 tcmu-runner:
> tcmu_notify_lock_lost:201
> >     >     rbd/pool1.vmware_iscsi1: Async lock drop. Old state 1
> >     >     Dec 24 09:00:28 ceph-iscsi2 tcmu-runner:
> >     >     alua_implicit_transition:562 rbd/pool1.vmware_iscsi1: Starting
> >     lock
> >     >     acquisition operation.
> >     >     Dec 24 09:00:28 ceph-iscsi2 tcmu-runner: 2019-12-24
> >     09:00:28.824 969
> >     >     [INFO] alua_implicit_transition:562 rbd/pool1.vmware_iscsi1:
> >     >     Starting lock acquisition operation.2019-12-24 09:00:28.990 969
> >     >     [WARN] tcmu_rbd_lock:744 rbd/pool1.vmware_iscsi1: Acquired
> >     exclusive
> >     >     lock.
> >     >     Dec 24 09:00:28 ceph-iscsi2 tcmu-runner: tcmu_rbd_lock:744
> >     >     rbd/pool1.vmware_iscsi1: Acquired exclusive lock.
> >     >
> >     >
> >     >     Can anyone help me, please?
> >     >
> >     >     Gesiel
> >     >
> >     >
> >     >
> >     >
> >     > _______________________________________________
> >     > ceph-users mailing list -- ceph-users@xxxxxxx
> >     <mailto:ceph-users@xxxxxxx>
> >     > To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >     <mailto:ceph-users-leave@xxxxxxx>
> >     >
> >
>
>
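For reference, the ESXi-side configuration that the thread keeps circling back to (path selection policy and session visibility) can be inspected with standard esxcli commands. This is a diagnostic/config fragment, not runnable outside an ESXi host; the device ID is taken from the log excerpt above, so substitute your own LUNs, and check the Ceph iSCSI documentation for the exact SATP/PSP settings it recommends:

```shell
# Show which SATP and path selection policy (PSP) ESXi applied to the
# RBD-backed LUN; for ceph-iscsi's ALUA setup the PSP should be
# VMW_PSP_MRU, not round-robin across both gateways.
esxcli storage nmp device list -d naa.6001405ba48e0b99e4c418ca13506c8e

# List every path and its state; only one gateway's path should be
# active/optimized per device at a time.
esxcli storage core path list -d naa.6001405ba48e0b99e4c418ca13506c8e

# Confirm the host is actually logged into sessions on both gateways.
esxcli iscsi session list

# If the PSP is wrong, switch this device to Most Recently Used.
esxcli storage nmp device set -d naa.6001405ba48e0b99e4c418ca13506c8e -P VMW_PSP_MRU
```

If the PSP is round-robin here, both gateways are asked to serve I/O and the exclusive lock will bounce between them, matching the tcmu-runner "Async lock drop" messages quoted below.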
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx




