Hi,

Only now was I able to continue with this. Below is the information you requested.

Thanks in advance,
Gesiel

On Mon, Jan 20, 2020 at 3:06 PM, Mike Christie <mchristi@xxxxxxxxxx> wrote:

> On 01/20/2020 10:29 AM, Gesiel Galvão Bernardes wrote:
> > Hi,
> >
> > Only now have I been able to act on this problem. My environment is
> > relatively simple: I have two ESXi 6.7 hosts, connected to two iSCSI
> > gateways, using two RBD images.
> >
> > When this thread started, the workaround was to keep only one iSCSI
> > gateway connected; that way it works normally. After the answer I
> > received here that the problem could be in the VMware configuration, I
> > reviewed the configuration of both hosts (they are exactly as in the
> > Ceph documentation) and rebooted both.
> >
>
> Can you give me some basic info.
>
> The output of:
> # gwcli ls
>

o- / ......................................................................................................... [...]
o- cluster ......................................................................................... [Clusters: 1]
| o- ceph ............................................................................................ [HEALTH_OK]
| o- pools .......................................................................................... [Pools: 7]
| | o- data_ecpool ................................. [(x3), Commit: 0.00Y/25836962M (0%), Used: 25011763551408b]
| | o- pool1 ....................................... [(x2), Commit: 5.0T/38755444M (13%), Used: 16162480333070b]
| | o- pool2 ....................................... [(x3), Commit: 6.0T/25836962M (24%), Used: 24261305669809b]
| | o- pool3 ........................................ [(4+0), Commit: 0.00Y/58133164M (0%), Used: 719866819603b]
| | o- pool_cache ...................................... [(x3), Commit: 0.00Y/706164M (0%), Used: 422549509342b]
| | o- poolfs ................................................ [(x3), Commit: 0.00Y/25836962M (0%), Used: 0.00Y]
| | o- rbd ................................................... [(x3), Commit: 0.00Y/25836962M (0%), Used: 5737b]
| o- topology ............................................................................... [OSDs: 61,MONs: 2]
o- disks ....................................................................................... [11.0T, Disks: 3]
| o- pool1 ........................................................................................ [pool1 (5.0T)]
| | o- vmware_iscsi1 ................................................................ [pool1/vmware_iscsi1 (5.0T)]
| o- pool2 ........................................................................................ [pool2 (6.0T)]
| o- iscsi-test ...................................................................... [pool2/iscsi-test (1.0T)]
| o- vmware_iscsi2 ................................................................ [pool2/vmware_iscsi2 (5.0T)]
o- iscsi-targets ............................................................... [DiscoveryAuth: None, Targets: 1]
o- iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw ......................................... [Auth: None, Gateways: 2]
o- disks .......................................................................................... [Disks: 3]
| o- pool1/vmware_iscsi1 ................................................................ [Owner: ceph-iscsi2]
| o- pool2/iscsi-test ................................................................... [Owner: ceph-iscsi2]
| o- pool2/vmware_iscsi2 ................................................................ [Owner: ceph-iscsi1]
o- gateways ............................................................................ [Up: 2/2, Portals: 2]
| o- ceph-iscsi1 ........................................................................ [192.168.201.1 (UP)]
| o- ceph-iscsi2 ........................................................................ [192.168.201.2 (UP)]
o- host-groups .................................................................................. [Groups : 0]
o- hosts ....................................................................... [Auth: ACL_ENABLED, Hosts: 2]
o- iqn.1994-05.com.redhat:rh7-client .............................. [LOGGED-IN, Auth: CHAP, Disks: 2(10.0T)]
| o- lun 0 ................................................. [pool1/vmware_iscsi1(5.0T), Owner: ceph-iscsi2]
| o- lun 1 ................................................. [pool2/vmware_iscsi2(5.0T), Owner: ceph-iscsi1]
o- iqn.1994-05.com.redhat:tcnvh8 .................................. [LOGGED-IN, Auth: CHAP, Disks: 2(10.0T)]
o- lun 0 ................................................. [pool1/vmware_iscsi1(5.0T), Owner: ceph-iscsi2]
o- lun 1 ................................................. [pool2/vmware_iscsi2(5.0T), Owner: ceph-iscsi1]

>
> from one of the iscsi nodes, and give me the output of:
>
> # targetcli ls
>

ceph-iscsi1 ~]# targetcli ls
Warning: Could not load preferences file /root/.targetcli/prefs.bin.
o- / ......................................................................................................... [...]
o- backstores .............................................................................................. [...]
| o- block .................................................................................. [Storage Objects: 0]
| o- fileio ................................................................................. [Storage Objects: 0]
| o- pscsi .................................................................................. [Storage Objects: 0]
| o- ramdisk ................................................................................ [Storage Objects: 0]
| o- user:glfs .............................................................................. [Storage Objects: 0]
| o- user:qcow .............................................................................. [Storage Objects: 0]
| o- user:rbd ............................................................................... [Storage Objects: 3]
| | o- pool1.vmware_iscsi1 ............................ [pool1/vmware_iscsi1;osd_op_timeout=30 (5.0TiB) activated]
| | | o- alua ................................................................................... [ALUA Groups: 3]
| | | o- ano1 ............................................................... [ALUA state: Active/non-optimized]
| | | o- ao ..................................................................... [ALUA state: Active/optimized]
| | | o- default_tg_pt_gp ....................................................... [ALUA state: Active/optimized]
| | o- pool2.iscsi-test .................................. [pool2/iscsi-test;osd_op_timeout=30 (1.0TiB) activated]
| | | o- alua ................................................................................... [ALUA Groups: 3]
| | | o- ano1 ............................................................... [ALUA state: Active/non-optimized]
| | | o- ao ..................................................................... [ALUA state: Active/optimized]
| | | o- default_tg_pt_gp ....................................................... [ALUA state: Active/optimized]
| | o- pool2.vmware_iscsi2 ............................ [pool2/vmware_iscsi2;osd_op_timeout=30 (5.0TiB) activated]
| | o- alua ................................................................................... [ALUA Groups: 3]
| | o- ano2 ............................................................... [ALUA state: Active/non-optimized]
| | o- ao ..................................................................... [ALUA state: Active/optimized]
| | o- default_tg_pt_gp ....................................................... [ALUA state: Active/optimized]
| o- user:zbc ............................................................................... [Storage Objects: 0]
o- iscsi ............................................................................................ [Targets: 1]
| o- iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw ......................................................... [TPGs: 2]
| o- tpg1 .......................................................................... [no-gen-acls, auth per-acl]
| | o- acls .......................................................................................... [ACLs: 2]
| | | o- iqn.1994-05.com.redhat:rh7-client ........................................ [1-way auth, Mapped LUNs: 2]
| | | | o- mapped_lun0 .................................................... [lun0 user/pool1.vmware_iscsi1 (rw)]
| | | | o- mapped_lun1 .................................................... [lun2 user/pool2.vmware_iscsi2 (rw)]
| | | o- iqn.1994-05.com.redhat:tcnvh8 ............................................ [1-way auth, Mapped LUNs: 2]
| | | o- mapped_lun0 .................................................... [lun0 user/pool1.vmware_iscsi1 (rw)]
| | | o- mapped_lun1 .................................................... [lun2 user/pool2.vmware_iscsi2 (rw)]
| | o- luns .......................................................................................... [LUNs: 3]
| | | o- lun0 ................................................................ [user/pool1.vmware_iscsi1 (ano1)]
| | | o- lun1 ................................................................... [user/pool2.iscsi-test (ano1)]
| | | o- lun2 .................................................................. [user/pool2.vmware_iscsi2 (ao)]
| | o- portals .................................................................................... [Portals: 1]
| | o- 192.168.201.1:3260 ............................................................................... [OK]
| o- tpg2 ........................................................................................... [disabled]
| o- acls .......................................................................................... [ACLs: 0]
| o- luns .......................................................................................... [LUNs: 3]
| | o- lun0 .................................................................. [user/pool1.vmware_iscsi1 (ao)]
| | o- lun1 ..................................................................... [user/pool2.iscsi-test (ao)]
| | o- lun2 ................................................................ [user/pool2.vmware_iscsi2 (ano2)]
| o- portals .................................................................................... [Portals: 1]
| o- 192.168.201.2:3260 ............................................................................... [OK]
o- loopback ......................................................................................... [Targets: 0]
ceph-iscsi2 ~]# targetcli ls
Warning: Could not load preferences file /root/.targetcli/prefs.bin.
o- / ......................................................................................................... [...]
o- backstores .............................................................................................. [...]
| o- block .................................................................................. [Storage Objects: 0]
| o- fileio ................................................................................. [Storage Objects: 0]
| o- pscsi .................................................................................. [Storage Objects: 0]
| o- ramdisk ................................................................................ [Storage Objects: 0]
| o- user:glfs .............................................................................. [Storage Objects: 0]
| o- user:qcow .............................................................................. [Storage Objects: 0]
| o- user:rbd ............................................................................... [Storage Objects: 3]
| | o- pool1.vmware_iscsi1 ............................ [pool1/vmware_iscsi1;osd_op_timeout=30 (5.0TiB) activated]
| | | o- alua ................................................................................... [ALUA Groups: 3]
| | | o- ano1 ............................................................... [ALUA state: Active/non-optimized]
| | | o- ao ..................................................................... [ALUA state: Active/optimized]
| | | o- default_tg_pt_gp ....................................................... [ALUA state: Active/optimized]
| | o- pool2.iscsi-test .................................. [pool2/iscsi-test;osd_op_timeout=30 (1.0TiB) activated]
| | | o- alua ................................................................................... [ALUA Groups: 3]
| | | o- ano1 ............................................................... [ALUA state: Active/non-optimized]
| | | o- ao ..................................................................... [ALUA state: Active/optimized]
| | | o- default_tg_pt_gp ....................................................... [ALUA state: Active/optimized]
| | o- pool2.vmware_iscsi2 ............................ [pool2/vmware_iscsi2;osd_op_timeout=30 (5.0TiB) activated]
| | o- alua ................................................................................... [ALUA Groups: 3]
| | o- ano2 ............................................................... [ALUA state: Active/non-optimized]
| | o- ao ..................................................................... [ALUA state: Active/optimized]
| | o- default_tg_pt_gp ....................................................... [ALUA state: Active/optimized]
| o- user:zbc ............................................................................... [Storage Objects: 0]
o- iscsi ............................................................................................ [Targets: 1]
| o- iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw ......................................................... [TPGs: 2]
| o- tpg1 ........................................................................................... [disabled]
| | o- acls .......................................................................................... [ACLs: 0]
| | o- luns .......................................................................................... [LUNs: 3]
| | | o- lun0 ................................................................ [user/pool1.vmware_iscsi1 (ano1)]
| | | o- lun1 ................................................................... [user/pool2.iscsi-test (ano1)]
| | | o- lun2 .................................................................. [user/pool2.vmware_iscsi2 (ao)]
| | o- portals .................................................................................... [Portals: 1]
| | o- 192.168.201.1:3260 ............................................................................... [OK]
| o- tpg2 .......................................................................... [no-gen-acls, auth per-acl]
| o- acls .......................................................................................... [ACLs: 2]
| | o- iqn.1994-05.com.redhat:rh7-client ........................................ [1-way auth, Mapped LUNs: 2]
| | | o- mapped_lun0 .................................................... [lun0 user/pool1.vmware_iscsi1 (rw)]
| | | o- mapped_lun1 .................................................... [lun2 user/pool2.vmware_iscsi2 (rw)]
| | o- iqn.1994-05.com.redhat:tcnvh8 ............................................ [1-way auth, Mapped LUNs: 2]
| | o- mapped_lun0 .................................................... [lun0 user/pool1.vmware_iscsi1 (rw)]
| | o- mapped_lun1 .................................................... [lun2 user/pool2.vmware_iscsi2 (rw)]
| o- luns .......................................................................................... [LUNs: 3]
| | o- lun0 .................................................................. [user/pool1.vmware_iscsi1 (ao)]
| | o- lun1 ..................................................................... [user/pool2.iscsi-test (ao)]
| | o- lun2 ................................................................ [user/pool2.vmware_iscsi2 (ano2)]
| o- portals .................................................................................... [Portals: 1]
| o- 192.168.201.2:3260 ............................................................................... [OK]
o- loopback ......................................................................................... [Targets: 0]

>
> from both iscsi nodes.
>
> The ceph-iscsi and tcmu-runner versions, and did you build them yourself,
> get them from the ceph repos, or get them from a distro repo?
>

On both gateways I use:

ceph-iscsi-3.3-1.el7.noarch from the ceph-iscsi repo
(http://download.ceph.com/ceph-iscsi/3/rpm/el7/noarch)

tcmu-runner-1.4.0-0.1.51.geef5115.el7.x86_64 from
https://3.chacra.ceph.com/r/tcmu-runner/master/eef511565078fb4e2ed52caaff16e6c7e75ed6c3/centos/7/flavors/default/x86_64/tcmu-runner-1.4.0-0.1.51.geef5115.el7.x86_64.rpm

> > It turns out that now the gateway "ceph-iscsi1" is working. When I turn on
> > "ceph-iscsi2", it appears as "Active/non-optimized" for both RBD
> > images (before, there was one Active/Optimized path for each image). And
> > if I turn off "ceph-iscsi1", it (ceph-iscsi2) remains "Active/non-optimized"
> > and the images become unavailable.
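A quick way to cross-check from the Ceph side which gateway really owns each image -- a minimal sketch, assuming the pool/image names shown above and admin credentials on a gateway node:

# Confirm the exact package builds on each gateway:
rpm -q ceph-iscsi tcmu-runner

# tcmu-runner holds the RBD exclusive lock on the Active/Optimized side,
# so the lock owner here should line up with the "Owner" shown by gwcli:
rbd -p pool1 lock ls vmware_iscsi1
rbd -p pool2 lock ls vmware_iscsi2

# Follow lock acquisitions/drops live while starting or stopping a gateway:
journalctl -u tcmu-runner -f

If the lock owner keeps flapping between the two gateways while both are up, that matches the tcmu-runner "Async lock drop" messages quoted at the end of this thread.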
>
> On the ESX side can you give me the output of:
>
> esxcli storage nmp path list -d disk_id
>

esxcli storage nmp path list -d naa.6001405ba48e0b99e4c418ca13506c8e
iqn.1994-05.com.redhat:rh7-client-00023d000001,iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw,t,1-naa.6001405ba48e0b99e4c418ca13506c8e
   Runtime Name: vmhba68:C0:T0:L0
   Device: naa.6001405ba48e0b99e4c418ca13506c8e
   Device Display Name: LIO-ORG iSCSI Disk (naa.6001405ba48e0b99e4c418ca13506c8e)
   Group State: active unoptimized
   Array Priority: 0
   Storage Array Type Path Config: {TPG_id=1,TPG_state=ANO,RTP_id=1,RTP_health=UP}
   Path Selection Policy Path Config: {current path; rank: 0}

> esxcli storage core device list -d disk_id
>

:~] esxcli storage core device list -d naa.6001405ba48e0b99e4c418ca13506c8e
naa.6001405ba48e0b99e4c418ca13506c8e
   Display Name: LIO-ORG iSCSI Disk (naa.6001405ba48e0b99e4c418ca13506c8e)
   Has Settable Display Name: true
   Size: 5242880
   Device Type: Direct-Access
   Multipath Plugin: NMP
   Devfs Path: /vmfs/devices/disks/naa.6001405ba48e0b99e4c418ca13506c8e
   Vendor: LIO-ORG
   Model: TCMU device
   Revision: 0002
   SCSI Level: 5
   Is Pseudo: false
   Status: degraded
   Is RDM Capable: true
   Is Local: false
   Is Removable: false
   Is SSD: false
   Is VVOL PE: false
   Is Offline: false
   Is Perennially Reserved: false
   Queue Full Sample Size: 0
   Queue Full Threshold: 0
   Thin Provisioning Status: yes
   Attached Filters:
   VAAI Status: supported
   Other UIDs: vml.02000000006001405ba48e0b99e4c418ca13506c8e54434d552064
   Is Shared Clusterwide: true
   Is SAS: false
   Is USB: false
   Is Boot Device: false
   Device Max Queue Depth: 128
   No of outstanding IOs with competing worlds: 32
   Drive Type: unknown
   RAID Level: unknown
   Number of Physical Drives: unknown
   Protection Enabled: false
   PI Activated: false
   PI Type: 0
   PI Protection Mask: NO PROTECTION
   Supported Guard Types: NO GUARD SUPPORT
   DIX Enabled: false
   DIX Guard Type: NO GUARD SUPPORT
   Emulated DIX/DIF Enabled: false

> esxcli storage nmp device list -d disk_id
>

esxcli storage nmp device list -d naa.6001405ba48e0b99e4c418ca13506c8e
naa.6001405ba48e0b99e4c418ca13506c8e
   Device Display Name: LIO-ORG iSCSI Disk (naa.6001405ba48e0b99e4c418ca13506c8e)
   Storage Array Type: VMW_SATP_ALUA
   Storage Array Type Device Config: {implicit_support=on; explicit_support=off; explicit_allow=on; alua_followover=on; action_OnRetryErrors=on; {TPG_id=1,TPG_state=ANO}}
   Path Selection Policy: VMW_PSP_MRU
   Path Selection Policy Device Config: Current Path=vmhba68:C0:T0:L0
   Path Selection Policy Device Custom Config:
   Working Paths: vmhba68:C0:T0:L0
   Is USB: false

> esxcli storage nmp satp list
>

esxcli storage nmp satp list
Name                 Default PSP    Description
-------------------  -------------  --------------------------------------------------------------------------------
VMW_SATP_ALUA        VMW_PSP_MRU    Supports non-specific arrays that use the ALUA protocol
VMW_SATP_MSA         VMW_PSP_MRU    Placeholder (plugin not loaded)
VMW_SATP_DEFAULT_AP  VMW_PSP_MRU    Placeholder (plugin not loaded)
VMW_SATP_SVC         VMW_PSP_FIXED  Placeholder (plugin not loaded)
VMW_SATP_EQL         VMW_PSP_FIXED  Placeholder (plugin not loaded)
VMW_SATP_INV         VMW_PSP_FIXED  Placeholder (plugin not loaded)
VMW_SATP_EVA         VMW_PSP_FIXED  Placeholder (plugin not loaded)
VMW_SATP_ALUA_CX     VMW_PSP_RR     Placeholder (plugin not loaded)
VMW_SATP_SYMM        VMW_PSP_RR     Placeholder (plugin not loaded)
VMW_SATP_CX          VMW_PSP_MRU    Placeholder (plugin not loaded)
VMW_SATP_LSI         VMW_PSP_MRU    Placeholder (plugin not loaded)
VMW_SATP_DEFAULT_AA  VMW_PSP_FIXED  Supports non-specific active/active arrays
VMW_SATP_LOCAL       VMW_PSP_FIXED  Supports direct attached devices
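One thing that stands out in the nmp device list output above is action_OnRetryErrors=on; on ESXi 6.7 this default for VMW_SATP_ALUA can mark a path dead after transient errors and interfere with failover and failback. A hedged sketch of possible tuning, reusing the device and adapter IDs from the outputs above -- verify the exact option names against your ESXi build and the ceph-iscsi docs before applying:

# Turn off action_OnRetryErrors for the existing device (the path must be
# re-claimed, e.g. by a reboot, before it takes effect):
esxcli storage nmp satp generic deviceconfig set -c disable_action_OnRetryErrors -d naa.6001405ba48e0b99e4c418ca13506c8e

# Optionally add a claim rule so newly discovered LIO-ORG LUNs get the same
# setting automatically:
esxcli storage nmp satp rule add -s VMW_SATP_ALUA -V LIO-ORG -o disable_action_OnRetryErrors -e "ceph-iscsi ALUA"

# The ceph-iscsi docs also recommend a shorter iSCSI session recovery
# timeout, so ESXi fails over before commands stack up behind the 30s
# osd_op_timeout shown in the targetcli output:
esxcli iscsi adapter param set -A vmhba68 -k RecoveryTimeout -v 25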
> and the /var/log/vmkernel.log for when you stop a node and the image
> goes to the unavailable state.
>

2020-02-02T00:14:59.094Z cpu2:2098080)WARNING: NMP: nmpDeviceAttemptFailover:640: Retry world failover device "naa.6001405ba48e0b99e4c418ca13506c8e" - issuing command 0x459a9abd2380
2020-02-02T00:14:59.095Z cpu2:2098080)WARNING: NMP: nmpDeviceAttemptFailover:715: Retry world failover device "naa.6001405ba48e0b99e4c418ca13506c8e" - failed to issue command due to Not found (APD), try again...
2020-02-02T00:14:59.095Z cpu2:2098080)WARNING: NMP: nmpDeviceAttemptFailover:765: Logical device "naa.6001405ba48e0b99e4c418ca13506c8e": awaiting fast path state update...
2020-02-02T00:15:01.095Z cpu2:2098080)WARNING: NMP: nmpDeviceAttemptFailover:640: Retry world failover device "naa.6001405cb71b882378c4138826c2da30" - issuing command 0x45a283482940
2020-02-02T00:15:01.095Z cpu2:2098080)WARNING: NMP: nmpDeviceAttemptFailover:715: Retry world failover device "naa.6001405cb71b882378c4138826c2da30" - failed to issue command due to Not found (APD), try again...
2020-02-02T00:15:01.095Z cpu2:2098080)WARNING: NMP: nmpDeviceAttemptFailover:765: Logical device "naa.6001405cb71b882378c4138826c2da30": awaiting fast path state update...
2020-02-02T00:15:06.979Z cpu25:2124996)ScsiDeviceIO: 3449: Cmd(0x459a9abd2380) 0x9e, CmdSN 0x1dc0cf from world 0 to dev "naa.6001405ba48e0b99e4c418ca13506c8e" failed H:0x5 D:0x0 P:0x0 Invalid sense data: 0x46 0x80 0x41.
2020-02-02T00:15:06.979Z cpu25:2124996)WARNING: NMP: nmp_DeviceStartLoop:729: NMP Device "naa.6001405ba48e0b99e4c418ca13506c8e" is blocked. Not starting I/O from device.
2020-02-02T00:15:07.094Z cpu2:2098080)WARNING: NMP: nmpDeviceAttemptFailover:603: Retry world restore device "naa.6001405ba48e0b99e4c418ca13506c8e" - no more commands to retry
2020-02-02T00:15:07.095Z cpu33:2098078)WARNING: NMP: nmp_IssueCommandToDevice:5726: I/O could not be issued to device "naa.6001405ba48e0b99e4c418ca13506c8e" due to Not found
2020-02-02T00:15:07.095Z cpu33:2098078)WARNING: NMP: nmp_DeviceRetryCommand:133: Device "naa.6001405ba48e0b99e4c418ca13506c8e": awaiting fast path state update for failover with I/O blocked. No prior reservation exists on the device.
2020-02-02T00:15:07.095Z cpu33:2098078)WARNING: NMP: nmp_DeviceStartLoop:729: NMP Device "naa.6001405ba48e0b99e4c418ca13506c8e" is blocked. Not starting I/O from device.
2020-02-02T00:15:08.095Z cpu2:2098080)WARNING: NMP: nmpDeviceAttemptFailover:640: Retry world failover device "naa.6001405ba48e0b99e4c418ca13506c8e" - issuing command 0x459a9abd2380
2020-02-02T00:15:08.095Z cpu2:2098080)WARNING: NMP: nmpDeviceAttemptFailover:715: Retry world failover device "naa.6001405ba48e0b99e4c418ca13506c8e" - failed to issue command due to Not found (APD), try again...
2020-02-02T00:15:08.095Z cpu2:2098080)WARNING: NMP: nmpDeviceAttemptFailover:765: Logical device "naa.6001405ba48e0b99e4c418ca13506c8e": awaiting fast path state update...
2020-02-02T00:15:08.154Z cpu25:2124996)ScsiDeviceIO: 3449: Cmd(0x45a283482940) 0x12, CmdSN 0x1dc10c from world 0 to dev "naa.6001405cb71b882378c4138826c2da30" failed H:0x5 D:0x0 P:0x0 Invalid sense data: 0x46 0x80 0x41.
2020-02-02T00:15:08.154Z cpu25:2124996)WARNING: NMP: nmp_DeviceStartLoop:729: NMP Device "naa.6001405cb71b882378c4138826c2da30" is blocked. Not starting I/O from device.
2020-02-02T00:15:09.095Z cpu2:2098080)WARNING: NMP: nmpDeviceAttemptFailover:640: Retry world failover device "naa.6001405ba48e0b99e4c418ca13506c8e" - issuing command 0x459a9abd2380
2020-02-02T00:15:09.095Z cpu2:2098080)WARNING: NMP: nmpDeviceAttemptFailover:715: Retry world failover device "naa.6001405ba48e0b99e4c418ca13506c8e" - failed to issue command due to Not found (APD), try again...
2020-02-02T00:15:09.095Z cpu2:2098080)WARNING: NMP: nmpDeviceAttemptFailover:765: Logical device "naa.6001405ba48e0b99e4c418ca13506c8e": awaiting fast path state update...
2020-02-02T00:15:09.095Z cpu2:2098080)WARNING: NMP: nmpDeviceAttemptFailover:603: Retry world restore device "naa.6001405cb71b882378c4138826c2da30" - no more commands to retry
2020-02-02T00:15:14.094Z cpu22:2098078)NMP: nmp_ResetDeviceLogThrottling:3575: last error status from device naa.6001405cb71b882378c4138826c2da30 repeated 35 times
2020-02-02T00:15:14.095Z cpu22:2098078)NMP: nmp_ResetDeviceLogThrottling:3575: last error status from device naa.6001405ba48e0b99e4c418ca13506c8e repeated 8 times
2020-02-02T00:15:39.094Z cpu27:2098080)WARNING: NMP: nmpDeviceAttemptFailover:640: Retry world failover device "naa.6001405ba48e0b99e4c418ca13506c8e" - issuing command 0x459a9abd2380
2020-02-02T00:15:39.095Z cpu27:2098080)WARNING: NMP: nmpDeviceAttemptFailover:715: Retry world failover device "naa.6001405ba48e0b99e4c418ca13506c8e" - failed to issue command due to Not found (APD), try again...
2020-02-02T00:15:39.095Z cpu27:2098080)WARNING: NMP: nmpDeviceAttemptFailover:765: Logical device "naa.6001405ba48e0b99e4c418ca13506c8e": awaiting fast path state update...
2020-02-02T00:15:46.981Z cpu25:2124996)ScsiDeviceIO: 3449: Cmd(0x459a9abd2380) 0x25, CmdSN 0x1dc10d from world 0 to dev "naa.6001405ba48e0b99e4c418ca13506c8e" failed H:0x5 D:0x0 P:0x0 Invalid sense data: 0x0 0x0 0x0.
2020-02-02T00:15:46.981Z cpu25:2124996)WARNING: NMP: nmp_DeviceStartLoop:729: NMP Device "naa.6001405ba48e0b99e4c418ca13506c8e" is blocked. Not starting I/O from device.
2020-02-02T00:15:47.094Z cpu20:2098080)WARNING: NMP: nmpDeviceAttemptFailover:603: Retry world restore device "naa.6001405ba48e0b99e4c418ca13506c8e" - no more commands to retry
2020-02-02T00:15:47.095Z cpu22:2098078)WARNING: NMP: nmp_IssueCommandToDevice:5726: I/O could not be issued to device "naa.6001405ba48e0b99e4c418ca13506c8e" due to Not found
2020-02-02T00:15:47.095Z cpu22:2098078)WARNING: NMP: nmp_DeviceRetryCommand:133: Device "naa.6001405ba48e0b99e4c418ca13506c8e": awaiting fast path state update for failover with I/O blocked. No prior reservation exists on the device.
2020-02-02T00:15:47.095Z cpu22:2098078)WARNING: NMP: nmp_DeviceStartLoop:729: NMP Device "naa.6001405ba48e0b99e4c418ca13506c8e" is blocked. Not starting I/O from device.
2020-02-02T00:15:48.094Z cpu20:2098080)WARNING: NMP: nmpDeviceAttemptFailover:640: Retry world failover device "naa.6001405ba48e0b99e4c418ca13506c8e" - issuing command 0x459a9ba62b40
2020-02-02T00:15:48.095Z cpu20:2098080)WARNING: NMP: nmpDeviceAttemptFailover:715: Retry world failover device "naa.6001405ba48e0b99e4c418ca13506c8e" - failed to issue command due to Not found (APD), try again...
2020-02-02T00:15:48.095Z cpu20:2098080)WARNING: NMP: nmpDeviceAttemptFailover:765: Logical device "naa.6001405ba48e0b99e4c418ca13506c8e": awaiting fast path state update...
2020-02-02T00:15:49.095Z cpu20:2098080)WARNING: NMP: nmpDeviceAttemptFailover:640: Retry world failover device "naa.6001405ba48e0b99e4c418ca13506c8e" - issuing command 0x459a9ba62b40
2020-02-02T00:15:49.095Z cpu20:2098080)WARNING: NMP: nmpDeviceAttemptFailover:715: Retry world failover device "naa.6001405ba48e0b99e4c418ca13506c8e" - failed to issue command due to Not found (APD), try again...
2020-02-02T00:15:49.095Z cpu20:2098080)WARNING: NMP: nmpDeviceAttemptFailover:765: Logical device "naa.6001405ba48e0b99e4c418ca13506c8e": awaiting fast path state update...
2020-02-02T00:15:52.967Z cpu0:2098255)iscsi_vmk: iscsivmk_ConnNetRegister:2170: socket 0x430e49d44a90 network resource pool netsched.pools.persist.iscsi associated
2020-02-02T00:15:52.967Z cpu0:2098255)iscsi_vmk: iscsivmk_ConnNetRegister:2198: socket 0x430e49d44a90 network tracker id 193945594 tracker.iSCSI.192.168.201.1 associated
2020-02-02T00:15:54.712Z cpu0:2098255)WARNING: iscsi_vmk: iscsivmk_StartConnection:880: vmhba68:CH:0 T:0 CN:0: iSCSI connection is being marked "ONLINE"
2020-02-02T00:15:54.712Z cpu0:2098255)WARNING: iscsi_vmk: iscsivmk_StartConnection:881: Sess [ISID: 00023d000001 TARGET: iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw TPGT: 1 TSIH: 0]
2020-02-02T00:15:54.712Z cpu0:2098255)WARNING: iscsi_vmk: iscsivmk_StartConnection:882: Conn [CID: 0 L: 192.168.201.107:35264 R: 192.168.201.1:3260]
2020-02-02T00:15:54.712Z cpu30:2097595)ScsiDevice: 4481: Handle REPORTED LUNS CHANGED DATA unit attention
2020-02-02T00:15:54.712Z cpu30:2097595)ScsiDevice: 4512: Handle INQUIRY PARAMETERS CHANGED unit attention
2020-02-02T00:15:54.716Z cpu2:2097312)ScsiDevice: 6001: Setting Device naa.6001405ba48e0b99e4c418ca13506c8e state back to 0x2
2020-02-02T00:15:54.716Z cpu2:2097312)ScsiDevice: 8708: No Handlers registered! (naa.6001405ba48e0b99e4c418ca13506c8e)!
2020-02-02T00:15:54.716Z cpu2:2097312)ScsiDevice: 6022: Device naa.6001405ba48e0b99e4c418ca13506c8e is Out of APD; token num:1
2020-02-02T00:15:54.716Z cpu2:2097312)StorageApdHandler: 1315: APD exit for 0x4305de04a9d0 [naa.6001405ba48e0b99e4c418ca13506c8e]
2020-02-02T00:15:54.716Z cpu0:2097602)StorageApdHandler: 507: APD exit event for 0x4305de04a9d0 [naa.6001405ba48e0b99e4c418ca13506c8e]
2020-02-02T00:15:54.716Z cpu0:2097602)StorageApdHandlerEv: 117: Device or filesystem with identifier [naa.6001405ba48e0b99e4c418ca13506c8e] has exited the All Paths Down state.
2020-02-02T00:15:54.717Z cpu11:2097618)ScsiDevice: 6001: Setting Device naa.6001405cb71b882378c4138826c2da30 state back to 0x2
2020-02-02T00:15:54.717Z cpu11:2097618)ScsiDevice: 8708: No Handlers registered! (naa.6001405cb71b882378c4138826c2da30)!
2020-02-02T00:15:54.717Z cpu11:2097618)ScsiDevice: 6022: Device naa.6001405cb71b882378c4138826c2da30 is Out of APD; token num:1
2020-02-02T00:15:54.717Z cpu11:2097618)StorageApdHandler: 1315: APD exit for 0x4305de041660 [naa.6001405cb71b882378c4138826c2da30]
2020-02-02T00:15:54.717Z cpu0:2097602)StorageApdHandler: 507: APD exit event for 0x4305de041660 [naa.6001405cb71b882378c4138826c2da30]
2020-02-02T00:15:54.717Z cpu0:2097602)StorageApdHandlerEv: 117: Device or filesystem with identifier [naa.6001405cb71b882378c4138826c2da30] has exited the All Paths Down state.
2020-02-02T00:15:55.096Z cpu0:2098243)NMP: nmpCompleteRetryForPath:327: Retry world recovered device "naa.6001405ba48e0b99e4c418ca13506c8e"
2020-02-02T00:15:55.096Z cpu8:2099418 opID=17fe6cc5)World: 11943: VC opID sps-Main-52618-962-fe-1-62ed maps to vmkernel opID 17fe6cc5
2020-02-02T00:15:55.096Z cpu8:2099418 opID=17fe6cc5)WARNING: ScsiDeviceIO: 10750: READ CAPACITY on device "naa.6001405ba48e0b99e4c418ca13506c8e" from Plugin "NMP" failed. I/O error
2020-02-02T00:15:55.097Z cpu3:2098243)NMP: nmp_ThrottleLogForDevice:3788: Cmd 0x28 (0x459a9abd2380, 0) to dev "naa.6001405ba48e0b99e4c418ca13506c8e" on path "vmhba68:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x2 0x4 0xa. Act:FAILOVER
2020-02-02T00:15:55.097Z cpu3:2098243)WARNING: NMP: nmp_DeviceRetryCommand:133: Device "naa.6001405ba48e0b99e4c418ca13506c8e": awaiting fast path state update for failover with I/O blocked. No prior reservation exists on the device.
2020-02-02T00:15:56.095Z cpu27:2098080)WARNING: NMP: nmpDeviceAttemptFailover:640: Retry world failover device "naa.6001405ba48e0b99e4c418ca13506c8e" - issuing command 0x459a9abd2380
2020-02-02T00:15:56.099Z cpu3:2098243)NMP: nmpCompleteRetryForPath:327: Retry world recovered device "naa.6001405ba48e0b99e4c418ca13506c8e"
2020-02-02T00:16:02.095Z cpu23:2097316)qfle3: qfle3_queue_alloc_with_attr:517: [vmnic0] QueueOps.qfle3_queue_alloc_with_attr num_attr :1 attrs: 0x451a4521bcd0
2020-02-02T00:16:02.095Z cpu23:2097316)qfle3: qfle3_queue_alloc_with_attr:545: [vmnic0] Feature LRO requested.
2020-02-02T00:16:02.095Z cpu23:2097316)qfle3: qfle3_rq_alloc:282: [vmnic0] allocating RX queue at 1
2020-02-02T00:16:02.096Z cpu23:2097316)qfle3: qfle3_rq_alloc:316: [vmnic0] Marking RX queue 1 IN_USE
2020-02-02T00:16:02.096Z cpu23:2097316)qfle3: qfle3_queue_start:1529: [vmnic0] QueueOps.queueStart
2020-02-02T00:16:02.096Z cpu23:2097316)qfle3: qfle3_queue_start:1537: [vmnic0] RxQ, QueueIDVal:1
2020-02-02T00:16:02.096Z cpu23:2097316)qfle3: qfle3_rx_queue_start:1452: [vmnic0] qfle3_rx_queue_start, QueueIDVal:1
2020-02-02T00:16:02.096Z cpu23:2097316)qfle3: qfle3_rq_start:1287: [vmnic0] qfle3_rq_start 1
2020-02-02T00:16:02.096Z cpu23:2097316)qfle3: qfle3_init_eth_fp:68: [vmnic0] qfle3_init_eth_fp for fp 1
2020-02-02T00:16:02.096Z cpu23:2097316)qfle3: qfle3_rq_start:1334: [vmnic0] RX fp[1]: wrote prods bd_prod=4078 cqe_prod=4030 sge_prod=1024
2020-02-02T00:16:02.096Z cpu23:2097316)qfle3: qfle3_rq_start:1351: [vmnic0] enabled netpoll for q_index 1
2020-02-02T00:16:02.096Z cpu23:2097316)qfle3: qfle3_rq_start:1357: [vmnic0] Enabling interrupt on vector # 3
2020-02-02T00:16:02.097Z cpu23:2097316)qfle3: qfle3_rq_start:1377: [vmnic0] RX queue setup_queue successful for 1
2020-02-02T00:16:02.097Z cpu23:2097316)qfle3: qfle3_rq_start:1411: [vmnic0] active Rx queue Count 2
2020-02-02T00:16:02.097Z cpu23:2097316)qfle3: qfle3_rq_start:1412: [vmnic0] RX queue 1 successfully started
2020-02-02T00:16:02.098Z cpu23:2097316)qfle3: qfle3_queue_remove_filter:2063: [vmnic0] QueueOps.queueRemoveFilter
2020-02-02T00:16:02.099Z cpu23:2097316)qfle3: qfle3_remove_queue_filter:2012: [vmnic0] NetQ removed RX filter: queue:0 mac: 00:50:56:6f:59:4f filter id:3
2020-02-02T00:16:02.099Z cpu23:2097316)qfle3: qfle3_queue_apply_filter:1923: [vmnic0] QueueOps.queueApplyFilter 1
2020-02-02T00:16:02.101Z cpu23:2097316)qfle3: qfle3_apply_queue_mac_filter:1798: [vmnic0] NetQ set RX filter: queue:1 mac: 00:50:56:6f:59:4f filter id:0
2020-02-02T00:16:04.924Z cpu0:2099418 opID=cff7ff5)World: 11943: VC opID sps-Main-52618-962-d0-d6-6321 maps to vmkernel opID cff7ff5
2020-02-02T00:16:04.924Z cpu0:2099418 opID=cff7ff5)Unmap6: 7133: [Unmap] 'datastore1':device(0x43078fc9a1f0)does not support unmap
2020-02-02T00:16:07.164Z cpu9:2271849)J6: 2651: 'Storage_Ceph_pool1': Exiting async journal replay manager world
2020-02-02T00:16:09.665Z cpu12:2271850)J6: 2651: 'Storage_Ceph_pool2': Exiting async journal replay manager world
2020-02-02T00:16:10.930Z cpu4:2271852)J6: 2651: 'datastore1': Exiting async journal replay manager world
2020-02-02T00:16:11.094Z cpu22:2271853)J6: 2651: 'SSD1': Exiting async journal replay manager world
2020-02-02T00:16:22.094Z cpu26:2097316)qfle3: qfle3_queue_remove_filter:2063: [vmnic0] QueueOps.queueRemoveFilter
2020-02-02T00:16:22.097Z cpu26:2097316)qfle3: qfle3_remove_queue_filter:2012: [vmnic0] NetQ removed RX filter: queue:1 mac: 00:50:56:6f:59:4f filter id:0
2020-02-02T00:16:22.098Z cpu26:2097316)qfle3: qfle3_queue_apply_filter:1923: [vmnic0] QueueOps.queueApplyFilter 0
2020-02-02T00:16:22.099Z cpu26:2097316)qfle3: qfle3_apply_queue_mac_filter:1798: [vmnic0] NetQ set RX filter: queue:0 mac: 00:50:56:6f:59:4f filter id:3
2020-02-02T00:16:22.099Z cpu26:2097316)qfle3: qfle3_queue_quiesce:1061: [vmnic0] QueueOps.queueQuiesce
2020-02-02T00:16:22.099Z cpu26:2097316)qfle3: qfle3_queue_quiesce:1069: [vmnic0] RxQ, QueueIDVal:1
2020-02-02T00:16:22.099Z cpu26:2097316)qfle3: qfle3_rx_queue_stop:1558: [vmnic0] qfle3_rx_queue_stop, QueueIDVal:1
2020-02-02T00:16:22.099Z cpu26:2097316)qfle3: qfle3_rq_stop:740: [vmnic0] qfle3_rq_stop 1
2020-02-02T00:16:22.099Z cpu26:2097316)qfle3: qfle3_rq_stop:811: [vmnic0] Stopping queue 0
2020-02-02T00:16:22.102Z cpu26:2097316)qfle3: qfle3_rq_stop:831: [vmnic0] disable netpoll for q_index 1
2020-02-02T00:16:22.102Z cpu26:2097316)qfle3: qfle3_rq_stop:842: [vmnic0] Disabling interrupt on vector # 3
2020-02-02T00:16:22.102Z cpu26:2097316)qfle3: qfle3_rq_stop:867: [vmnic0] active Rx queue Count 1
2020-02-02T00:16:22.102Z cpu26:2097316)qfle3: qfle3_queue_free:690: [vmnic0] QueueOps.queueFree
2020-02-02T00:16:22.102Z cpu26:2097316)qfle3: qfle3_queue_free:697: [vmnic0] RxQ, QueueIDVal:1
2020-02-02T00:16:22.102Z cpu26:2097316)qfle3: qfle3_rq_free:618: [vmnic0] Loop through 1 RSS queues 1
2020-02-02T00:16:22.102Z cpu26:2097316)qfle3: qfle3_cmd_remove_q:19191: [vmnic0] Releasing Q idx 1
2020-02-02T00:17:05.925Z cpu4:2100126 opID=4a500647)World: 11943: VC opID sps-Main-52618-962-3a-dd-633d maps to vmkernel opID 4a500647
2020-02-02T00:17:05.925Z cpu4:2100126 opID=4a500647)Unmap6: 7133: [Unmap] 'datastore1':device(0x43078fc99b10)does not support unmap
2020-02-02T00:17:11.930Z cpu15:2271855)J6: 2651: 'datastore1': Exiting async journal replay manager world
2020-02-02T00:17:12.100Z cpu34:2271856)J6: 2651: 'SSD1': Exiting async journal replay manager world

>
> > Nothing is recorded in the logs. Can you help me?
> >
> > On Thu, Dec 26, 2019 at 5:44 PM, Mike Christie <mchristi@xxxxxxxxxx> wrote:
> >
> > On 12/24/2019 06:40 AM, Gesiel Galvão Bernardes wrote:
> > > In addition: I turned off one of the GWs, and with just one it works
> > > fine. When both are up, one of the images keeps changing its
> > > "Active/Optimized" owner all the time (which generates the logs above)
> > > and everything is extremely slow.
> >
> > Your multipathing in ESX is probably misconfigured and you have set it
> > up for active/active, or one host can't see all the iSCSI paths, either
> > because it's not logged into all the sessions or because the network is
> > not up on one of the paths.
> >
> > > I'm using:
> > > tcmu-runner-1.4
> > > ceph-iscsi-3.3
> > > ceph 13.2.7
> > >
> > > Regards,
> > > Gesiel
> > >
> > > On Tue, Dec 24, 2019 at 9:09 AM, Gesiel Galvão Bernardes
> > > <gesiel.bernardes@xxxxxxxxx> wrote:
> > >
> > > Hi,
> > >
> > > I am having an unusual slowdown using VMware with the iSCSI gateways. I
> > > have two iSCSI gateways with two RBD images. I have noticed the
> > > following in the logs:
> > >
> > > Dec 24 09:00:26 ceph-iscsi2 tcmu-runner: 2019-12-24 09:00:26.040 969 [INFO] alua_implicit_transition:562 rbd/pool1.vmware_iscsi1: Starting lock acquisition operation.
> > > 2019-12-24 09:00:26.040 969 [INFO] alua_implicit_transition:557 rbd/pool1.vmware_iscsi1: Lock acquisition operation is already in process.
> > > 2019-12-24 09:00:26.973 969 [WARN] tcmu_rbd_lock:744 rbd/pool1.vmware_iscsi1: Acquired exclusive lock.
> > > Dec 24 09:00:26 ceph-iscsi2 tcmu-runner: tcmu_rbd_lock:744 rbd/pool1.vmware_iscsi1: Acquired exclusive lock.
> > > Dec 24 09:00:28 ceph-iscsi2 tcmu-runner: 2019-12-24 09:00:28.099 969 [WARN] tcmu_notify_lock_lost:201 rbd/pool1.vmware_iscsi1: Async lock drop. Old state 1
> > > Dec 24 09:00:28 ceph-iscsi2 tcmu-runner: tcmu_notify_lock_lost:201 rbd/pool1.vmware_iscsi1: Async lock drop. Old state 1
> > > Dec 24 09:00:28 ceph-iscsi2 tcmu-runner: alua_implicit_transition:562 rbd/pool1.vmware_iscsi1: Starting lock acquisition operation.
> > > Dec 24 09:00:28 ceph-iscsi2 tcmu-runner: 2019-12-24 09:00:28.824 969 [INFO] alua_implicit_transition:562 rbd/pool1.vmware_iscsi1: Starting lock acquisition operation.
> > > 2019-12-24 09:00:28.990 969 [WARN] tcmu_rbd_lock:744 rbd/pool1.vmware_iscsi1: Acquired exclusive lock.
> > > Dec 24 09:00:28 ceph-iscsi2 tcmu-runner: tcmu_rbd_lock:744 rbd/pool1.vmware_iscsi1: Acquired exclusive lock.
> > >
> > > Can anyone help me, please?
> > >
> > > Gesiel

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx