Thank you. This is exactly what I was looking for. If I'm understanding correctly, what gets listed as the "Owner" is what gets advertised via ALUA as the primary path, while the lock owner indicates which gateway currently holds the exclusive lock for that image and is allowed to pass traffic for that LUN, correct?

BTW - it appears there is some other kind of bug. I'm using cephadm for bringing the iSCSI gateways up/down. Right now I only have two configured, and 'ceph orch ls' shows only two, as expected:

[root@cxcto-c240-j27-01 ~]# ceph orch ls
NAME                               PORTS        RUNNING  REFRESHED  AGE  PLACEMENT
alertmanager                       ?:9093,9094      1/1  8m ago     2M   count:1
crash                                            15/15  8m ago     2M   *
grafana                            ?:3000           1/1  8m ago     2M   count:1
iscsi.iscsi                                        2/2  8m ago     81m  cxcto-c240-j27-02.cisco.com;cxcto-c240-j27-03.cisco.com
mgr                                                2/2  8m ago     2M   count:2
mon                                                5/5  8m ago     5d   cxcto-c240-j27-01.cisco.com;cxcto-c240-j27-06.cisco.com;cxcto-c240-j27-08.cisco.com;cxcto-c240-j27-10.cisco.com;cxcto-c240-j27-12.cisco.com
node-exporter                      ?:9100         15/15  8m ago     2M   *
osd.dashboard-admin-1622750977792                 0/15  -          2M   *
osd.dashboard-admin-1622751032319              326/341  8m ago     2M   *
prometheus                         ?:9095           1/1  8m ago     2M   count:1

However, gwcli is still showing the other two gateways, which are no longer enabled. Where does this list of gateways get stored? The two gateways that are no longer part of the cluster even still appear as the owners of some of the LUNs:

/iscsi-targets> ls
o- iscsi-targets ................................................................................. [DiscoveryAuth: CHAP, Targets: 3]
  o- iqn.2001-07.com.ceph:1622752075720 .................................................................. [Auth: CHAP, Gateways: 4]
  | o- disks ............................................................................................................ [Disks: 5]
  | | o- iscsi-pool-0001/iscsi-p0001-img-01 ........................................... [Owner: cxcto-c240-j27-02.cisco.com, Lun: 0]
  | | o- iscsi-pool-0001/iscsi-p0001-img-02 ........................................... [Owner: cxcto-c240-j27-04.cisco.com, Lun: 3]
  | | o- iscsi-pool-0003/iscsi-p0003-img-01 ........................................... [Owner: cxcto-c240-j27-03.cisco.com, Lun: 1]
  | | o- iscsi-pool-0003/iscsi-p0003-img-02 ........................................... [Owner: cxcto-c240-j27-05.cisco.com, Lun: 4]
  | | o- iscsi-pool-0005/iscsi-p0005-img-01 ........................................... [Owner: cxcto-c240-j27-02.cisco.com, Lun: 2]
  | o- gateways .............................................................................................. [Up: 2/4, Portals: 4]
  | | o- cxcto-c240-j27-02.cisco.com ......................................................................... [10.122.242.197 (UP)]
  | | o- cxcto-c240-j27-03.cisco.com ......................................................................... [10.122.242.198 (UP)]
  | | o- cxcto-c240-j27-04.cisco.com .................................................................... [10.122.242.199 (UNKNOWN)]
  | | o- cxcto-c240-j27-05.cisco.com .................................................................... [10.122.242.200 (UNKNOWN)]
  | o- host-groups .................................................................................................... [Groups : 0]
  | o- hosts ........................................................................................ [Auth: ACL_DISABLED, Hosts: 0]
  o- iqn.2001-07.com.ceph:1622752147345 .................................................................. [Auth: CHAP, Gateways: 4]
  | o- disks ............................................................................................................ [Disks: 5]
  | | o- iscsi-pool-0002/iscsi-p0002-img-01 ........................................... [Owner: cxcto-c240-j27-04.cisco.com, Lun: 0]
  | | o- iscsi-pool-0002/iscsi-p0002-img-02 ........................................... [Owner: cxcto-c240-j27-02.cisco.com, Lun: 3]
  | | o- iscsi-pool-0004/iscsi-p0004-img-01 ........................................... [Owner: cxcto-c240-j27-05.cisco.com, Lun: 1]
  | | o- iscsi-pool-0004/iscsi-p0004-img-02 ........................................... [Owner: cxcto-c240-j27-03.cisco.com, Lun: 4]
  | | o- iscsi-pool-0006/iscsi-p0006-img-01 ........................................... [Owner: cxcto-c240-j27-03.cisco.com, Lun: 2]
  | o- gateways .............................................................................................. [Up: 2/4, Portals: 4]
  | | o- cxcto-c240-j27-02.cisco.com ......................................................................... [10.122.242.197 (UP)]
  | | o- cxcto-c240-j27-03.cisco.com ......................................................................... [10.122.242.198 (UP)]
  | | o- cxcto-c240-j27-04.cisco.com .................................................................... [10.122.242.199 (UNKNOWN)]
  | | o- cxcto-c240-j27-05.cisco.com .................................................................... [10.122.242.200 (UNKNOWN)]
  | o- host-groups .................................................................................................... [Groups : 0]
  | o- hosts ........................................................................................ [Auth: ACL_DISABLED, Hosts: 0]
  o- iqn.2001-07.com.ceph:1627307422533 .................................................................. [Auth: CHAP, Gateways: 4]
    o- disks ............................................................................................................ [Disks: 1]
    | o- iscsi-pool-0007/iscsi-p0007-img-01 ........................................... [Owner: cxcto-c240-j27-04.cisco.com, Lun: 0]
    o- gateways .............................................................................................. [Up: 2/4, Portals: 4]
    | o- cxcto-c240-j27-02.cisco.com ......................................................................... [10.122.242.197 (UP)]
    | o- cxcto-c240-j27-03.cisco.com ......................................................................... [10.122.242.198 (UP)]
    | o- cxcto-c240-j27-04.cisco.com .................................................................... [10.122.242.199 (UNKNOWN)]
    | o- cxcto-c240-j27-05.cisco.com .................................................................... [10.122.242.200 (UNKNOWN)]
    o- host-groups .................................................................................................... [Groups : 0]
    o- hosts ........................................................................................ [Auth: ACL_DISABLED, Hosts: 0]

Currently only cxcto-c240-j27-02 and cxcto-c240-j27-03 are enabled, so I would not expect to see cxcto-c240-j27-04 and cxcto-c240-j27-05 owning some of the LUNs, but as you can see, they do. Is this a known issue, and is there a way to clean it up? Worst case, now that I know how to make sure the ESXi hosts see all the paths, I can just bring the two gateways I removed back up, but I was curious whether there is a way to clean this up. I'm guessing something is missing in the cleanup cephadm does when it removes a node.

-Paul

>> Is there a command that lets me view which gateway is primary for which LUN? I'm guessing that when another gateway gets added, the calculation of which gateway is primary for each LUN gets redone and advertised out to the clients?
>>
> In the `gwcli ls` output, such as:
>
> | o- hosts ....................................................................................... [Auth: ACL_ENABLED, Hosts: 1]
> |   o- iqn.1994-05.com.redhat:client .................................................. [LOGGED-IN, Auth: None, Disks: 3(1026M)]
> |     o- lun 0 ............................................................................ [datapool/block0(1G), Owner: node01]
> |     o- lun 1 ............................................................................ [datapool/block1(1M), Owner: node02]
> |     o- lun 2 ............................................................................ [datapool/block2(1M), Owner: node01]
>
> "Owner: node01" means that gateway node01 is initially the primary for the LUN, but that is not always true, because the exclusive lock may have been lost and then acquired by node02.
>
> We should actually check the "Lock Owner" instead:
>
> [root@node01 ~]# gwcli disks/datapool/block2 info
> Image .. block2
> Ceph Cluster .. ceph
> Pool .. datapool
> Wwn .. 7d23f7b4-e0b3-4337-9224-5091513c9d83
> Size H .. 1M
> Feature List .. RBD_FEATURE_LAYERING
>                 RBD_FEATURE_EXCLUSIVE_LOCK
>                 RBD_FEATURE_OBJECT_MAP
>                 RBD_FEATURE_FAST_DIFF
>                 RBD_FEATURE_DEEP_FLATTEN
> Snapshots ..
> Owner .. node01
> Lock Owner .. node02
> State .. Online
> Backstore .. user:rbd
> Backstore Object Name .. datapool.block2
> Control Values
> - hw_max_sectors .. 1024
> - max_data_area_mb .. 8
> - osd_op_timeout .. 30
> - qfull_timeout .. 5
>
> The "Lock Owner" above is the gateway that is currently the primary. On Linux, if you test this via multipath, you will see that "Owner" always equals "Lock Owner" except after a path failover.
>
> - Xiubo
>
>> -Paul
>>
>>>> I did a quick test where I re-enabled a second iSCSI gateway to take a closer look at the paths on the ESXi hosts, and I definitely see that when the second path becomes available, different hosts point to different gateways for the Active I/O Path.
>>>>
>>>> I was reading up on how ALUA works, and as far as I can tell, isn't Ceph supposed to indicate to the ESXi hosts which iSCSI gateway "owns" a given LUN at any point, so that the hosts know which path to make active?
>>>
>>> Yeah, the ceph-iscsi/tcmu-runner services will do that. They report this to the clients.
>>>
>>>> Could there be something wrong where more than one iSCSI gateway is advertising to the ESXi hosts that it owns the LUN?
>>> This has been tested and has worked well in production on Linux, and the logic has not changed in several years.
>>>
>>> I am not sure how ESXi handles this internally, but it should be in compliance with the iSCSI protocol; on Linux, multipath successfully detects which path is active and chooses it.
>>>
>>>> -Paul

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
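On the "where does this list of gateways get stored" question in the thread: ceph-iscsi persists its state (targets, gateways, disks and their owners) in a single RADOS object named `gateway.conf`. A minimal sketch for inspecting it — the pool name `rbd` is an assumption (the upstream default); cephadm deployments store it in the pool named in the iscsi service spec:

```shell
# Dump the ceph-iscsi configuration object to a file. "rbd" is the default
# pool and is an assumption here; substitute the pool from your iscsi spec.
rados -p rbd get gateway.conf /tmp/gateway.conf.json

# Pretty-print it and look for the stale entries: removed gateways linger
# under the "gateways" key, and per-disk "owner" fields may still name them.
python3 -m json.tool /tmp/gateway.conf.json | grep -E -A2 '"(gateways|owner)"'
```

This also suggests why the stale entries survive: stopping the daemons via cephadm does not appear to rewrite this object, whereas deleting a gateway through gwcli would.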
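The Owner vs. Lock Owner distinction discussed above can also be cross-checked from the RBD side, since the lock in question is the image's standard exclusive lock. A sketch using stock rbd commands against one of the images from the listing (nothing here is ceph-iscsi specific):

```shell
# Show who currently holds the RBD exclusive lock on the image -- this
# should match the "Lock Owner" that gwcli reports for the LUN.
rbd -p iscsi-pool-0001 lock ls iscsi-p0001-img-01

# "rbd status" lists the watchers, i.e. the tcmu-runner instances on the
# gateways that currently have the image open.
rbd status iscsi-pool-0001/iscsi-p0001-img-01
```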