Re: ceph-iscsi lock ping pong

On 14/12/2022 14:52, Stolte, Felix wrote:
The issue is resolved now. After verifying that all ESXi hosts are configured for MRU, I took a closer look at the paths on each host.

`gwcli` reported that the LUN in question was owned by gateway A, but one ESXi host used the path to gateway B for I/O. I reconfigured that particular host and it's now using the correct path to gateway A. The logs are clean now and I/O on that datastore is back to normal.

Yeah.

When the ESXi client sends I/O to gateway B, gateway B tries to acquire the exclusive lock. Once it succeeds, Ceph blocklists the previous owner of the lock, which is gateway A.

This is why you saw the gateways blocklisting each other.
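
The mechanism described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: the `Image` class, the `write` method and the gateway names "A" and "B" are hypothetical and are not part of librbd, tcmu-runner or ceph-iscsi. It only shows why two hosts using different paths make the lock bounce between the gateways.

```python
from dataclasses import dataclass, field


@dataclass
class Image:
    """Toy stand-in for an RBD image with an exclusive lock (hypothetical, not a Ceph API)."""
    owner: str
    blocklist_events: list = field(default_factory=list)

    def write(self, gateway: str) -> None:
        # I/O arriving at a gateway that does not hold the lock makes that
        # gateway acquire it; the previous owner is then blocklisted.
        if gateway != self.owner:
            self.blocklist_events.append((gateway, self.owner))
            self.owner = gateway


def simulate(io_targets):
    """Replay a sequence of gateways receiving I/O, starting with gateway A as owner."""
    image = Image(owner="A")
    for gateway in io_targets:
        image.write(gateway)
    return image.blocklist_events


# One host sends I/O via gateway A, another via gateway B, interleaved.
print(simulate(["A", "B", "A", "B"]))
# All hosts use the path to the owning gateway: no lock transfer at all.
print(simulate(["A", "A", "A", "A"]))
```

In the first run every switch of the receiving gateway produces one (new owner, blocklisted owner) pair; in the second run the list stays empty, which matches the clean logs after the path was corrected.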


This was probably caused by an outage of one of our gateways last week (the physical host, not the daemon), where the iSCSI Daemon didn’t shut down cleanly.

One last question though:

From my understanding, "Dynamic Discovery" just creates the "Static Discovery" targets for all available gateways. Is it also responsible for telling the client which path to use (i.e. which gateway is the owner of a LUN)?

With the Linux initiator, I know that multipath correctly configures the path priorities by combining the configuration reported by the gateways with the local multipath settings.

I'm not sure exactly how ESXi behaves.

BRs

- Xiubo

---------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------
Forschungszentrum Juelich GmbH
52425 Juelich
Sitz der Gesellschaft: Juelich
Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
Vorsitzender des Aufsichtsrats: MinDir Volker Rieke
Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender),
Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt,
Dr. Astrid Lambrecht, Prof. Dr. Frauke Melchior
---------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------

On 13.12.2022 at 13:21, Xiubo Li <xiubli@xxxxxxxxxx> wrote:


On 13/12/2022 18:57, Stolte, Felix wrote:
Hi Xiubo,

Thanks for pointing me in the right direction. All involved ESXi hosts seem to use the correct policy. I am going to detach the LUN on each host one by one until I find the host causing the problem.

From the logs, it looks like the client was switching between the paths in turn.

BTW, which policy are you using?

Thanks

- Xiubo

Regards Felix

On 12.12.2022 at 13:03, Xiubo Li <xiubli@xxxxxxxxxx> wrote:

Hi Stolte,

For the VMware config, could you refer to https://docs.ceph.com/en/latest/rbd/iscsi-initiator-esx/ ?

Which "Path Selection Policy with ALUA" are you using? ceph-iscsi cannot implement true active/active (AA), so if you use round robin (RR) I think it will behave like this.

- Xiubo

On 12/12/2022 17:45, Stolte, Felix wrote:
Hi guys,

We are using ceph-iscsi to provide block storage for Microsoft Exchange and VMware vSphere. The Ceph docs state that you need to configure the Windows iSCSI Initiator for fail-over-only, but there is no such note for VMware. In the tcmu-runner logs on both ceph-iscsi gateways I see the following:

2022-12-12 10:36:06.978 33789 [WARN] tcmu_notify_lock_lost:222 rbd/mailbox.vmdk_junet_sata: Async lock drop. Old state 1
2022-12-12 10:36:06.993 33789 [INFO] alua_implicit_transition:570 rbd/mailbox.vmdk_junet_sata: Starting lock acquisition operation.
2022-12-12 10:36:08.064 33789 [WARN] tcmu_rbd_lock:762 rbd/mailbox.vmdk_junet_sata: Acquired exclusive lock.
2022-12-12 10:36:09.067 33789 [WARN] tcmu_notify_lock_lost:222 rbd/mailbox.vmdk_junet_sata: Async lock drop. Old state 1
2022-12-12 10:36:09.071 33789 [INFO] alua_implicit_transition:570 rbd/mailbox.vmdk_junet_sata: Starting lock acquisition operation.
2022-12-12 10:36:10.109 33789 [WARN] tcmu_rbd_lock:762 rbd/mailbox.vmdk_junet_sata: Acquired exclusive lock.
2022-12-12 10:36:11.104 33789 [WARN] tcmu_notify_lock_lost:222 rbd/mailbox.vmdk_junet_sata: Async lock drop. Old state 1
2022-12-12 10:36:11.106 33789 [INFO] alua_implicit_transition:570 rbd/mailbox.vmdk_junet_sata: Starting lock acquisition operation.
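
To quantify how often the lock bounces, the log lines above can be counted per device. The following is a minimal sketch, assuming the tcmu-runner log format is exactly as in the excerpt (timestamp, PID, level, `function:line`, device, message); the regular expression and the helper name `count_lock_events` are illustrative assumptions, not part of tcmu-runner.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
# <date> <time> <pid> [LEVEL] <function>:<line> <device>: <message>
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \d+ "
    r"\[(?P<level>\w+)\] (?P<func>\w+):\d+ (?P<dev>\S+): (?P<msg>.*)$"
)


def count_lock_events(lines):
    """Return (acquired, lost) counters keyed by device name."""
    acquired, lost = Counter(), Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that does not look like a tcmu-runner line
        if m["func"] == "tcmu_rbd_lock" and "Acquired exclusive lock" in m["msg"]:
            acquired[m["dev"]] += 1
        elif m["func"] == "tcmu_notify_lock_lost":
            lost[m["dev"]] += 1
    return acquired, lost


sample = """
2022-12-12 10:36:06.978 33789 [WARN] tcmu_notify_lock_lost:222 rbd/mailbox.vmdk_junet_sata: Async lock drop. Old state 1
2022-12-12 10:36:06.993 33789 [INFO] alua_implicit_transition:570 rbd/mailbox.vmdk_junet_sata: Starting lock acquisition operation.
2022-12-12 10:36:08.064 33789 [WARN] tcmu_rbd_lock:762 rbd/mailbox.vmdk_junet_sata: Acquired exclusive lock.
2022-12-12 10:36:09.067 33789 [WARN] tcmu_notify_lock_lost:222 rbd/mailbox.vmdk_junet_sata: Async lock drop. Old state 1
"""

acquired, lost = count_lock_events(sample.splitlines())
print(dict(acquired), dict(lost))
```

A device whose lock is lost and re-acquired every couple of seconds on both gateways, as in the excerpt, is a strong hint that initiators are sending I/O over more than one path.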

At the same time there are these log entries in ceph.audit.logs:
2022-12-12T10:36:06.731621+0100 mon.mon-k2-1 (mon.1) 3407851 : audit [INF] from='client.? 10.100.8.55:0/2392201639' entity='client.admin' cmd=[{"prefix": "osd blocklist", "blocklistop": "add", "addr": "10.100.8.56:0/1598475844"}]: dispatch
2022-12-12T10:36:06.731913+0100 mon.mon-e2-1 (mon.0) 783726 : audit [INF] from='client.? ' entity='client.admin' cmd=[{"prefix": "osd blocklist", "blocklistop": "add", "addr": "10.100.8.56:0/1598475844"}]: dispatch
2022-12-12T10:36:06.905082+0100 mon.mon-e2-1 (mon.0) 783727 : audit [INF] from='client.? ' entity='client.admin' cmd='[{"prefix": "osd blocklist", "blocklistop": "add", "addr": "10.100.8.56:0/1598475844"}]': finished
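
To see who is blocklisting whom, the audit entries can be reduced to (requesting address, blocklisted address) pairs. This is a rough sketch, assuming each audit entry sits on a single line and that entries carrying a client address in `from=` identify the requester; the helper name `blocklist_pairs` and both regular expressions are illustrative assumptions, and treating 10.100.8.55 and 10.100.8.56 as the two gateways is an inference from the logs above.

```python
import re
from collections import Counter

# Requester address, present only on entries where from= carries an IP.
SRC_RE = re.compile(r"from='client\.\? (?P<src>\d+\.\d+\.\d+\.\d+):")
# Address being added to the OSD blocklist.
DST_RE = re.compile(r'"blocklistop": "add", "addr": "(?P<dst>\d+\.\d+\.\d+\.\d+):')


def blocklist_pairs(lines):
    """Count (requester_ip, blocklisted_ip) pairs in ceph audit log lines."""
    pairs = Counter()
    for line in lines:
        src = SRC_RE.search(line)
        dst = DST_RE.search(line)
        if src and dst:
            pairs[(src["src"], dst["dst"])] += 1
    return pairs


sample = """
2022-12-12T10:36:06.731621+0100 mon.mon-k2-1 (mon.1) 3407851 : audit [INF] from='client.? 10.100.8.55:0/2392201639' entity='client.admin' cmd=[{"prefix": "osd blocklist", "blocklistop": "add", "addr": "10.100.8.56:0/1598475844"}]: dispatch
2022-12-12T10:36:06.731913+0100 mon.mon-e2-1 (mon.0) 783726 : audit [INF] from='client.? ' entity='client.admin' cmd=[{"prefix": "osd blocklist", "blocklistop": "add", "addr": "10.100.8.56:0/1598475844"}]: dispatch
"""

pairs = blocklist_pairs(sample.splitlines())
print(pairs)
```

If the same pair of addresses shows up in both directions over a short time window, the two clients are taking the lock from each other in turn.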

Can someone explain to me what is happening? Why are the gateways blacklisting each other? All involved daemons are running version 16.2.10. The ceph-iscsi gateways are running on Ubuntu 20.04 with the ceph-iscsi package from the Ubuntu repo (all other packages came directly from ceph.com <http://ceph.com/>).


regards Felix




_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
