Hi Marc,

thanks a lot! This seems to address the issue we observe, at least to a large degree.

I believe we also had 2 VMs running after a failed live-migration, and in that case it doesn't seem like it will help. Maybe it's possible to add a bit of logic for this case as well (similar to fencing). My experience was that the write lock moves to the target VM and then there is a reasonable time interval before it is handed back. This might be a sufficient window of opportunity to hard-kill a VM that should not be running before it acquires the write lock again.

Thanks for that link! A script template like that could actually be added to the ceph documentation under rbd locks. It seems to be a really important and useful use case for image locking.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Marc <Marc@xxxxxxxxxxxxxxxxx>
Sent: 26 January 2023 18:44:41
To: Frank Schilder; 'ceph-users@xxxxxxx'
Subject: RE: Re: Ceph rbd clients surrender exclusive lock in critical situation

> >
> > Hi all,
> >
> > we are observing a problem on a libvirt virtualisation cluster that
> > might come from ceph rbd clients. Something went wrong during execution
> > of a live-migration operation and as a result we have two instances of
> > the same VM running on 2 different hosts, the source- and the
> > destination host. What we observe now is that the exclusive lock of the
> > RBD disk image moves between these two clients periodically (every few
> > minutes the owner flips).
>
> Hi Frank,
>
> If you are talking about RBD exclusive lock feature ("exclusive-lock"
> under "features" in "rbd info" output) then this is expected. This
> feature provides automatic cooperative lock transitions between clients
> to ensure that only a single client is writing to the image at any
> given time.
> It's there to protect internal per-image data structures
> such as the object map, the journal or the client-side PWL (persistent
> write log) cache from concurrent modifications in case the image is
> opened by two or more clients. The name is confusing but it's NOT
> about preventing other clients from opening and writing to the image.
> Rather it's about serializing those writes.
>

I remember asking about this quite some time ago as well. Maybe this is helpful:

https://www.wogri.at/scripts/ceph-libvirt-locking/
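
For illustration only, here is a minimal Python sketch of the two ideas discussed in this thread: checking whether an image has the "exclusive-lock" feature enabled, and working out which duplicate VM instances would need to be fenced after a failed live-migration. It is not the script behind the link above. The helper names (has_exclusive_lock, stale_hosts), the sample host names and the image spec are hypothetical, and the sketch assumes that "rbd info --format json" prints a JSON object containing a "features" list.

```python
import json
import subprocess
from typing import List


def rbd_info_json(image_spec: str) -> str:
    """Return the raw JSON printed by `rbd info --format json <pool>/<image>`.

    Needs the rbd CLI and access to the cluster; not called at import time.
    """
    result = subprocess.run(
        ["rbd", "info", "--format", "json", image_spec],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def has_exclusive_lock(info_json: str) -> bool:
    """True if "exclusive-lock" is listed under "features" (assumed JSON shape)."""
    features = json.loads(info_json).get("features", [])
    return "exclusive-lock" in features


def stale_hosts(running_on: List[str], intended_host: str) -> List[str]:
    """Hosts that run a copy of the VM but are not the intended owner.

    Hypothetical helper: deciding which host is the intended one is the
    hard part and has to come from outside (e.g. the migration tooling).
    """
    return [host for host in running_on if host != intended_host]
```

On a real host one would pass the output of rbd_info_json("pool/image") to has_exclusive_lock, and then hard-stop the domain on every host returned by stale_hosts, for example with "virsh destroy". As the quoted reply points out, the exclusive-lock feature only serializes writes, so the decision about which instance must die cannot be derived from the lock owner alone.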