On Fri, Jan 27, 2023 at 11:21 AM Frank Schilder <frans@xxxxxx> wrote:
>
> Hi Mark,
>
> thanks a lot! This seems to address the issue we observe, at least to
> a large degree.
>
> I believe we had 2 VMs running after a failed live-migration as well
> and in this case it doesn't seem like it will help. Maybe it's
> possible to add a bit of logic for this case as well (similar to
> fencing). My experience was that the write lock moves to the target
> VM and then there is a reasonable time interval before it is handed
> back. This might be a sufficient window of opportunity to hard-kill
> a VM that should not be running before it acquires the write lock
> again.
>
> Thanks for that link! A script template like that could actually be
> added to the Ceph documentation under RBD locks. It seems to be a
> really important and useful use case for image locking.

Hi Frank,

The script at [1] looks a bit suspicious to me because it uses shared
locking (--shared option) and checks whether the image is locked by
grepping "rbd lock list" output.  There are a bunch of VM states
("migrate", "prepare", etc.) and a couple of different lock IDs are
employed ("migrate", "startup", "libvirt") so I could be wrong -- such
nasty state transitions may just not be possible in libvirt -- but
considered purely in isolation the following

    function lock {
      rbd=$1
      locktype=$2
      ...
      rbd lock add $rbd $locktype --shared libvirt
    }

    if is_locked $rbd libvirt
    then
      ...
      exit 257
    fi
    lock $rbd libvirt
    < presumably VM is allowed to start >

could easily allow two VMs to start on the same $rbd image if invoked
in parallel on two different nodes: both nodes can pass the is_locked
check before either of them has taken the lock, and because the lock
is added with --shared both "rbd lock add" calls then succeed.
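Just to illustrate (this is an untested sketch, not something taken
from the script at [1], and it assumes the wrapper is handed the image
spec as its first argument): relying on the exit status of an
exclusive "rbd lock add" would close that window, because the lock
acquisition itself becomes the atomic test instead of a separate
"rbd lock list" grep:

    #!/bin/bash
    # Hypothetical pre-start wrapper: allow the VM to start only if we
    # can take an exclusive advisory lock on its RBD image.
    image=$1          # e.g. "libvirt-pool/vm-disk-1" (assumed argument)
    lock_id=libvirt

    # "rbd lock add" without --shared should fail if any advisory lock
    # already exists on the image, so there is no separate
    # check-then-lock step to race against.
    if rbd lock add "$image" "$lock_id"; then
        echo "acquired advisory lock '$lock_id' on $image, VM may start"
        exit 0
    else
        echo "$image is already locked, refusing to start VM" >&2
        exit 1
    fi

The lock would still need to be released on clean shutdown ("rbd lock
list" to find the locker, then "rbd lock rm") and cleaned up after a
crashed node, which is where the fencing logic mentioned above would
come in.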
For now, I have just updated the documentation at [2] to highlight and
warn about the automatic lock transitions behavior.

[1] https://www.wogri.at/scripts/ceph-libvirt-locking/
[2] https://docs.ceph.com/en/quincy/rbd/rbd-exclusive-locks/

Thanks,

                Ilya

> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Marc <Marc@xxxxxxxxxxxxxxxxx>
> Sent: 26 January 2023 18:44:41
> To: Frank Schilder; 'ceph-users@xxxxxxx'
> Subject: RE: Re: Ceph rbd clients surrender exclusive lock in critical situation
>
> > Hi all,
> >
> > we are observing a problem on a libvirt virtualisation cluster that
> > might come from ceph rbd clients. Something went wrong during execution
> > of a live-migration operation and as a result we have two instances of
> > the same VM running on 2 different hosts, the source- and the
> > destination host. What we observe now is that the exclusive lock of the
> > RBD disk image moves between these two clients periodically (every few
> > minutes the owner flips).
> >
> > Hi Frank,
> >
> > If you are talking about the RBD exclusive lock feature ("exclusive-lock"
> > under "features" in "rbd info" output) then this is expected. This
> > feature provides automatic cooperative lock transitions between clients
> > to ensure that only a single client is writing to the image at any
> > given time. It's there to protect internal per-image data structures
> > such as the object map, the journal or the client-side PWL (persistent
> > write log) cache from concurrent modifications in case the image is
> > opened by two or more clients. The name is confusing but it's NOT
> > about preventing other clients from opening and writing to the image.
> > Rather it's about serializing those writes.
>
> I can remember asking this also quite some time ago. Maybe this is helpful:
>
> https://www.wogri.at/scripts/ceph-libvirt-locking/

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx