Re: Ceph rbd clients surrender exclusive lock in critical situation

Hi Marc,

thanks a lot! This seems to address the issue we observe, at least to a large degree.

I believe we also had 2 VMs running after a failed live-migration, and in that case it doesn't seem like it will help. Maybe it's possible to add a bit of logic for this case as well (similar to fencing). My experience was that the write lock moves to the target VM and then there is a reasonable time interval before it is handed back. This might be a sufficient window of opportunity to hard-kill a VM that should not be running before it acquires the write lock again.
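
To make the idea concrete, here is a rough Python sketch of the decision only. Everything in it is an assumption for illustration: the names LockObservation and should_hard_kill are made up, and how a host learns the current lock owner and the host where the VM is supposed to run is left open (it would have to come from the cluster manager and from something like "rbd lock ls").

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LockObservation:
    """One sample of who currently holds the exclusive lock (hypothetical type)."""
    owner_host: Optional[str]  # host of the current lock owner, None if unlocked


def should_hard_kill(local_host: str, intended_host: str,
                     obs: LockObservation) -> bool:
    """Decide whether the local VM instance should be hard-killed right now.

    Kill only if this host is NOT where the VM is supposed to run AND the
    lock is currently held by another host, i.e. we are inside the window
    before the lock is handed back to the local (unwanted) instance.
    """
    if local_host == intended_host:
        return False  # never kill the legitimate instance
    if obs.owner_host is None:
        return False  # nobody holds the lock, no evidence of a duplicate
    return obs.owner_host != local_host
```

The check is deliberately conservative: if the unwanted instance currently holds the lock itself, it waits for the next sample instead of killing mid-write.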

Thanks for that link! A script template like that could actually be added to the ceph documentation under rbd locks. It seems to be a really important and useful use case for image locking.
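
As a rough idea of what such a template could contain, here is a minimal Python sketch built around the advisory lock commands "rbd lock add" and "rbd lock rm". It is not the script from the link; the function names, the image spec and the lock id are placeholders I made up, and a real hook would also need to handle stale locks left behind by a crashed host.

```python
import subprocess
from typing import Callable, List


def lock_add_cmd(image_spec: str, lock_id: str) -> List[str]:
    """Command line that takes an advisory lock on an image."""
    return ["rbd", "lock", "add", image_spec, lock_id]


def lock_rm_cmd(image_spec: str, lock_id: str, locker: str) -> List[str]:
    """Command line that releases a lock; locker is the client shown by 'rbd lock ls'."""
    return ["rbd", "lock", "rm", image_spec, lock_id, locker]


def try_acquire(image_spec: str, lock_id: str,
                run: Callable = subprocess.run) -> bool:
    """Return True if the lock was taken, False if the image is already locked.

    A start hook would refuse to start the VM when this returns False.
    The run parameter exists only so the function can be tested without Ceph.
    """
    result = run(lock_add_cmd(image_spec, lock_id), check=False)
    return result.returncode == 0
```

Note that such a lock is purely advisory: it only makes a second "rbd lock add" fail, so it protects the image only if every VM start path goes through the hook.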

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Marc <Marc@xxxxxxxxxxxxxxxxx>
Sent: 26 January 2023 18:44:41
To: Frank Schilder; 'ceph-users@xxxxxxx'
Subject: RE:  Re: Ceph rbd clients surrender exclusive lock in critical situation

> >
> > Hi all,
> >
> > we are observing a problem on a libvirt virtualisation cluster that
> > might come from ceph rbd clients. Something went wrong during execution
> > of a live-migration operation and as a result we have two instances of
> > the same VM running on 2 different hosts, the source- and the
> > destination host. What we observe now is that the exclusive lock of the
> > RBD disk image moves between these two clients periodically (every few
> > minutes the owner flips).
>
> Hi Frank,
>
> If you are talking about RBD exclusive lock feature ("exclusive-lock"
> under "features" in "rbd info" output) then this is expected.  This
> feature provides automatic cooperative lock transitions between clients
> to ensure that only a single client is writing to the image at any
> given time.  It's there to protect internal per-image data structures
> such as the object map, the journal or the client-side PWL (persistent
> write log) cache from concurrent modifications in case the image is
> opened by two or more clients.  The name is confusing but it's NOT
> about preventing other clients from opening and writing to the image.
> Rather it's about serializing those writes.
>


I remember asking about this quite some time ago as well. Maybe this is helpful:

https://www.wogri.at/scripts/ceph-libvirt-locking/