On 19/11/2019 21:32, Jason Dillaman wrote:
>> What, exactly, is the "reasonably configured hypervisor" here, in other
>> words, what is it that grabs and releases this lock? It's evidently not
>> Nova that does this, but is it libvirt, or Qemu/KVM, and if so, what
>> magic in there makes this happen, and what "reasonable configuration"
>> influences this?
>
> librbd and krbd perform this logic when the exclusive-lock feature is
> enabled.

Right. So the "reasonable configuration" refers to the features they
enable when they *create* an image, rather than to anything they do to
the image at runtime. Is that fair to say?

> In this case, librbd sees that the previous lock owner is
> dead / missing, but before it can steal the lock (since librbd did not
> cleanly close the image), it needs to ensure it cannot come back from
> the dead to issue future writes against the RBD image by blacklisting
> it from the cluster.

Thanks. I'm probably sounding dense here, sorry for that, but yes, this
makes perfect sense to me when I want to fence a whole node off — however,
how exactly does this work with VM recovery in place?

From further upthread:

> Semi-relatedly, as I understand it OSD blacklisting happens based either
> on an IP address, or on a socket address (IP:port). While this comes in
> handy in host evacuation, it doesn't help with in-place recovery (see
> question 4 in my original message).
>
> - If the blacklist happens based on IP address alone (and that seems to
>   be what the client is attempting, based on our log messages), then it
>   would break recovery-in-place after a hard reboot altogether.
>
> - Even if the client blacklisted based on an address:port pair, it would
>   be very unlikely, though not impossible, for an RBD client to reconnect
>   from the same source port after the node recovers in place.

Clearly, though, if people set their permissions correctly, this
blacklisting seems to work fine even for recovery-in-place, so I have no
reason to doubt it; I'd just really like to understand the mechanics. :)

Thanks again!

Cheers,
Florian
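
To make the mechanics above concrete, here is a minimal sketch using the
python-rados and python-rbd bindings: it checks whether the exclusive-lock
feature is enabled on an image, lists the current lock holder(s), and dumps
the cluster's blacklist entries, which appear as entity addresses of the
form IP:port/nonce (the detail behind the IP-vs-IP:port question above).
The config path, pool name 'rbd', and image name 'vm-disk' are placeholder
assumptions, not anything from this thread.

    import json

    import rados
    import rbd

    # Connect with the default client identity; conffile path is an assumption.
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()

    try:
        ioctx = cluster.open_ioctx('rbd')           # placeholder pool name
        with rbd.Image(ioctx, 'vm-disk') as image:  # placeholder image name
            # exclusive-lock is a feature bit set on the image (typically at
            # creation time), not something toggled by the hypervisor at runtime.
            if image.features() & rbd.RBD_FEATURE_EXCLUSIVE_LOCK:
                print("exclusive-lock is enabled")

            # Show who currently holds the lock, if anyone; 'addr' is the
            # client's entity address (IP:port/nonce).
            lockers = image.list_lockers()
            for client, cookie, addr in lockers.get('lockers', []):
                print("lock holder: %s at %s (cookie %s)" % (client, addr, cookie))
        ioctx.close()

        # List the addresses the OSDs will refuse I/O from; each entry is
        # again an IP:port/nonce entity address with an expiry time.
        ret, out, err = cluster.mon_command(
            json.dumps({'prefix': 'osd blacklist', 'blacklistop': 'ls'}), b'')
        print(out.decode())
    finally:
        cluster.shutdown()

Comparing the lock holder's address against the blacklist output should show
exactly what gets fenced when librbd steals a lock from a dead client.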