On Tue, Nov 19, 2019 at 4:31 PM Florian Haas <florian@xxxxxxxxxxxxxx> wrote:
>
> On 19/11/2019 22:19, Jason Dillaman wrote:
> > On Tue, Nov 19, 2019 at 4:09 PM Florian Haas <florian@xxxxxxxxxxxxxx> wrote:
> >>
> >> On 19/11/2019 21:32, Jason Dillaman wrote:
> >>>> What, exactly, is the "reasonably configured hypervisor" here, in other
> >>>> words, what is it that grabs and releases this lock? It's evidently not
> >>>> Nova that does this, but is it libvirt, or Qemu/KVM, and if so, what
> >>>> magic in there makes this happen, and what "reasonable configuration"
> >>>> influences this?
> >>>
> >>> librbd and krbd perform this logic when the exclusive-lock feature is
> >>> enabled.
> >>
> >> Right. So the "reasonable configuration" applies to the features they
> >> enable when they *create* an image, rather than what they do to the
> >> image at runtime. Is that fair to say?
> >
> > The exclusive-lock ownership is enforced at image use (i.e. when the
> > feature is a property of the image, not specifically just during the
> > action of enabling the property) -- so this implies "what they do to
> > the image at runtime"
>
> OK, gotcha.
>
> >>> In this case, librbd sees that the previous lock owner is
> >>> dead / missing, but before it can steal the lock (since librbd did not
> >>> cleanly close the image), it needs to ensure it cannot come back from
> >>> the dead to issue future writes against the RBD image by blacklisting
> >>> it from the cluster.
> >>
> >> Thanks. I'm probably sounding dense here, sorry for that, but yes, this
> >> makes perfect sense to me when I want to fence a whole node off —
> >> however, how exactly does this work with VM recovery in place?
> >
> > How would librbd / krbd know under what situation a VM was being
> > "recovered"? Should librbd be expected to integrate w/ IPMI devices
> > where the VM is being run or w/ Zabbix alert monitoring to know that
> > this was a power failure so don't expect that the lock owner will come
> > back up? The safe and generic thing for librbd / krbd to do in this
> > situation is to just blacklist the old lock owner to ensure it cannot
> > talk to the cluster. Obviously in the case of a physically failed
> > node, that won't ever happen -- but I think we can all agree this is
> > the sane recovery path that covers all bases.
>
> Oh totally, I wasn't arguing it was a bad idea for it to do what it
> does! I just got confused by the fact that our mon logs showed what
> looked like a (failed) attempt to blacklist an entire client IP address.

There should have been an associated client nonce after the IP
address to uniquely identify which client connection is blacklisted
-- something like "1.2.3.4:0/5678". Let me know if that's not the
case since that would definitely be wrong.

> > Yup, with the correct permissions librbd / rbd will be able to
> > blacklist the lock owner, break the old lock, and acquire the lock
> > themselves for R/W operations -- and the operator would not need to
> > intervene.
>
> Ack. Thanks!
>
> Cheers,
> Florian
>

--
Jason
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
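
P.S. A minimal sketch of the manual equivalent of what librbd does here,
and of the caps that let it fence automatically. The pool "vms", image
"instance-0001", client name "client.nova", locker "client.4567", and
address "1.2.3.4:0/5678" below are placeholders, not values from the
thread above:

  # Show the current exclusive-lock holder of an image
  rbd lock ls vms/instance-0001

  # Show current blacklist entries; each entry is addr:port/nonce,
  # e.g. "1.2.3.4:0/5678", not a bare IP address
  ceph osd blacklist ls

  # Manually fence a dead lock owner and break its stale lock
  # (normally librbd/krbd does this itself when acquiring the lock)
  ceph osd blacklist add 1.2.3.4:0/5678
  rbd lock rm vms/instance-0001 <lock-id> client.4567

  # Caps that allow librbd to blacklist the old owner on its own --
  # the "profile rbd" mon cap includes the blacklist permission
  ceph auth get-or-create client.nova \
      mon 'profile rbd' osd 'profile rbd pool=vms'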