Re: Global power failure, OpenStack Nova/libvirt/KVM, and Ceph RBD locks

On Tue, Nov 19, 2019 at 2:49 PM Florian Haas <florian@xxxxxxxxxxxxxx> wrote:
>
> On 19/11/2019 20:03, Jason Dillaman wrote:
> > On Tue, Nov 19, 2019 at 1:51 PM shubjero <shubjero@xxxxxxxxx> wrote:
> >>
> >> Florian,
> >>
> >> Thanks for posting about this issue. This is something (stale
> >> exclusive locks) that we have been experiencing more frequently with
> >> our OpenStack and Ceph cloud, as our datacentre has recently had
> >> reliability issues with power and cooling that caused several
> >> unexpected shutdowns.
> >>
> >> At this point we are on Ceph Mimic 13.2.6. Reading through this
> >> thread and related links, I just wanted to confirm whether I have the
> >> correct caps for our cinder clients, as listed below; we have
> >> upgraded through many major Ceph versions over the years and I'm sure
> >> a lot of our configs and settings still contain deprecated options.
> >>
> >> client.cinder
> >> key: sanitized==
> >> caps: [mgr] allow r
> >> caps: [mon] profile rbd
> >> caps: [osd] allow class-read object_prefix rbd_children, profile rbd
> >> pool=volumes, profile rbd pool=vms, profile rbd pool=images
> >
> > Only use "profile rbd" for 'mon' and 'osd' caps -- it's documented
> > here [1]. Once you use 'profile rbd', you don't need the extra "allow
> > class-read object_prefix rbd_children" since it is included within the
> > profile (along with other things like support for clone v2). Octopus
> > will also include "profile rbd" for the 'mgr' cap to support the new
> > functionality in the "rbd_support" manager module (like running "rbd
> > perf image top" w/o the admin caps).
> >
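
To make that concrete, updating an existing user to those caps would
look roughly like the following. This is only a sketch: the pool names
are taken from the listing above, so adjust them to your environment,
and keep in mind that 'ceph auth caps' replaces the entire cap set, so
all three sections need to be spelled out.

    ceph auth caps client.cinder \
        mon 'profile rbd' \
        osd 'profile rbd pool=volumes, profile rbd pool=vms, profile rbd pool=images' \
        mgr 'allow r'
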
> >> From what I read, the blacklist permission was something that was
> >> supposed to be applied before the Luminous upgrade, but once you are
> >> on Luminous or later it's no longer needed, assuming you have
> >> switched to using the rbd profile.
> >
> > Correct. The "blacklist" permission was an intermediate state for the
> > upgrade: your older OSDs wouldn't have support for "profile rbd" yet,
> > but Luminous OSDs started to enforce caps on the 'blacklist add' op so
> > that rogue users w/ read-only permissions couldn't just blacklist all
> > clients. Once you are at Luminous or later, you can just use the
> > profile.
>
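
For anyone else following along: if I remember the old documentation
correctly, the intermediate mon cap used during the upgrade looked
roughly like this, and it can be dropped in favour of the plain profile
once everything is on Luminous or later.

    mon 'allow r, allow command "osd blacklist"'
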
> OK, great. This gives me something to start with for a doc patch.
> Thanks! However, I'm still curious about this bit:
>
> >> On Fri, Nov 15, 2019 at 11:05 AM Paul Emmerich <paul.emmerich@xxxxxxxx> wrote:
> >>> * This is unrelated to openstack and will happen with *any* reasonably
> >>> configured hypervisor that uses exclusive locking
>
> What, exactly, is the "reasonably configured hypervisor" here; in other
> words, what is it that grabs and releases this lock? It's evidently not
> Nova that does this, but is it libvirt or Qemu/KVM, and if so, what
> magic in there makes this happen, and what "reasonable configuration"
> influences it?

librbd and krbd perform this logic when the exclusive-lock feature is
enabled. In this case, librbd sees that the previous lock owner is
dead or missing, but since that owner never cleanly closed the image,
librbd cannot simply steal the lock: it first has to ensure the old
owner cannot come back from the dead and issue future writes against
the RBD image, which it does by blacklisting that client from the
cluster.
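
If you want to see this from the outside, or inspect things by hand
after an outage, something along these lines should show it. This is
purely illustrative; substitute your own pool and image names:

    rbd status volumes/<image>     # current watchers on the image, if any
    rbd lock ls volumes/<image>    # the (possibly stale) exclusive lock and its owner
    ceph osd blacklist ls          # client addresses currently blacklisted
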

> Thanks again!
>
> Cheers,
> Florian
>


-- 
Jason

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


