Re: Global power failure, OpenStack Nova/libvirt/KVM, and Ceph RBD locks


Hi Florian,

On 15/11/2019 12:32, Florian Haas wrote:

I received this off-list but then subsequently saw this message pop up
in the list archive, so I hope it's OK to reply on-list?

Of course, I just clicked the wrong reply button the first time.

So that cap was indeed missing, thanks for the hint! However, I am still
trying to understand how this is related to the issue we saw.

Exactly the same thing happened to me a week or so ago. A compute node lost power, and once power was restored the VMs would start booting but fail early on as soon as they tried to write.

My key was also missing that cap, and adding it and then resetting the affected VMs was the only action I needed to sort things out. I didn't need to go around removing locks by hand as you did. As you say, waiting 30 seconds didn't do any good, so it doesn't appear to be a watcher-timeout issue.
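For anyone who does end up clearing locks by hand, a rough sketch of the commands involved (the pool, image, lock ID, and client addresses below are placeholders, not values from this thread):

```shell
# List any locks held on the image; this prints a lock ID and the
# locker (e.g. client.4123) for each entry.
rbd lock ls mypool/myimage

# Remove a stale lock, using the lock ID and locker shown above.
rbd lock rm mypool/myimage "auto 140339841909792" client.4123

# If the dead client was blacklisted during cleanup, the entry can be
# removed once everything is healthy again.
ceph osd blacklist rm 192.168.0.10:0/3710147553
```

These need to run against the cluster with a suitably privileged key, so treat them as a sketch to adapt rather than something to paste verbatim.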

This was mentioned in the release notes for Luminous[1]. I'd missed it too, because I redeployed on Nautilus instead and skipped these upgrade steps:

<snip>

Verify that all RBD client users have sufficient caps to blacklist other client users. RBD client users with only "allow r" monitor caps should be updated as follows:

# ceph auth caps client.<ID> mon 'allow r, allow command "osd blacklist"' osd '<existing OSD caps for user>'

<snip>
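Filled in with a hypothetical client name and existing OSD caps (check what `ceph auth get` reports for your own key and substitute accordingly), that would look something like:

```shell
# Inspect the current caps on the key that Nova/libvirt uses.
ceph auth get client.nova

# Add the "osd blacklist" mon cap while preserving the existing
# osd caps verbatim (pool names here are made up for illustration).
ceph auth caps client.nova \
    mon 'allow r, allow command "osd blacklist"' \
    osd 'allow rwx pool=vms, allow rwx pool=volumes'
```

Note that `ceph auth caps` replaces the full cap set for the entity, so the existing osd caps must be repeated exactly or they will be lost.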

Simon

[1] https://docs.ceph.com/docs/master/releases/luminous/#upgrade-from-jewel-or-kraken
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

