Hi,

As a follow-up, this PR for librbd seems to be what needs to be applied to
krbd too. As said in the PR, the bug is very much reproducible after Jason
Dillaman's suggestion.

Regards,
Florian

Florian Margaine <florian@xxxxxxxxxxx> writes:

> Hi,
>
> We're hitting an odd issue on our ceph cluster:
>
> - We have machine1 mapping an exclusive-lock RBD.
> - Machine2 wants to take a snapshot of the RBD, but fails to take the lock.
>
> Stracing the rbd snap process on machine2 shows it looping on sending
> "lockget" commands, without ever moving forward.
>
> In rbd status, we see that machine1 is a watcher on the image, which is
> expected. What is not expected is that the rbd snap process can't get the
> lock.
>
> This commit deployed in 10.2.10, which we are using, sounds related:
> https://github.com/ceph/ceph/commit/475dda114a7e25b43dc9066b9808a64fc0c6dc89
>
> But there isn't the equivalent in ceph-client's code, which we would expect
> too. That said, I don't have a full understanding, so I might be off-base
> there.
>
> Am I wrong in expecting the equivalent in ceph-client's code? (aka Linux
> kernel) Am I completely off-base as to what is wrong there? Can I provide
> any additional information to help debugging?
>
> Regards,
> Florian
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com