Hi,

I'm currently investigating a case where a Ceph cluster ended up with inconsistent clone information.

Here's what I did to quickly reproduce:

* Created a new cluster (tested in hammer 0.94.6 and jewel 10.2.3)
* Created two pools: test and rbd
* Created a base image in pool test, created a snapshot, protected it, and created a clone of this snapshot in pool rbd:

# rbd -p test create --size 10 --image-format 2 base
# rbd -p test snap create base@base
# rbd -p test snap protect base@base
# rbd clone test/base@base rbd/destination

* Created a new user called "test" with rwx permissions to the rbd pool only:

caps: [mon] allow r
caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=rbd

* Using this newly created user, I removed the cloned image in the rbd pool. The command printed errors but ultimately removed the image:

# rbd --id test -p rbd rm destination
2016-12-21 11:50:03.758221 7f32b7459700 -1 librbd::image::OpenRequest: failed to retreive name: (1) Operation not permitted
2016-12-21 11:50:03.758288 7f32b6c58700 -1 librbd::image::RefreshParentRequest: failed to open parent image: (1) Operation not permitted
2016-12-21 11:50:03.758312 7f32b6c58700 -1 librbd::image::RefreshRequest: failed to refresh parent image: (1) Operation not permitted
2016-12-21 11:50:03.758333 7f32b6c58700 -1 librbd::image::OpenRequest: failed to refresh image: (1) Operation not permitted
2016-12-21 11:50:03.759366 7f32b6c58700 -1 librbd::ImageState: failed to open image: (1) Operation not permitted
Removing image: 100% complete...done.

At this point there's no cloned image, but the original snapshot still has a reference to it:

# rbd -p test snap unprotect base@base
2016-12-21 11:53:47.359060 7fee037fe700 -1 librbd::SnapshotUnprotectRequest: cannot unprotect: at least 1 child(ren) [29b0238e1f29] in pool 'rbd'
2016-12-21 11:53:47.359678 7fee037fe700 -1 librbd::SnapshotUnprotectRequest: encountered error: (16) Device or resource busy
2016-12-21 11:53:47.359691 7fee037fe700 -1 librbd::SnapshotUnprotectRequest: 0x7fee39ae9340 should_complete_error: ret_val=-16
2016-12-21 11:53:47.360627 7fee037fe700 -1 librbd::SnapshotUnprotectRequest: 0x7fee39ae9340 should_complete_error: ret_val=-16
rbd: unprotecting snap failed: (16) Device or resource busy

# rbd -p test children base@base
rbd: listing children failed: (2) No such file or directory
2016-12-21 11:53:08.716987 7ff2b2eaad80 -1 librbd: Error looking up name for image id 29b0238e1f29 in pool rbd

Any ideas on how this could be fixed?

Thanks,
Bartek

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com