Hi Abhishek,

Your PR is practical for anyone who cannot delete a non-empty bucket. But I think there is still one scenario in which the bucket cannot be deleted: orphan objects leaked by a multipart upload.

Upload a multipart object:

1. Upload part 1 with prefix 2~-DUZmxVbiv9dBycBdci2iMhiKEEUv-5.
2. The connection between the rgw client and server is cut off. I do not know whether part 1 was uploaded completely, so I re-upload part 1, and it is attached to a new prefix, eOntuNHW8UdvnpbLl9UAdYGuWrL9HPH (assigned by the rgw server).
3. The connection between the rgw client and server is cut off again, so I re-upload part 1 with a new prefix, d9H1FWnoOwmr3IAejtYQJ2hyIjsUA7U.
4. Upload part 2 and the remaining parts.
5. Finally, complete the upload.

The objects with the names below are leaked, and I cannot see them through an s3 client. (My upload size is 15MB and the stripe size is 4MB.)

center-master.4439.1__multipart_bigfile.2~-DUZmxVbiv9dBycBdci2iMhiKEEUv-5.1
center-master.4439.1__shadow_bigfile.2~-DUZmxVbiv9dBycBdci2iMhiKEEUv-5.1_1
center-master.4439.1__shadow_bigfile.2~-DUZmxVbiv9dBycBdci2iMhiKEEUv-5.1_2
center-master.4439.1__shadow_bigfile.2~-DUZmxVbiv9dBycBdci2iMhiKEEUv-5.1_3
center-master.4439.1__multipart_bigfile.eOntuNHW8UdvnpbLl9UAdYGuWrL9HPH.1
center-master.4439.1__shadow_bigfile.eOntuNHW8UdvnpbLl9UAdYGuWrL9HPH.1_1
center-master.4439.1__shadow_bigfile.eOntuNHW8UdvnpbLl9UAdYGuWrL9HPH.1_2
center-master.4439.1__shadow_bigfile.eOntuNHW8UdvnpbLl9UAdYGuWrL9HPH.1_3

The above upload sequence really occurs in our production application, and we have found a lot of leaked space. How can I reclaim the leaked space? Any idea is appreciated.
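The object names above follow the rgw striping scheme: the first stripe of a part is a `__multipart_` head object and the remaining stripes are `__shadow_` objects. A minimal sketch (a hypothetical helper, not actual rgw code) that models the naming, assuming the leaked part is the full 15MB with a 4MB stripe size, reproduces the listing — one `__multipart_` object plus three `__shadow_` objects per abandoned upload-id prefix:

```python
import math

def rados_objects_for_part(bucket_marker, obj_name, upload_id, part_num,
                           part_size, stripe_size=4 * 1024 * 1024):
    """Model of the rados objects rgw creates for one multipart part:
    the first stripe is a __multipart_ head object, each remaining
    stripe is a __shadow_ object suffixed _1, _2, ..."""
    names = [f"{bucket_marker}__multipart_{obj_name}.{upload_id}.{part_num}"]
    num_shadows = math.ceil(part_size / stripe_size) - 1
    for i in range(1, num_shadows + 1):
        names.append(f"{bucket_marker}__shadow_{obj_name}.{upload_id}.{part_num}_{i}")
    return names

# Part 1 of the first abandoned upload: 15MB / 4MB stripes -> 4 rados objects
leaked = rados_objects_for_part("center-master.4439.1", "bigfile",
                                "2~-DUZmxVbiv9dBycBdci2iMhiKEEUv-5",
                                1, 15 * 1024 * 1024)
```

Since an abandoned upload-id prefix never appears in the completed object's manifest, objects carrying such a prefix are exactly the ones that leak; orphan scans work by comparing the rados listing against the upload ids rgw still knows about.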
________________________________
penglaiyxy

From: Abhishek Varshney
Date: 2017-06-13 00:17
To: Ceph Development
Subject: rgw: leak with incomplete multiparts (was: Request for review)

Reviving an old thread by a colleague on rgw leaking rados objects: the PR submitted earlier [1] failed the teuthology rgw run because radosgw-admin failed to remove a user with the --purge-data flag. While trying to root-cause the issue, it turned out that incomplete multiparts need to be aborted when doing bucket rm with --purge-data. Here is the new PR (https://github.com/ceph/ceph/pull/15630), which handles incomplete multiparts with the behaviour given below:

* radosgw-admin user/bucket rm with incomplete multiparts returns a bucket-not-empty error.
* radosgw-admin user/bucket rm --purge-data with incomplete multiparts aborts the pending multiparts and then deletes the bucket.
* The S3 delete-bucket API with incomplete multiparts returns a bucket-not-empty error. The expectation here is that the user either completes or cancels all pending multipart uploads before deleting the bucket.

Requesting review on this PR.

PS: The check for an empty bucket index here [2] in the previous PR [1] has been removed, as we found instances of inconsistent bucket indexes with stale entries that had no corresponding objects in the data pool. The check would have prevented the deletion of an empty bucket with such an inconsistent index. I am not sure how to reproduce such a scenario, though.

[1] https://github.com/ceph/ceph/pull/10920
[2] https://github.com/ceph/ceph/pull/10920/files#diff-c30965955342b98393b73be699f4e355R7349

Thanks
Abhishek

On Mon, Sep 12, 2016 at 4:23 AM, Praveen Kumar G T (Cloud Platform) <praveen.gt@xxxxxxxxxxxx> wrote:
> Definitions
>
> Orphaned objects: Orphaned objects are created when a multipart upload has uploaded some or all of the parts without executing multipart cancel or multipart complete.
> An incomplete multipart upload is neither created nor destroyed; it is in an orphan state.
>
> Leaked objects: Objects that are not deleted on the ceph cluster but are assumed to be deleted by the s3 client. These objects cannot be accessed by s3 clients but still occupy space in the ceph cluster.
>
> Problem
>
> An s3 bucket cannot be deleted while there are objects in it; the bucket deletion command fails with the error BucketNotEmpty. The objects in the bucket can be listed using any of the s3 clients, but orphaned objects present in the bucket will not appear in the normal listing operations of the s3 clients. If the bucket is deleted while there are orphaned objects in it, they end up being leaked. They continue to use space in the ceph cluster even though no s3 client can access them, and this space is not accounted under the radosgw user account either.
>
> Tracker link
>
> http://tracker.ceph.com/issues/17164
>
> Fix
>
> The fix avoids deletion of buckets when there are orphaned objects in the bucket, so the bucket deletion command now returns BucketNotEmpty when there are orphaned objects as well.
>
> Pull request
>
> https://github.com/ceph/ceph/pull/10920
>
> Can somebody please review the fix? We have already verified the fix locally.
>
> Regards,
> Praveen.
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
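Taken together, the deletion rules discussed in the two PRs above can be summarised in a small sketch. This is a pure-Python toy model of the described behaviour (the names `bucket_rm`, `visible_objects`, and `incomplete_uploads` are illustrative, not the real rgw internals in PR 15630):

```python
class BucketNotEmpty(Exception):
    """Stands in for the S3 BucketNotEmpty error."""

def bucket_rm(visible_objects, incomplete_uploads, purge_data=False):
    """Toy model: a bucket with pending multiparts or visible objects
    refuses deletion unless --purge-data is given, in which case the
    pending multiparts are aborted and the objects purged first."""
    if incomplete_uploads:
        if not purge_data:
            raise BucketNotEmpty("pending multipart uploads present")
        incomplete_uploads.clear()   # --purge-data: abort pending multiparts
    if visible_objects:
        if not purge_data:
            raise BucketNotEmpty("bucket still has objects")
        visible_objects.clear()      # --purge-data: purge the objects
    return "bucket deleted"

# Without --purge-data a pending upload blocks deletion; with it, deletion succeeds.
result = bucket_rm([], ["2~-DUZmxVbiv9dBycBdci2iMhiKEEUv-5"], purge_data=True)
```

The point of the design is that plain `bucket rm` never silently strands rados objects: either the caller explicitly opts into purging, or the command fails loudly while the bucket still owns its data.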