Hello!

Our team finally had a chance to take another look at the problem identified by Brian Felton in http://tracker.ceph.com/issues/16767. Basically, if any parts of a multipart upload are retried before an Abort or Complete, the leftover part objects remain on the system, taking up space and still showing up in the accounting reported by “radosgw-admin bucket stats”. The problem is confirmed in Hammer and Jewel.

This past week, we got some experimental code working that removes those parts. I am not sure whether this code has any unintended consequences, so **I would greatly appreciate reviews of the new tool**! I have
tested it successfully against objects created and leaked in the ceph-demo Docker image for Jewel.

Here is a pull request with the patch: https://github.com/ceph/ceph/pull/17349

Basically, we added a new subcommand for “bucket” called “fixmpleak”. It lists the objects in the “multipart” namespace and identifies those that are not associated with a current .meta file in that list. It then deletes those objects with a delete op, which corrects the accounting and reclaims the space on the OSDs.

This is not a preventative measure, which would be a lot more complicated, but we plan to run this tool hourly against all our buckets to keep things clean.
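
To make the selection step easier to review, here is a minimal Python sketch of the idea only. It is not the code in the pull request, and the entry naming is an assumption made for illustration: I pretend each upload has a "<key>.<upload_id>.meta" entry and each part is named "<key>.<upload_id>.<part_number>". The function name find_orphaned_parts and the sample names are hypothetical, and the real multipart namespace entries may look different.

```python
def find_orphaned_parts(names):
    """Return listing entries that have no matching .meta entry.

    Assumed (hypothetical) naming: "<key>.<upload_id>.meta" for the upload's
    meta entry and "<key>.<upload_id>.<part_number>" for each part.
    """
    meta_suffix = ".meta"
    # Upload prefixes ("<key>.<upload_id>") that still have a .meta entry.
    live = {n[:-len(meta_suffix)] for n in names if n.endswith(meta_suffix)}

    orphans = []
    for n in names:
        if n.endswith(meta_suffix):
            continue  # meta entries themselves are never candidates
        # Drop the trailing ".<part_number>" to get the upload prefix.
        prefix, _, _part = n.rpartition(".")
        if prefix not in live:
            orphans.append(n)
    return orphans


# Made-up listing: upload-A is still in progress, upload-B has no .meta left.
listing = [
    "video.mp4.upload-A.meta",
    "video.mp4.upload-A.1",
    "video.mp4.upload-A.2",
    "video.mp4.upload-B.1",
]
print(find_orphaned_parts(listing))
```

In this sketch, only the entries returned by the function would be passed to the delete op; anything that still has a .meta entry in the same listing is left alone.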