I've proposed some new radosgw-admin commands for both identifying and fixing these leftover index entries in this open PR: https://github.com/ceph/ceph/pull/51700

Cory

________________________________
From: Mark Nelson <mark.nelson@xxxxxxxxx>
Sent: Wednesday, May 31, 2023 10:42 AM
To: ceph-users@xxxxxxx <ceph-users@xxxxxxx>
Subject: Re: RGW versioned bucket index issues

Thank you Cory for this excellent write-up! A quick question: is there a simple method to find and, more importantly, fix the zombie index entries and OLH objects? I saw in https://tracker.ceph.com/issues/59663 that there is an example using radosgw-admin to examine the lifecycle/marker/garbage collection info, but that looks a little cumbersome.

Mark

On 5/31/23 05:16, Cory Snyder wrote:
> Hi all,
>
> I wanted to call attention to some RGW issues that we've observed on a
> Pacific cluster over the past several weeks. The problems relate to versioned
> buckets and index entries that can be left behind after transactions complete
> abnormally. The scenario is multi-faceted and we're still investigating some of
> the details, but I wanted to provide a big-picture summary of what we've found
> so far.
> It looks like most of these issues should be reproducible on versions
> before and after Pacific as well. I'll enumerate the individual issues below:
>
> 1. PUT requests during reshard of a versioned bucket fail with 404 and leave
> behind dark data
>
> Tracker: https://tracker.ceph.com/issues/61359
>
> 2. Cancelled bucket index ops can leave behind zombie index entries
>
> The fix for this one was merged a few months ago and did make the v16.2.13
> release, but in our case we had billions of extra index entries by the time
> that we had upgraded to the patched version.
>
> Tracker: https://tracker.ceph.com/issues/58673
>
> 3. Issuing a delete for a key that already has a delete marker as the current
> version leaves behind index entries and OLH objects
>
> Note that the tracker's original description describes the problem a bit
> differently, but I've clarified the nature of the issue in a comment.
>
> Tracker: https://tracker.ceph.com/issues/59663
>
> The extra index entries and OLH objects left behind by these sorts of issues
> are obviously annoying in that they unnecessarily consume space, but we've
> found that they can also cause severe performance degradation for bucket
> listings, lifecycle processing, and other ops, indirectly, due to higher OSD
> latencies.
>
> The reason for the performance impact is that bucket listing calls must
> repeatedly perform additional OSD ops until they find the requisite number
> of entries to return. The OSD cls method for bucket listing also does its own
> internal iteration for the same purpose.
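Issue 2 above can be made concrete with a small Python sketch. This is not RGW code; the class and method names (`BucketIndexShard`, `prepare`/`complete`/`cancel`) are purely illustrative of the two-phase bucket index update, where a cancel that never lands strands a pending ("zombie") entry:

```python
# Conceptual sketch (not RGW code): a two-phase bucket index update can
# strand a "zombie" entry if the cancel op is lost before it is applied.

class BucketIndexShard:
    def __init__(self):
        self.entries = {}  # key -> state: "pending" or "complete"

    def prepare(self, key):
        # Phase 1: record an in-progress entry before the data write.
        self.entries[key] = "pending"

    def complete(self, key):
        # Phase 2: mark the entry live once the data write succeeds.
        self.entries[key] = "complete"

    def cancel(self, key, delivered=True):
        # Cancel should remove the pending entry; if the op is dropped
        # (client death, a race, etc.), the pending entry stays behind.
        if delivered:
            self.entries.pop(key, None)

    def valid_entries(self):
        return [k for k, s in self.entries.items() if s == "complete"]

shard = BucketIndexShard()
shard.prepare("obj-a"); shard.complete("obj-a")
shard.prepare("obj-b"); shard.cancel("obj-b", delivered=False)  # lost cancel

print(shard.valid_entries())  # ['obj-a']
print(len(shard.entries))     # 2 -- the zombie 'obj-b' entry still consumes space
```

The zombie entry is invisible to listings but still occupies omap space, which is why billions of them can accumulate unnoticed.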
> Since these entries are invalid, they
> are skipped. In the case that we observed, where some of our bucket indexes
> were filled with a sea of contiguous leftover entries, the process of
> continually iterating over and skipping invalid entries caused enormous read
> amplification. I believe that the following tracker describes symptoms
> related to the same issue: https://tracker.ceph.com/issues/59164
>
> Note that this can also cause LC processing to repeatedly fail in cases where
> there are enough contiguous invalid entries, since the OSD cls code eventually
> gives up and returns an error that isn't handled.
>
> The severity of these issues likely varies greatly based upon client behavior.
> If anyone has experienced similar problems, we'd love to hear about how
> they've manifested for you so that we can be more confident that we've
> plugged all of the holes.
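The read amplification described above can be illustrated with a toy model (pure Python; in reality this iteration happens inside the OSD cls code, and the numbers here are made up for illustration):

```python
# Toy model of bucket listing over an index containing invalid entries.
# A listing that needs `want` valid entries keeps scanning until it finds
# them, so a long contiguous run of invalid entries multiplies the raw
# reads performed per listing request.

def list_entries(index, start, want):
    """Return up to `want` valid keys plus the number of raw entries scanned."""
    out, scanned = [], 0
    for entry in index[start:]:
        scanned += 1
        if entry["valid"]:
            out.append(entry["key"])
            if len(out) == want:
                break
    return out, scanned

# 1,000 contiguous zombie entries followed by 10 real objects.
index = [{"key": f"zombie-{i}", "valid": False} for i in range(1000)]
index += [{"key": f"obj-{i}", "valid": True} for i in range(10)]

names, scanned = list_entries(index, 0, want=10)
print(len(names), scanned)  # 10 1010 -- 10 results cost 1010 raw reads
```

With billions of contiguous leftover entries, the same pattern turns a single listing or LC pass into an enormous amount of omap iteration, which is consistent with the elevated OSD latencies described above.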
>
> Thanks,
>
> Cory Snyder
> 11:11 Systems
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx

--
Best Regards,
Mark Nelson
Head of R&D (USA)

Clyso GmbH
p: +49 89 21552391 12
a: Loristraße 8 | 80335 München | Germany
w: https://clyso.com | e: mark.nelson@xxxxxxxxx

We are hiring: https://www.clyso.com/jobs/

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx