I've proposed some new radosgw-admin commands for both identifying and fixing these leftover index entries in this open PR: https://github.com/ceph/ceph/pull/51700

Cory

________________________________
From: Mark Nelson <mark.nelson@xxxxxxxxx>
Sent: Wednesday, May 31, 2023 10:42 AM
To: ceph-users@xxxxxxx <ceph-users@xxxxxxx>
Subject: Re: RGW versioned bucket index issues

Thank you Cory for this excellent write-up! A quick question: is there a simple method to find and, more importantly, fix the zombie index entries and OLH objects? I saw in https://tracker.ceph.com/issues/59663 that there is an example using radosgw-admin to examine the lifecycle/marker/garbage collection info, but that looks a little cumbersome.

Mark

On 5/31/23 05:16, Cory Snyder wrote:
> Hi all,
>
> I wanted to call attention to some RGW issues that we've observed on a
> Pacific cluster over the past several weeks. The problems relate to versioned
> buckets and index entries that can be left behind after transactions complete
> abnormally. The scenario is multi-faceted and we're still investigating some of
> the details, but I wanted to provide a big-picture summary of what we've found
> so far.
> It looks like most of these issues should be reproducible on versions
> before and after Pacific as well. I'll enumerate the individual issues below:
>
> 1. PUT requests during reshard of a versioned bucket fail with 404 and leave
> behind dark data
>
> Tracker: https://tracker.ceph.com/issues/61359
>
> 2. Cancelled bucket index ops can leave behind zombie index entries
>
> The fix for this one was merged a few months ago and did make the v16.2.13
> release, but in our case we had billions of extra index entries by the time
> that we had upgraded to the patched version.
>
> Tracker: https://tracker.ceph.com/issues/58673
>
> 3. Issuing a delete for a key that already has a delete marker as the current
> version leaves behind index entries and OLH objects
>
> Note that the tracker's original description describes the problem a bit
> differently, but I've clarified the nature of the issue in a comment.
>
> Tracker: https://tracker.ceph.com/issues/59663
>
> The extra index entries and OLH objects left behind by these sorts of issues
> are obviously annoying in that they unnecessarily consume space, but we've
> found that they can also cause severe performance degradation for bucket
> listings, lifecycle processing, and other ops, indirectly, due to higher OSD
> latencies.
>
> The reason for the performance impact is that bucket listing calls must
> repeatedly perform additional OSD ops until they find the requisite number
> of entries to return. The OSD cls method for bucket listing also does its own
> internal iteration for the same purpose.
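Issue 2 above can be made concrete with a small Python sketch. This is not RGW code; the class and method names (`BucketIndexShard`, `prepare`/`complete`/`cancel`) are purely illustrative of the two-phase bucket index update, where a cancel that never lands strands a pending ("zombie") entry:

```python
# Conceptual sketch (not RGW code): a two-phase bucket index update can
# strand a "zombie" entry if the cancel op is lost before it is applied.

class BucketIndexShard:
    def __init__(self):
        self.entries = {}  # key -> state: "pending" or "complete"

    def prepare(self, key):
        # Phase 1: record an in-progress entry before the data write.
        self.entries[key] = "pending"

    def complete(self, key):
        # Phase 2: mark the entry live once the data write succeeds.
        self.entries[key] = "complete"

    def cancel(self, key, delivered=True):
        # Cancel should remove the pending entry; if the op is dropped
        # (client death, a race, etc.), the pending entry stays behind.
        if delivered:
            self.entries.pop(key, None)

    def valid_entries(self):
        return [k for k, s in self.entries.items() if s == "complete"]

shard = BucketIndexShard()
shard.prepare("obj-a"); shard.complete("obj-a")
shard.prepare("obj-b"); shard.cancel("obj-b", delivered=False)  # lost cancel

print(shard.valid_entries())  # ['obj-a']
print(len(shard.entries))     # 2 -- the zombie 'obj-b' entry still consumes space
```

The zombie entry is invisible to listings but still occupies omap space, which is why billions of them can accumulate unnoticed.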
> Since these entries are invalid, they
> are skipped. In the case that we observed, where some of our bucket indexes
> were filled with a sea of contiguous leftover entries, the process of
> continually iterating over and skipping invalid entries caused enormous read
> amplification. I believe that the following tracker describes symptoms
> related to the same issue: https://tracker.ceph.com/issues/59164
>
> Note that this can also cause LC processing to repeatedly fail in cases where
> there are enough contiguous invalid entries, since the OSD cls code eventually
> gives up and returns an error that isn't handled.
>
> The severity of these issues likely varies greatly based upon client behavior.
> If anyone has experienced similar problems, we'd love to hear about how
> they've manifested for you so that we can be more confident that we've
> plugged all of the holes.
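The read amplification described above can be illustrated with a toy model (pure Python; in reality this iteration happens inside the OSD cls code, and the numbers here are made up for illustration):

```python
# Toy model of bucket listing over an index containing invalid entries.
# A listing that needs `want` valid entries keeps scanning until it finds
# them, so a long contiguous run of invalid entries multiplies the raw
# reads performed per listing request.

def list_entries(index, start, want):
    """Return up to `want` valid keys plus the number of raw entries scanned."""
    out, scanned = [], 0
    for entry in index[start:]:
        scanned += 1
        if entry["valid"]:
            out.append(entry["key"])
            if len(out) == want:
                break
    return out, scanned

# 1,000 contiguous zombie entries followed by 10 real objects.
index = [{"key": f"zombie-{i}", "valid": False} for i in range(1000)]
index += [{"key": f"obj-{i}", "valid": True} for i in range(10)]

names, scanned = list_entries(index, 0, want=10)
print(len(names), scanned)  # 10 1010 -- 10 results cost 1010 raw reads
```

With billions of contiguous leftover entries, the same pattern turns a single listing or LC pass into an enormous amount of omap iteration, which is consistent with the elevated OSD latencies described above.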
>
> Thanks,
>
> Cory Snyder
> 11:11 Systems
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx

--
Best Regards,
Mark Nelson
Head of R&D (USA)

Clyso GmbH
p: +49 89 21552391 12
a: Loristraße 8 | 80335 München | Germany
w: https://clyso.com | e: mark.nelson@xxxxxxxxx

We are hiring: https://www.clyso.com/jobs/

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx