Re: rgw index shard much larger than others

Hi Eric,

So yes, we're hit by this. We have around 1.6M entries in shard 0 with
an empty key, e.g.:

    {
        "type": "olh",
        "idx": "<80>1001_02/5f/025f8e0fc8234530d6ae7302adf682509f0f7fb68666391122e16d00bd7107e3/2018_11_14/2625203/3034777/metadata.gz",
        "entry": {
            "key": {
                "name": "",
                "instance": ""
            },
            "delete_marker": "false",
            "epoch": 11,
            "pending_log": [],
            "tag": "uhzz6da13ovbr69hhlttdjqmwic4f2v8",
            "exists": "false",
            "pending_removal": "true"
        }
    },

exists is false and pending_removal is true for all of them.
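
In case it's useful, here's a rough sketch of how one could count those
entries (assuming `jq` is available and the `bi list` output matches the
snippet above; <bucket> is a placeholder, and listing the whole index of
a bucket this size takes a while):

    radosgw-admin bi list --bucket=<bucket> | \
        jq '[.[] | select(.type == "olh" and .entry.key.name == "")] | length'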

Cheers, Dan

On Thu, Oct 1, 2020 at 11:32 PM Eric Ivancich <ivancich@xxxxxxxxxx> wrote:
>
> Hi Dan,
>
> One way to tell would be to do a:
>
> radosgw-admin bi list --bucket=<bucket>
>
> And see whether any of the output lines contain (perhaps using `grep`):
>
> "type": "olh",
>
> That would tell you if there were any versioned objects in the bucket.
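>
> For example, something along these lines (a rough sketch; the exact quoting may need adjusting for your shell):
>
>     radosgw-admin bi list --bucket=<bucket> | grep -c '"type": "olh"'
>
> A non-zero count would suggest the bucket holds (or held) versioned objects.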
>
> The “fix” we currently have only prevents this from happening in the future; we do not yet have a fix that cleans up the existing bucket index. As I mentioned, an automated clean-up is non-trivial but feasible; it would have to account for the case where an object with the same name as the previously deleted one was later re-created in the versioned bucket.
>
> I hope that’s informative, if not what you were hoping to hear.
>
> Eric
> --
> J. Eric Ivancich
>
> he / him / his
> Red Hat Storage
> Ann Arbor, Michigan, USA
>
> On Oct 1, 2020, at 10:53 AM, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>
> Thanks Matt and Eric,
>
> Sorry for the basic question, but how can I, as a Ceph operator, tell
> whether a bucket is versioned?
>
> And to fix the current situation, should I wait for the fix and then reshard?
> (We want to reshard this bucket anyway, because listing performance is far
> too slow for the user with 512 shards.)
>
> -- Dan
>
>
> On Thu, Oct 1, 2020 at 4:36 PM Eric Ivancich <ivancich@xxxxxxxxxx> wrote:
>
>
> Hi Matt and Dan,
>
> I too suspect it’s the issue Matt linked to. That bug only affects versioned buckets, so I’m guessing your bucket is versioned, Dan.
>
> This bug is triggered when the final instance of an object in a versioned bucket is deleted but, for reasons we do not yet understand, the object is not fully removed from the bucket index. A subsequent reshard then moves those leftover index entries to shard 0.
>
> Upgrading to a version that includes Casey’s fix would ensure this situation is not re-created in the future.
>
> An automated clean-up is non-trivial but feasible. It would have to account for the case where an object with the same name as the previously deleted one was later re-created in the versioned bucket.
>
> Eric
>
> On Oct 1, 2020, at 8:46 AM, Matt Benjamin <mbenjami@xxxxxxxxxx> wrote:
>
> Hi Dan,
>
> Possibly you're reproducing https://tracker.ceph.com/issues/46456.
>
> That explains how the underlying issue worked; I don't remember how a
> bucket exhibiting this is repaired.
>
> Eric?
>
> Matt
>
>
> On Thu, Oct 1, 2020 at 8:41 AM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>
>
> Dear friends,
>
> Running 14.2.11, we have one particularly large bucket with a very
> strange distribution of objects among the shards. The bucket has 512
> shards, and most shards have ~75k entries, but shard 0 has 1.75M
> entries:
>
> # rados -p default.rgw.buckets.index listomapkeys
> .dir.61c59385-085d-4caa-9070-63a3868dccb6.272652427.1.0 | wc -l
> 1752085
>
> # rados -p default.rgw.buckets.index listomapkeys
> .dir.61c59385-085d-4caa-9070-63a3868dccb6.272652427.1.1 | wc -l
> 78388
>
> # rados -p default.rgw.buckets.index listomapkeys
> .dir.61c59385-085d-4caa-9070-63a3868dccb6.272652427.1.2 | wc -l
> 78764
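>
> A sketch of checking all of the shards the same way (assuming 512 shards
> and the same bucket marker as above):
>
>     # print the omap key count for every index shard of this bucket
>     for i in $(seq 0 511); do
>         echo -n "shard $i: "
>         rados -p default.rgw.buckets.index listomapkeys \
>             ".dir.61c59385-085d-4caa-9070-63a3868dccb6.272652427.1.$i" | wc -l
>     done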
>
> We had resharded this bucket (manually) from 32 up to 512 shards just
> before upgrading from 12.2.12 to 14.2.11 a couple of weeks ago.
>
> Any idea why shard .0 is getting such an imbalance of entries?
> Should we manually reshard this bucket again?
>
> Thanks!
>
> Dan
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
>
>
>
> --
>
> Matt Benjamin
> Red Hat, Inc.
> 315 West Huron Street, Suite 140A
> Ann Arbor, Michigan 48103
>
> http://www.redhat.com/en/technologies/storage
>
> tel.  734-821-5101
> fax.  734-769-8938
> cel.  734-216-5309
>
>
>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



