Thanks Matt and Eric,

Sorry for the basic question, but how can I, as a Ceph operator, tell
whether a bucket is versioned? And to fix the current situation, should I
wait for the fix and then reshard? (We want to reshard this bucket anyway,
because with 512 shards listing performance is far too slow for the user.)

-- Dan

On Thu, Oct 1, 2020 at 4:36 PM Eric Ivancich <ivancich@xxxxxxxxxx> wrote:
>
> Hi Matt and Dan,
>
> I too suspect it’s the issue Matt linked to. That bug only affects
> versioned buckets, so I’m guessing your bucket is versioned, Dan.
>
> This bug is triggered when the final instance of an object in a versioned
> bucket is deleted, but for reasons we do not yet understand, the object is
> not fully removed from the bucket index. A subsequent reshard then moves
> that part of the object index to shard 0.
>
> Upgrading to a version that includes Casey’s fix would mean this situation
> is not re-created in the future.
>
> An automated clean-up is non-trivial but feasible. It would have to take
> into account that an object with the same name as the previously deleted
> one might have been re-created in the versioned bucket.
>
> Eric
>
> > On Oct 1, 2020, at 8:46 AM, Matt Benjamin <mbenjami@xxxxxxxxxx> wrote:
> >
> > Hi Dan,
> >
> > Possibly you're reproducing https://tracker.ceph.com/issues/46456.
> >
> > That explains how the underlying issue works; I don't remember how a
> > bucket exhibiting this is repaired.
> >
> > Eric?
> >
> > Matt
> >
> >
> > On Thu, Oct 1, 2020 at 8:41 AM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> >>
> >> Dear friends,
> >>
> >> Running 14.2.11, we have one particularly large bucket with a very
> >> strange distribution of objects among the shards. The bucket has 512
> >> shards, and most shards have ~75k entries, but shard 0 has 1.75M
> >> entries:
> >>
> >> # rados -p default.rgw.buckets.index listomapkeys \
> >>     .dir.61c59385-085d-4caa-9070-63a3868dccb6.272652427.1.0 | wc -l
> >> 1752085
> >>
> >> # rados -p default.rgw.buckets.index listomapkeys \
> >>     .dir.61c59385-085d-4caa-9070-63a3868dccb6.272652427.1.1 | wc -l
> >> 78388
> >>
> >> # rados -p default.rgw.buckets.index listomapkeys \
> >>     .dir.61c59385-085d-4caa-9070-63a3868dccb6.272652427.1.2 | wc -l
> >> 78764
> >>
> >> We had resharded this bucket (manually) from 32 up to 512 shards just
> >> before upgrading from 12.2.12 to 14.2.11 a couple of weeks ago.
> >>
> >> Any idea why shard .0 is accumulating such an imbalance of entries?
> >> Should we manually reshard this bucket again?
> >>
> >> Thanks!
> >>
> >> Dan
> >> _______________________________________________
> >> ceph-users mailing list -- ceph-users@xxxxxxx
> >> To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >>
> >>
> >
> >
> > --
> >
> > Matt Benjamin
> > Red Hat, Inc.
> > 315 West Huron Street, Suite 140A
> > Ann Arbor, Michigan 48103
> >
> > http://www.redhat.com/en/technologies/storage
> >
> > tel. 734-821-5101
> > fax. 734-769-8938
> > cel. 734-216-5309
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
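
[One possible way to check whether a bucket is versioned, sketched with
placeholder names: <bucket>, <instance-id>, and <rgw-endpoint> are not from
this thread, and the flags interpretation is an assumption based on a
reading of the RGW source rather than documented behaviour.]

# Find the bucket instance id, then dump the bucket instance metadata.
# The "flags" field in bucket_info is assumed to encode versioning
# (0x2 set = versioned, 0x4 set = versioning suspended):
radosgw-admin bucket stats --bucket=<bucket>        # note the "id" field
radosgw-admin metadata get bucket.instance:<bucket>:<instance-id>

# Or, with the bucket owner's S3 credentials, ask over the S3 API:
aws --endpoint-url http://<rgw-endpoint> s3api get-bucket-versioning --bucket <bucket>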