Re: have buckets with low number of shards

mahnoosh shahidi <mahnooosh.shd@xxxxxxxxx> · Tue, 23 Nov 2021 20:40:20 +0330

Hi Dominic,

Thanks for the explanation, but I didn't mean the bucket lock that happens
during the reshard. My problem is that when a bucket holds about 500M
objects or more, deleting the old RADOS objects during the reshard causes
slow ops, which lead to OSD failures, so we experience downtime across the
whole cluster, not only in the resharded bucket.
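
For a rough sense of scale, the 100K objects per index shard figure quoted
further down in this thread can be turned into a minimum shard count. This is
only a sizing sketch: the function name `min_shards` and its default are
illustrative assumptions, and, as noted in the quoted reply, the workable
objects-per-shard value depends on hardware and workload.

```python
# Sizing sketch only: estimates how many index shards a bucket would need
# to stay at or below a target number of objects per shard.
# OBJECTS_PER_SHARD is the guideline quoted in this thread, not a hard limit.
OBJECTS_PER_SHARD = 100_000


def min_shards(num_objects: int, objects_per_shard: int = OBJECTS_PER_SHARD) -> int:
    """Return the smallest shard count keeping each shard at or under the target."""
    if num_objects < 0:
        raise ValueError("num_objects must be non-negative")
    if objects_per_shard <= 0:
        raise ValueError("objects_per_shard must be positive")
    # Integer ceiling division; a bucket always has at least one shard.
    return max(1, -(-num_objects // objects_per_shard))


if __name__ == "__main__":
    for objects in (500_000_000, 800_000_000):
        print(f"{objects:>11,} objects -> at least {min_shards(objects):,} shards")
```

By this estimate the 500M and 800M object buckets discussed here would need
several thousand shards each, which is why they are so far past the guideline.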

Thanks

On Tue, Nov 23, 2021, 7:43 PM <DHilsbos@xxxxxxxxxxxxxx> wrote:

> Mahnoosh;
>
> You can't reshard a bucket without downtime.  During a reshard RGW creates
> new RADOS objects to match the new shard number.  Then all the RGW objects
> are moved from the old RADOS objects to the new RADOS objects, and the
> original RADOS objects are destroyed.  The reshard locks the bucket for the
> duration.
>
> Thank you,
>
> Dominic L. Hilsbos, MBA
> Vice President - Information Technology
> Perform Air International Inc.
> DHilsbos@xxxxxxxxxxxxxx
> www.PerformAir.com
>
>
> -----Original Message-----
> From: mahnoosh shahidi [mailto:mahnooosh.shd@xxxxxxxxx]
> Sent: Tuesday, November 23, 2021 8:20 AM
> To: Josh Baergen
> Cc: Ceph Users
> Subject:  Re: have buckets with low number of shards
>
> Hi Josh
>
> Thanks for your response. Do you have any advice on how to reshard these
> big buckets without causing downtime in our cluster? Resharding these
> buckets causes a lot of slow ops in the phase that deletes the old shards,
> and the cluster can't respond to any requests until resharding is
> completely done.
>
> Regards,
> Mahnoosh
>
> On Tue, Nov 23, 2021, 5:28 PM Josh Baergen <jbaergen@xxxxxxxxxxxxxxxx>
> wrote:
>
> > Hey Mahnoosh,
> >
> > > We are running a cluster on Octopus 15.2.12. We have a big bucket with
> > > about 800M objects, and resharding this bucket causes many slow ops on
> > > our bucket index OSDs. I want to know what happens if I don't reshard
> > > this bucket any more. How does that affect performance? Would the
> > > performance problem be limited to that bucket, or would it affect the
> > > entire bucket index pool?
> >
> > Unfortunately, if you don't reshard the bucket, it's likely that
> > you'll see widespread index pool performance and stability issues,
> > generally manifesting as one or more OSDs becoming very busy to the
> > point of holding up traffic for multiple buckets or even flapping (the
> > OSD briefly gets marked down), leading to recovery. Recovering large
> > index shards can itself cause issues like this to occur. Although the
> > official recommendation, IIRC, is 100K objects per index shard, the
> > exact objects per shard count at which one starts to experience these
> > sorts of issues highly depends on the hardware involved and user
> > workload.
> >
> > Josh
> >
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
>