Re: Living with huge bucket sizes

Bryan, I just went through this myself, also on Hammer, since offline bucket index resharding was backported to it. I had three buckets with more than 10 million objects each, one of them with 30 million. I was experiencing the typical blocked-request issue during scrubs whenever a placement group containing a bucket index got hit.

I solved it in two steps. First, I added an SSD-only pool and moved the bucket indexes to this new pool. This is an online operation.
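
Roughly, such a move looks like this (the index pool name and ruleset id below are examples only; confirm your zone's actual index_pool first, and note that newer releases call the option crush_rule rather than crush_ruleset):

$ radosgw-admin zone get | grep index_pool                # confirm which pool holds the bucket indexes
$ ceph osd pool set .rgw.buckets.index crush_ruleset 1    # point it at a ruleset that maps only to SSD OSDs

Ceph then backfills the index objects onto the SSDs in the background, which is why the pool stays available throughout.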

After that was complete I scheduled some downtime (we run a highly available consumer-facing website) and made a plan to reshard the bucket indexes. I did some tests with buckets containing 100,000 test objects and found the performance to be satisfactory. Once my maintenance window hit and I had stopped all access to RGW, I was able to reshard all my bucket indexes in 20 minutes.
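
The reshard pass itself can be scripted; a minimal sketch, assuming radosgw-admin bucket list prints a JSON array of bucket names, jq is available, and 100 shards per bucket (with RGW stopped, per the note in the release notes quoted further down):

$ for b in $(radosgw-admin bucket list | jq -r '.[]'); do
      radosgw-admin bucket reshard --bucket="$b" --num_shards=100
  done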

I can't remember exact numbers, but I believe I resharded a 20+ million object bucket in about 5 minutes. It was extremely fast, but again, I had moved my bucket indexes to a pool of fast enterprise SSDs (three hosts, one SSD per host, Samsung 3.84 TB PM863a, for what it's worth).

Once I finished this, all my Ceph performance issues disappeared. I'll slowly upgrade my cluster with the end goal of moving to the more efficient BlueStore, but I no longer feel rushed.

Last detail: I used 100 shards per bucket, which seems to be a good compromise.
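
For scale, 100 shards on the 30-million-object bucket works out to roughly 300,000 index entries per shard, and proportionally fewer on the smaller buckets.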


Cullen
 
Date: Fri, 9 Jun 2017 14:58:41 -0700
From: Yehuda Sadeh-Weinraub <yehuda@xxxxxxxxxx>
To: Dan van der Ster <dan@xxxxxxxxxxxxxx>
Cc: "ceph-users@xxxxxxxxxxxxxx" <ceph-users@xxxxxxxxxxxxxx>
Subject: Re: Living with huge bucket sizes

On Fri, Jun 9, 2017 at 2:21 AM, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> Hi Bryan,
>
> On Fri, Jun 9, 2017 at 1:55 AM, Bryan Stillwell <bstillwell@xxxxxxxxxxx> wrote:
>> This has come up quite a few times before, but since I was only working with
>> RBD before I didn't pay too close attention to the conversation.  I'm looking
>> for the best way to handle existing clusters that have buckets with a large
>> number of objects (>20 million) in them.  The cluster I'm doing tests on is
>> currently running hammer (0.94.10), so if things got better in jewel I would
>> love to hear about it!
>> ...
>> Has anyone found a good solution for this for existing large buckets?  I
>> know sharding is the solution going forward, but afaik it can't be done
>> on existing buckets yet (although the dynamic resharding work mentioned
>> on today's performance call sounds promising).
>
> I haven't tried it myself, but 0.94.10 should have the (offline)
> resharding feature. From the release notes:
>

Right. We did add automatic dynamic resharding to Luminous, but
offline resharding should be enough.
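
For what it's worth, a minimal sketch of the knobs that govern dynamic resharding in Luminous (option names as of Luminous; check the defaults for your release):

# ceph.conf, in the RGW client section
rgw dynamic resharding = true       # automatic resharding (enabled by default in Luminous)
rgw max objs per shard = 100000     # target index entries per shard before a bucket is queued for resharding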


>> * In RADOS Gateway, it is now possible to reshard an existing bucket's index
>> using an off-line tool.
>>
>> Usage:
>>
>> $ radosgw-admin bucket reshard --bucket=<bucket_name> --num_shards=<num_shards>
>>
>> This will create a new linked bucket instance that points to the newly created
>> index objects. The old bucket instance still exists and currently it's up to
>> the user to manually remove the old bucket index objects. (Note that bucket
>> resharding currently requires that all IO (especially writes) to the specific
>> bucket is quiesced.)

Once resharding is done, use the radosgw-admin bi purge command to
remove the old bucket indexes.
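
For reference, a sketch of that cleanup (the old bucket instance id has to be looked up first, and flags may differ slightly between releases):

$ radosgw-admin metadata list bucket.instance                   # find the old <bucket>:<instance_id> entry
$ radosgw-admin bi purge --bucket=<bucket_name> --bucket-id=<old_instance_id>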

Yehuda

>
> -- Dan
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

