Re: Living with huge bucket sizes

Bryan, I just went through this myself, also on Hammer, since offline bucket index resharding was backported to it. I had three buckets with more than 10 million objects each, one of them with 30 million. I was experiencing the typical blocked-request issue during scrubs whenever a placement group containing a bucket index got hit.

I solved it in two steps. First, I added an SSD-only pool and moved the bucket indexes to this new pool. This is an online operation.
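
Roughly, such a move looks like this (the index pool name and ruleset id below are examples only; confirm your zone's actual index_pool first, and note that newer releases call the option crush_rule rather than crush_ruleset):

$ radosgw-admin zone get | grep index_pool                # confirm which pool holds the bucket indexes
$ ceph osd pool set .rgw.buckets.index crush_ruleset 1    # point it at a ruleset that maps only to SSD OSDs

Ceph then backfills the index objects onto the SSDs in the background, which is why the pool stays available throughout.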

After that was complete I scheduled some downtime (we run a highly available consumer-facing website) and made a plan to reshard the bucket indexes. I did some tests with buckets containing 100,000 test objects and found the performance to be satisfactory. Once my maintenance window hit and I had stopped all access to RGW, I was able to reshard all my bucket indexes in 20 minutes.
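
The reshard pass itself can be scripted; a minimal sketch, assuming radosgw-admin bucket list prints a JSON array of bucket names, jq is available, and 100 shards per bucket (with RGW stopped, per the note in the release notes quoted further down):

$ for b in $(radosgw-admin bucket list | jq -r '.[]'); do
      radosgw-admin bucket reshard --bucket="$b" --num_shards=100
  done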

I can't remember exact numbers, but I believe I resharded a 20+ million object bucket in about 5 minutes. It was extremely fast, but again, I had moved my bucket indexes to a pool of fast enterprise SSDs (three hosts, one SSD per host, Samsung 3.84 TB PM863a, for what it's worth).

Once I finished this, all my Ceph performance issues disappeared. I'll slowly upgrade my cluster with the end goal of moving to the more efficient BlueStore, but I no longer feel rushed.

Last detail: I used 100 shards per bucket, which seems to be a good compromise.
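
For scale, 100 shards on the 30-million-object bucket works out to roughly 300,000 index entries per shard, and proportionally fewer on the smaller buckets.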


Cullen
 
Date: Fri, 9 Jun 2017 14:58:41 -0700
From: Yehuda Sadeh-Weinraub <yehuda@xxxxxxxxxx>
To: Dan van der Ster <dan@xxxxxxxxxxxxxx>
Cc: "ceph-users@xxxxxxxxxxxxxx" <ceph-users@xxxxxxxxxxxxxx>
Subject: Re: Living with huge bucket sizes

On Fri, Jun 9, 2017 at 2:21 AM, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> Hi Bryan,
>
> On Fri, Jun 9, 2017 at 1:55 AM, Bryan Stillwell <bstillwell@xxxxxxxxxxx> wrote:
>> This has come up quite a few times before, but since I was only working with
>> RBD before I didn't pay too close attention to the conversation.  I'm looking
>> for the best way to handle existing clusters that have buckets with a large
>> number of objects (>20 million) in them.  The cluster I'm doing tests on is
>> currently running hammer (0.94.10), so if things got better in jewel I would
>> love to hear about it!
>> ...
>> Has anyone found a good solution for this for existing large buckets?  I
>> know sharding is the solution going forward, but afaik it can't be done
>> on existing buckets yet (although the dynamic resharding work mentioned
>> on today's performance call sounds promising).
>
> I haven't tried it myself, but 0.94.10 should have the (offline)
> resharding feature. From the release notes:
>

Right. We did add automatic dynamic resharding to Luminous, but
offline resharding should be enough.
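
For what it's worth, a minimal sketch of the knobs that govern dynamic resharding in Luminous (option names as of Luminous; check the defaults for your release):

# ceph.conf, in the RGW client section
rgw dynamic resharding = true       # automatic resharding (enabled by default in Luminous)
rgw max objs per shard = 100000     # target index entries per shard before a bucket is queued for resharding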


>> * In RADOS Gateway, it is now possible to reshard an existing bucket's index
>> using an off-line tool.
>>
>> Usage:
>>
>> $ radosgw-admin bucket reshard --bucket=<bucket_name> --num_shards=<num_shards>
>>
>> This will create a new linked bucket instance that points to the newly created
>> index objects. The old bucket instance still exists and currently it's up to
>> the user to manually remove the old bucket index objects. (Note that bucket
>> resharding currently requires that all IO (especially writes) to the specific
>> bucket is quiesced.)

Once resharding is done, use the radosgw-admin bi purge command to
remove the old bucket indexes.
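
For reference, a sketch of that cleanup (the old bucket instance id has to be looked up first, and flags may differ slightly between releases):

$ radosgw-admin metadata list bucket.instance                   # find the old <bucket>:<instance_id> entry
$ radosgw-admin bi purge --bucket=<bucket_name> --bucket-id=<old_instance_id>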

Yehuda

>
> -- Dan
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

