Mark,

Yesterday I blew away all the objects and restarted my test using
multiple buckets, and things are definitely better!  After ~20 hours
I've already uploaded ~3.5 million objects, which is much better than
the ~1.5 million I did over ~96 hours this past weekend.

Unfortunately it seems that things have slowed down a bit.  The
average upload rate over those first 20 hours was ~48 objects/second,
but now I'm only seeing ~20 objects/second.  This is with 18,836
buckets.

Bryan

On Wed, Sep 4, 2013 at 12:43 PM, Bryan Stillwell <bstillwell@xxxxxxxxxxxxxxx> wrote:
> So far I haven't seen much of a change.  It's still working through
> removing the bucket that reached 1.5 million objects though (my guess
> is that'll take a few more days), so I believe that might have
> something to do with it.
>
> Bryan
>
>
> On Wed, Sep 4, 2013 at 12:14 PM, Mark Nelson <mark.nelson@xxxxxxxxxxx> wrote:
>>
>> Bryan,
>>
>> Good explanation.  How's performance now that you've spread the load
>> over multiple buckets?
>>
>> Mark
>>
>> On 09/04/2013 12:39 PM, Bryan Stillwell wrote:
>>>
>>> Bill,
>>>
>>> I've run into a similar issue with objects averaging ~100KiB.  The
>>> explanation I received on IRC is that there are scaling issues if
>>> you're uploading them all to the same bucket, because the bucket
>>> index isn't sharded.  The recommended solution is to spread the
>>> objects out across a lot of buckets.  However, that ran me into
>>> another issue once I hit 1,000 buckets, which is a per-user limit.
>>> I switched the limit to unlimited with this command:
>>>
>>> radosgw-admin user modify --uid=your_username --max-buckets=0
>>>
>>> Bryan
>>>
>>>
>>> On Wed, Sep 4, 2013 at 11:27 AM, Bill Omer <bill.omer@xxxxxxxxx
>>> <mailto:bill.omer@xxxxxxxxx>> wrote:
>>>
>>> I'm testing Ceph for storing a very large number of small files.
>>> I'm seeing some performance issues and would like to see if anyone
>>> can offer any insight into what I could do to correct this.
>>>
>>> Some numbers:
>>>
>>> I uploaded 184,111 files, with an average file size of 5KB, using
>>> 10 separate servers to drive the uploads with Python and the
>>> cloudfiles module.  I stopped uploading after 53 minutes, which
>>> works out to about 5.7 files per second per node.
>>>
>>> My storage cluster consists of 21 OSDs across 7 servers, with their
>>> journals written to SSD drives.  I've done a default installation
>>> using ceph-deploy with the dumpling release.
>>>
>>> I'm using statsd to monitor performance, and what's interesting is
>>> that when I start with an empty bucket, performance is amazing, with
>>> average response times of 20-50ms.  However, as time goes on, the
>>> response times climb into the hundreds of milliseconds, and the
>>> average number of uploads per second drops.
>>>
>>> I've installed radosgw on all 7 Ceph servers.  I've tested using a
>>> load balancer to distribute the API calls, as well as pointing the
>>> 10 worker servers at a single instance.  I've not seen a real
>>> difference in performance either way.
>>>
>>> Each of the Ceph servers is a 16-core Xeon 2.53GHz with 72GB of RAM,
>>> OCZ Vertex4 SSD drives for the journals, and Seagate Barracuda ES2
>>> drives for storage.
>>>
>>> Any help would be greatly appreciated.
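A minimal sketch of the bucket-spreading approach described above, assuming
the python-cloudfiles API the upload workers already use; the bucket count,
naming scheme, credentials, and auth URL are illustrative, not from the
thread:

    import hashlib
    import cloudfiles  # python-cloudfiles, as used by the upload workers

    NUM_BUCKETS = 1024  # illustrative; enough buckets to keep each index small

    def bucket_for(key):
        # Hash the object name so uploads spread evenly across buckets,
        # keeping each (unsharded) bucket index small.
        h = int(hashlib.md5(key.encode('utf-8')).hexdigest(), 16)
        return 'shard-%04d' % (h % NUM_BUCKETS)

    # Hypothetical credentials/endpoint for a radosgw Swift-compatible API.
    conn = cloudfiles.get_connection(username='your_username',
                                     api_key='your_api_key',
                                     authurl='http://radosgw.example.com/auth')

    def upload(key, data):
        # create_container issues a PUT, so it is safe to call even if
        # the container already exists.
        container = conn.create_container(bucket_for(key))
        obj = container.create_object(key)
        obj.write(data)

Note that with this many buckets the default per-user limit of 1,000 is hit
quickly, hence the --max-buckets=0 change mentioned above.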