Re: Performance issues with small files

Yehuda Sadeh <yehuda@xxxxxxxxxxx> · Thu, 5 Sep 2013 10:14:45 -0700

On Thu, Sep 5, 2013 at 9:49 AM, Sage Weil <sage@xxxxxxxxxxx> wrote:
> On Thu, 5 Sep 2013, Bill Omer wrote:
>> That's correct.  We created 65k buckets, using two hex characters as the
>> naming convention, then stored the files in each container based on their
>> first two characters in the file name.  The end result was 20-50 files per
>> bucket.  Once all of the buckets were created and files were being loaded,
>> we still observed an increase in latency over time.
>
> This might be going too far in the opposite direction.  I would target
> 1000's of objects per bucket, not 10's.  The radosgw has to validate
> bucket ACLs on requests.  It caches them, but it probably can't cache 64K
> of them (not by default at least!).  And even if it can, it will take a
> long long time for the cache to warm up.  In any case, the end result is
> that there is probably an extra rados request on the backend for every
> request.
>
> Maybe try over ~1000 buckets and see how that goes?  And give the cache a
> bit of time to warm up?
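
One way to land on roughly a thousand buckets, sketched in Python since
that is what the uploaders already use: hash the file name and use the
hash to pick one of a fixed number of buckets.  The bucket count, the
"files-" prefix and the bucket_for() helper are made up for illustration;
nothing here is radosgw-specific.

```python
import hashlib

NUM_BUCKETS = 1024  # assumption: "~1000 buckets", rounded to a power of two


def bucket_for(filename, num_buckets=NUM_BUCKETS, prefix="files-"):
    """Map a file name to one of num_buckets bucket names (hypothetical helper)."""
    # Hash the name so that similar file names still spread evenly.
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    index = int(digest, 16) % num_buckets
    return "%s%04d" % (prefix, index)


if __name__ == "__main__":
    from collections import Counter

    # 184111 is the file count from the original post; the names are invented.
    counts = Counter(bucket_for("file-%d.dat" % i) for i in range(184111))
    print("buckets used:", len(counts))
    print("min/max objects per bucket:",
          min(counts.values()), max(counts.values()))
```

The same name always maps to the same bucket, so readers can find an
object again without a lookup table.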

There's actually a config option you can tune for this.  Try setting
something like this in your ceph.conf:

rgw cache lru size = 100000

That is 10 times the default 10k.
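
To get a feel for why the cache size matters with 65k buckets, here is a
toy LRU simulation in Python.  It is not how radosgw implements its cache,
and it assumes requests hit buckets uniformly at random; the
lru_hit_rate() helper and the request count are only for illustration.

```python
import random
from collections import OrderedDict


def lru_hit_rate(cache_size, num_keys, num_requests, seed=0):
    """Simulate an LRU cache under uniform random access; return the hit rate."""
    rng = random.Random(seed)
    cache = OrderedDict()
    hits = 0
    for _ in range(num_requests):
        key = rng.randrange(num_keys)
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used
    return hits / num_requests


if __name__ == "__main__":
    buckets = 65536
    for size in (10000, 100000):
        rate = lru_hit_rate(size, buckets, num_requests=500000)
        print("cache size %6d: hit rate %.2f" % (size, rate))
```

With 10000 entries only about 15% of the 65536 buckets fit
(10000/65536), so under uniform access most requests miss.  With 100000
entries every bucket fits, and the misses that remain are the warm-up.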

Also, I don't remember if the obvious has been stated, but how many
pgs do you have on your data and index pools?
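
For reference, the usual rule of thumb from the Ceph docs is about 100
PGs per OSD, divided by the replica count and rounded up to a power of
two.  A quick sketch; the suggested_pg_count() helper is hypothetical and
the replica counts are guesses, since the thread doesn't say what the
pools use.  Treat the result as a rough starting point only.

```python
def suggested_pg_count(num_osds, replicas, pgs_per_osd=100):
    """Rule of thumb: (OSDs * pgs_per_osd) / replicas, rounded up to a power of two."""
    target = (num_osds * pgs_per_osd) / replicas
    pg_count = 1
    while pg_count < target:
        pg_count *= 2
    return pg_count


if __name__ == "__main__":
    # 21 OSDs is from the original post; replica counts of 2 and 3 are assumptions.
    for replicas in (2, 3):
        print("replicas=%d -> %d PGs" % (replicas, suggested_pg_count(21, replicas)))
```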

Yehuda

>
> sage
>
>
>
>> Is there a way to disable indexing?  Or are there other settings you can
>> suggest to attempt to speed this process up?
>>
>>
>> On Wed, Sep 4, 2013 at 5:21 PM, Mark Nelson <mark.nelson@xxxxxxxxxxx> wrote:
>>       Just for clarification, distributing objects over lots of
>>       buckets isn't helping improve small object performance?
>>
>>       The degradation over time is similar to something I've seen in
>>       the past, with higher numbers of seeks on the underlying OSD
>>       device over time.  Is it always (temporarily) resolved by writing
>>       to a new empty bucket?
>>
>>       Mark
>>
>>       On 09/04/2013 02:45 PM, Bill Omer wrote:
>>       We've actually done the same thing, creating 65k buckets and
>>       storing 20-50 objects in each.  No change really, not noticeable
>>       anyway.
>>
>>
>>       On Wed, Sep 4, 2013 at 2:43 PM, Bryan Stillwell
>>       <bstillwell@xxxxxxxxxxxxxxx> wrote:
>>
>>     So far I haven't seen much of a change.  It's still working through
>>     removing the bucket that reached 1.5 million objects though (my
>>     guess is that'll take a few more days), so I believe that might have
>>     something to do with it.
>>
>>     Bryan
>>
>>
>>     On Wed, Sep 4, 2013 at 12:14 PM, Mark Nelson
>>     <mark.nelson@xxxxxxxxxxx> wrote:
>>
>>         Bryan,
>>
>>         Good explanation.  How's performance now that you've spread the
>>         load over multiple buckets?
>>
>>         Mark
>>
>>         On 09/04/2013 12:39 PM, Bryan Stillwell wrote:
>>
>>             Bill,
>>
>>             I've run into a similar issue with objects averaging ~100KiB.
>>             The explanation I received on IRC is that there are scaling
>>             issues if you're uploading them all to the same bucket because
>>             the index isn't sharded.  The recommended solution is to spread
>>             the objects out to a lot of buckets.  However, that ran me into
>>             another issue once I hit 1000 buckets, which is a per-user
>>             limit.  I switched the limit to be unlimited with this command:
>>
>>             radosgw-admin user modify --uid=your_username --max-buckets=0
>>
>>             Bryan
>>
>>
>>             On Wed, Sep 4, 2013 at 11:27 AM, Bill Omer
>>             <bill.omer@xxxxxxxxx> wrote:
>>
>>                  I'm testing ceph for storing a very large number of small
>>                  files.  I'm seeing some performance issues and would like
>>                  to see if anyone could offer any insight as to what I could
>>                  do to correct this.
>>
>>                  Some numbers:
>>
>>                  Uploaded 184111 files, with an average file size of 5KB,
>>                  using 10 separate servers to issue the upload requests
>>                  with Python and the cloudfiles module.  I stopped
>>                  uploading after 53 minutes, which seems to average 5.7
>>                  files per second per node.
>>
>>
>>                  My storage cluster consists of 21 OSDs across 7 servers,
>>                  with their journals written to SSD drives.  I've done a
>>                  default installation, using ceph-deploy with the dumpling
>>                  release.
>>
>>                  I'm using statsd to monitor the performance, and what's
>>                  interesting is when I start with an empty bucket,
>>                  performance is amazing, with average response times of
>>                  20-50ms.  However as time goes on, the response times go
>>                  into the hundreds, and the average number of uploads per
>>                  second drops.
>>
>>                  I've installed radosgw on all 7 ceph servers.  I've tested
>>                  using a load balancer to distribute the api calls, as well
>>                  as pointing the 10 worker servers to a single instance.
>>                  I've not seen a real difference in performance with this
>>                  either.
>>
>>
>>                  Each of the ceph servers is a 16 core Xeon 2.53GHz with
>>                  72GB of ram, OCZ Vertex4 SSD drives for the journals and
>>                  Seagate Barracuda ES2 drives for storage.
>>
>>
>>                  Any help would be greatly appreciated.
>>
>>
>>
>>                  _______________________________________________
>>                  ceph-users mailing list
>>                  ceph-users@xxxxxxxxxxxxxx
>>                  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>>
>>
>>             --
>>             Photobucket <http://photobucket.com>
>>
>>             *Bryan Stillwell*
>>             SENIOR SYSTEM ADMINISTRATOR
>>
>>             E: bstillwell@xxxxxxxxxxxxxxx
>>             O: 303.228.5109
>>             M: 970.310.6085
>>
>>             Facebook <http://www.facebook.com/photobucket>
>>             Twitter <http://twitter.com/photobucket>
>>             Photobucket <http://photobucket.com/images/photobucket>
>>
>>
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com