Re: Performance issues with small files

On 09/05/2013 09:19 AM, Bill Omer wrote:
That's correct.  We created 65k buckets, using two hex characters as the
naming convention, then stored the files in each container based on the
first two characters of the file name.  The end result was 20-50 files
per bucket.  Once all of the buckets were created and files were being
loaded, we still observed an increase in latency over time.
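
As a rough illustration of that scheme, a minimal sketch in Python (the
prefix width of 4 hex digits here is an assumption: 4 digits give the
~65k buckets mentioned, whereas 2 would only give 256):

    def bucket_for(filename):
        # Route each file to the bucket named after its leading hex digits.
        # A width of 4 yields 16**4 = 65536 possible buckets.
        return filename[:4].lower()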

Is there a way to disable indexing?  Or are there other settings you can
suggest to attempt to speed this process up?

There's been some talk recently about indexless buckets, but I don't think it's possible right now. Yehuda can probably talk about it.

If you remove objects from the bucket so it is empty, does it speed up again? Anything you can tell us about when and how it slows down would be very useful!

Mark



On Wed, Sep 4, 2013 at 5:21 PM, Mark Nelson <mark.nelson@xxxxxxxxxxx> wrote:

    Just for clarification, distributing objects over lots of buckets
    isn't helping improve small object performance?

    The degradation over time is similar to something I've seen in the
    past, with higher numbers of seeks on the underlying OSD device over
    time.  Is it always (temporarily) resolved by writing to a new empty
    bucket?

    Mark


    On 09/04/2013 02:45 PM, Bill Omer wrote:

        We've actually done the same thing, creating 65k buckets and storing
        20-50 objects in each.  No real change; nothing noticeable anyway.


        On Wed, Sep 4, 2013 at 2:43 PM, Bryan Stillwell
        <bstillwell@xxxxxxxxxxxxxxx> wrote:

             So far I haven't seen much of a change.  It's still working
             through removing the bucket that reached 1.5 million objects
             though (my guess is that'll take a few more days), so I believe
             that might have something to do with it.

             Bryan


             On Wed, Sep 4, 2013 at 12:14 PM, Mark Nelson
             <mark.nelson@xxxxxxxxxxx> wrote:

                 Bryan,

                 Good explanation.  How's performance now that you've
                 spread the load over multiple buckets?

                 Mark

                 On 09/04/2013 12:39 PM, Bryan Stillwell wrote:

                     Bill,

                     I've run into a similar issue with objects averaging
                     ~100KiB.  The explanation I received on IRC is that
                     there are scaling issues if you're uploading them all
                     to the same bucket, because the index isn't sharded.
                     The recommended solution is to spread the objects out
                     over a lot of buckets.  However, that ran me into
                     another issue once I hit 1000 buckets, which is a
                     per-user limit.  I switched the limit to be unlimited
                     with this command:

                     radosgw-admin user modify --uid=your_username --max-buckets=0
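
                     (If it helps, I believe the change can be confirmed
                     afterwards with:

                     radosgw-admin user info --uid=your_username

                     which prints the user's metadata, including a
                     max_buckets field.)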

                     Bryan


                     On Wed, Sep 4, 2013 at 11:27 AM, Bill Omer
                     <bill.omer@xxxxxxxxx> wrote:

                          I'm testing ceph for storing a very large number
                          of small files.  I'm seeing some performance
                          issues and would like to see if anyone could
                          offer any insight as to what I could do to
                          correct this.

                          Some numbers:

                          Uploaded 184111 files, with an average file size
                          of 5KB, using 10 separate servers to run the
                          uploads with Python and the cloudfiles module.
                          I stopped uploading after 53 minutes, which
                          seems to average 5.7 files per second per node.
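
                          Each worker does roughly the following (a
                          minimal sketch only; the auth endpoint and
                          credentials below are placeholders, not our
                          real ones):

                              import cloudfiles

                              # Swift-compatible auth against radosgw
                              conn = cloudfiles.get_connection(
                                  username='test:user', api_key='secret',
                                  authurl='http://radosgw.example.com/auth')

                              # one of the pre-created prefix buckets
                              container = conn.create_container('ab')
                              obj = container.create_object('ab-00001.dat')
                              obj.load_from_filename('/tmp/ab-00001.dat')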


                          My storage cluster consists of 21 OSDs across 7
                          servers, with their journals written to SSD
                          drives.  I've done a default installation, using
                          ceph-deploy with the dumpling release.

                          I'm using statsd to monitor the performance, and
                          what's interesting is that when I start with an
                          empty bucket, performance is amazing, with
                          average response times of 20-50ms.  However, as
                          time goes on, the response times go into the
                          hundreds, and the average number of uploads per
                          second drops.
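
                          The measurement itself is simple (a sketch using
                          the Python statsd client; host, port, and metric
                          name are placeholders):

                              import time
                              import statsd

                              sc = statsd.StatsClient('localhost', 8125)

                              def timed_upload(container, name, path):
                                  # time one PUT, report it in milliseconds
                                  start = time.time()
                                  obj = container.create_object(name)
                                  obj.load_from_filename(path)
                                  sc.timing('radosgw.upload',
                                            (time.time() - start) * 1000)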

                          I've installed radosgw on all 7 ceph servers.
                          I've tested using a load balancer to distribute
                          the API calls, as well as pointing the 10 worker
                          servers to a single instance.  I've not seen a
                          real difference in performance with either.


                          Each of the ceph servers is a 16-core Xeon
                          2.53GHz with 72GB of RAM, with OCZ Vertex4 SSD
                          drives for the journals and Seagate Barracuda
                          ES2 drives for storage.


                          Any help would be greatly appreciated.







                     --
                     Bryan Stillwell
                     SENIOR SYSTEM ADMINISTRATOR
                     Photobucket <http://photobucket.com>

                     E: bstillwell@xxxxxxxxxxxxxxx
                     O: 303.228.5109
                     M: 970.310.6085

                     Facebook <http://www.facebook.com/photobucket>
                     Twitter <http://twitter.com/photobucket>
                     Photobucket <http://photobucket.com/images/photobucket>










             --
             Bryan Stillwell
             SENIOR SYSTEM ADMINISTRATOR
             Photobucket <http://photobucket.com>

             E: bstillwell@xxxxxxxxxxxxxxx
             O: 303.228.5109
             M: 970.310.6085

             Facebook <http://www.facebook.com/photobucket>
             Twitter <http://twitter.com/photobucket>
             Photobucket <http://photobucket.com/images/photobucket>








_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



