Re: Radosgw scaling recommendation?

Thanks everyone for the suggestions.  Playing with all three of the
tuning knobs mentioned has greatly increased the number of client
connections an instance can handle.  We're still experimenting to
find the values that saturate our hardware.

With the values below we see around 50 reqs/s; at higher rates we
start to see some 403 responses or TCP peer resets.  However, we're
still only hitting 3-5% CPU utilization and have plenty of headroom
on other resources, so I think there's room to go higher.  There
wasn't a lot of thought put into these numbers or the relation
between them, just 'bigger'.

My analysis is that the connection resets are likely caused by too few
civetweb threads to accept requests, and the 403 responses by too few
RGW threads/handles to service the connections that do get through.

rgw_thread_pool_size = 800
civetweb num_threads = 400
rgw_num_rados_handles = 8
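
For anyone following along, here's a rough sketch of how those options
could sit together in ceph.conf.  The section name is hypothetical, and
the port and log paths are just carried over from the example earlier in
the thread, not a recommendation:

# [client.rgw.gw1] is a hypothetical instance name
[client.rgw.gw1]
# one RGW thread is needed per in-flight client request
rgw thread pool size = 800
# extra handles to the RADOS cluster
rgw num rados handles = 8
rgw frontends = civetweb port=80 enable_keep_alive=yes num_threads=400 error_log_file=/var/log/ceph/civetweb.error.log access_log_file=/var/log/ceph/civetweb.access.log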

regards,
Ben

On Thu, Feb 9, 2017 at 4:48 PM, Ben Hines <bhines@xxxxxxxxx> wrote:
> I'm curious how the num_threads option to civetweb relates to the 'rgw
> thread pool size'?  Should I make them equal?
>
> ie:
>
> rgw frontends = civetweb enable_keep_alive=yes port=80 num_threads=125
> error_log_file=/var/log/ceph/civetweb.error.log
> access_log_file=/var/log/ceph/civetweb.access.log
>
>
> -Ben
>
> On Thu, Feb 9, 2017 at 12:30 PM, Wido den Hollander <wido@xxxxxxxx> wrote:
>>
>>
>> > On 9 February 2017 at 19:34, Mark Nelson <mnelson@xxxxxxxxxx> wrote:
>> >
>> >
>> > I'm not really an RGW expert, but I'd suggest increasing the
>> > "rgw_thread_pool_size" option to something much higher than the default
>> > 100 threads if you haven't already.  RGW requires at least 1 thread per
>> > client connection, so with many concurrent connections some of them
>> > might end up timing out.  You can scale the number of threads and even
>> > the number of RGW instances on a single server, but at some point you'll
>> > run out of threads at the OS level.  Probably before that actually
>> > happens though, you'll want to think about multiple RGW gateway nodes
>> > behind a load balancer.  Afaik that's how the big sites do it.
>> >
>>
>> In addition, have you tried to use more RADOS handles?
>>
>> rgw_num_rados_handles = 8
>>
>> Try that together with more RGW threads, as Mark mentioned.
>>
>> Wido
>>
>> > I believe some folks are considering trying to migrate rgw to a
>> > threadpool/event processing model but it sounds like it would be quite a
>> > bit of work.
>> >
>> > Mark
>> >
>> > On 02/09/2017 12:25 PM, Benjeman Meekhof wrote:
>> > > Hi all,
>> > >
>> > > We're doing some stress testing with clients hitting our rados gw
>> > > nodes with simultaneous connections.  When the number of client
>> > > connections exceeds about 5400 we start seeing 403 forbidden errors
>> > > and log messages like the following:
>> > >
>> > > 2017-02-09 08:53:16.915536 7f8c667bc700 0 NOTICE: request time skew
>> > > too big now=2017-02-09 08:53:16.000000 req_time=2017-02-09
>> > > 08:37:18.000000
>> > >
>> > > This is version 10.2.5 using embedded civetweb.  There's just one
>> > > instance per node, and they all start generating 403 errors and the
>> > > above log messages when enough clients start hitting them.  The
>> > > hardware is not being taxed at all: negligible load and network
>> > > throughput.  The OSDs don't show any appreciable increase in CPU load
>> > > or I/O wait on journal/data devices.  Unless I'm missing something, it
>> > > looks like RGW is just not scaling to fill out the hardware it is
>> > > on.
>> > >
>> > > Does anyone have advice on scaling RGW to fully utilize a host?
>> > >
>> > > thanks,
>> > > Ben
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


