Re: [ceph-users] RGW blocking on large objects

Robert LeBlanc <robert@xxxxxxxxxxxxx> · Thu, 17 Oct 2019 13:00:28 -0700

On Thu, Oct 17, 2019 at 11:46 AM Casey Bodley <cbodley@xxxxxxxxxx> wrote:
>
>
> On 10/17/19 12:59 PM, Robert LeBlanc wrote:
> > On Thu, Oct 17, 2019 at 9:22 AM Casey Bodley <cbodley@xxxxxxxxxx> wrote:
> >
> >> With respect to this issue, civetweb and beast should behave the same.
> >> Both frontends have a large thread pool, and their calls to
> >> process_request() run synchronously (including blocking on rados
> >> requests) on a frontend thread. So once there are more concurrent client
> >> connections than there are frontend threads, new connections will block
> >> until there's a thread available to service them.
> > Okay, this really helps me understand what's going on here. Are there
> > plans to remove the synchronous calls and make them async, or to
> > improve this flow a bit?
>
> Absolutely yes, this work has been in progress for a long time now, and
> octopus does get a lot of concurrency here. Eventually, all of
> process_request() will be async-enabled, and we'll be able to run beast
> with a much smaller thread pool.

This is great news. Is there anything we can do to help with this
effort? It is very important to us.
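
To make sure I have the model right, here is a minimal sketch of the
blocking behaviour described above: a fixed pool of frontend threads,
each held for the full duration of a request. It is not RGW code. The
function name, the FIFO dispatch, the "everything arrives at t=0"
workload and the millisecond figures are all assumptions made for
illustration.

```python
import heapq

def simulate_pool(service_times_ms, pool_size):
    """Return how long each request waits for a free thread, in ms.

    Assumptions (illustrative only, not RGW behaviour): all requests
    arrive at t=0 and are handed FIFO to whichever thread frees up first.
    """
    # Min-heap of the times at which each worker thread becomes free.
    free_at = [0] * pool_size
    heapq.heapify(free_at)
    waits = []
    for service in service_times_ms:
        start = heapq.heappop(free_at)      # earliest available thread
        waits.append(start)                 # arrival is t=0, so wait == start
        heapq.heappush(free_at, start + service)
    return waits

# Hypothetical workload: two large objects (10 s each) ahead of four
# small ones (100 ms each).
workload = [10000, 10000, 100, 100, 100, 100]
print(simulate_pool(workload, pool_size=2))  # [0, 0, 10000, 10000, 10100, 10100]
print(simulate_pool(workload, pool_size=4))  # [0, 0, 0, 0, 100, 100]
```

With two threads the small requests sit behind the large ones for the
full 10 seconds; with four threads they start immediately or after one
small request. That matches the point that the thread pool size is what
determines when small IO gets stuck.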

> > Currently I'm seeing 1024 max concurrent ops and a thread pool of 512.
> > Does this mean that, with evenly distributed requests, one op could be
> > processing on the backend RADOS with another queued behind it, waiting?
> > Are requests dispatched round-robin, so that with 99% small IO, one
> > very long RADOS request can leave many IOs blocked behind it in the
> > thread pool? (I assume the latter is what I'm seeing.)
> >
> > rgw_max_concurrent_requests                                1024
> > rgw_thread_pool_size                                       512
> >
> > If I match the two, do you think it would help prevent small IO from
> > being blocked by larger IO?
> rgw_max_concurrent_requests was added in support of the beast/async
> work, precisely because (post-Nautilus) the number of beast threads will
> no longer limit the number of concurrent requests. This variable is what
> throttles incoming requests to prevent radosgw's resource consumption
> from ballooning under heavy workload. And unlike the existing model
> where a request remains in the queue until a thread is ready to service
> it, any requests that exceed rgw_max_concurrent_requests will be
> rejected with '503 SlowDown' in s3 or '498 Rate Limited' in swift.
>
> With respect to prioritization, there isn't any by default but we do
> have a prototype request scheduler that uses dmclock to prioritize
> requests based on some hard-coded request classes. It's not especially
> useful in its current form, but we do have plans to further elaborate
> the classes and eventually pass the information down to osds for
> integrated QOS.
>
> As of nautilus, though, the thread pool size is the only effective knob
> you have.

Do you see any problems with running 2k-4k threads if we have the RAM to do so?
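
For reference, this is how I read the difference between the current
queueing model and the rgw_max_concurrent_requests throttle you
describe. It is a rough sketch only: the class and method names are
hypothetical and this is not how radosgw implements it. Only the status
codes come from your description.

```python
class RequestThrottle:
    """Reject, rather than queue, requests beyond a concurrency limit."""

    def __init__(self, max_concurrent):
        self.max_concurrent = max_concurrent
        self.in_flight = 0

    def try_start(self, api="s3"):
        """Return None if the request is admitted, else (status, reason)."""
        if self.in_flight >= self.max_concurrent:
            # Over the limit: the client gets an error immediately
            # instead of waiting in a queue for a free thread.
            if api == "s3":
                return (503, "SlowDown")
            return (498, "Rate Limited")  # swift
        self.in_flight += 1
        return None

    def finish(self):
        """Mark one admitted request as complete."""
        self.in_flight -= 1


throttle = RequestThrottle(max_concurrent=2)
print(throttle.try_start())         # None: admitted
print(throttle.try_start())         # None: admitted
print(throttle.try_start())         # (503, 'SlowDown')
print(throttle.try_start("swift"))  # (498, 'Rate Limited')
throttle.finish()
print(throttle.try_start())         # None: a slot freed up
```

If that reading is right, clients would need to handle these errors
with retries once the throttle becomes the effective limit.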

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1