Re: RGW blocking on large objects

On 10/17/19 10:58 AM, Robert LeBlanc wrote:
On Thu, Oct 17, 2019 at 2:50 AM Paul Emmerich <paul.emmerich@xxxxxxxx> wrote:
On Thu, Oct 17, 2019 at 12:17 AM Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
On Wed, Oct 16, 2019 at 2:50 PM Paul Emmerich <paul.emmerich@xxxxxxxx> wrote:
On Wed, Oct 16, 2019 at 11:23 PM Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
On Tue, Oct 15, 2019 at 8:05 AM Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
On Mon, Oct 14, 2019 at 2:58 PM Paul Emmerich <paul.emmerich@xxxxxxxx> wrote:
Could the 4 GB GET limit saturate the connection from rgw to Ceph?
Simple to test: just rate-limit the health check GET
I don't think so. We have dual 25 Gbps links in a LAG, so Ceph to RGW
has multiple paths, but we aren't balancing on port yet, so RGW to
HAProxy is probably limited to one link.

Did you increase "objecter inflight ops" and "objecter inflight op bytes"?
You absolutely should adjust these settings for large RGW setups. The
defaults of 1024 and 100 MB are way too low for many RGW setups; we
default to 8192 and 800 MB.
On Nautilus the defaults already seem to be:
objecter_inflight_op_bytes    104857600    default    (= 100 MiB)
objecter_inflight_ops         24576        default
not sure where you got this from, but the default is still 1024 even
in master: https://github.com/ceph/ceph/blob/4774808cb2923f65f6919fe8be5f98917075cdd7/src/common/options.cc#L2288
Looks like it is overridden in
https://github.com/ceph/ceph/blob/4774808cb2923f65f6919fe8be5f98917075cdd7/src/rgw/rgw_main.cc#L187
you are right, this is new in Nautilus. Last time I had to play around
with these settings was indeed on a Mimic deployment.
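
For reference, the byte values discussed above can be checked with a few lines of Python. This is only a sketch: the helper name objecter_overrides is made up for illustration, and it assumes "800MB" means 800 MiB, by analogy with the 100 MB default being reported as 104857600 bytes. Where the resulting settings belong in a given deployment's configuration is not covered here.

```python
MIB = 1024 * 1024


def objecter_overrides(ops, op_mib):
    """Map the two objecter throttle options to integer values (hypothetical helper)."""
    return {
        "objecter_inflight_ops": ops,
        "objecter_inflight_op_bytes": op_mib * MIB,
    }


# 100 MiB, the byte value shown in the Nautilus output quoted above.
default_bytes = 100 * MIB

# The larger values suggested earlier in the thread, assuming MiB units.
suggested = objecter_overrides(ops=8192, op_mib=800)

for name, value in suggested.items():
    print(f"{name} = {value}")
```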

I'm just not understanding how your suggestions would help. The
problem doesn't seem to be on the RADOS side (which it appears your
tweaks target), but on the HTTP side, as an HTTP health check takes a
long time to come back when a big transfer is going on.
I was guessing a bottleneck on the RADOS side because you mentioned
that you tried both civetweb and beast, somewhat unlikely to run into
the exact same problem with both
Looping in ceph-dev in case they have some insights into the inner
workings that may be helpful.

From what I understand, civetweb was not async and beast is, but if
beast is not coded exactly right, then it could behave similarly to
civetweb.

With respect to this issue, civetweb and beast should behave the same. Both frontends have a large thread pool, and their calls to process_request() run synchronously (including blocking on rados requests) on a frontend thread. So once there are more concurrent client connections than there are frontend threads, new connections will block until there's a thread available to service them.
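
The blocking behaviour described above can be illustrated with a toy model. This is not RGW code: the pool size, the sleep standing in for a blocking rados read, and this Python process_request() are all assumptions chosen to keep the example short. It only shows that once every thread in a fixed pool is busy with a long synchronous request, a trivial health check waits for a thread to be released.

```python
import time
from concurrent.futures import ThreadPoolExecutor

FRONTEND_THREADS = 2      # hypothetical pool size, far smaller than a real RGW
LARGE_GET_SECONDS = 0.3   # stand-in for a long blocking rados read


def process_request(kind):
    """Handle one request synchronously on the calling frontend thread."""
    if kind == "large_get":
        time.sleep(LARGE_GET_SECONDS)  # blocks the thread for the whole transfer
    return kind


def health_check_latency():
    """Return how long a health check waits when every thread is busy."""
    with ThreadPoolExecutor(max_workers=FRONTEND_THREADS) as pool:
        # Occupy every frontend thread with a large transfer.
        for _ in range(FRONTEND_THREADS):
            pool.submit(process_request, "large_get")
        start = time.monotonic()
        # The health check is queued until one of the threads is released.
        pool.submit(process_request, "health_check").result()
        return time.monotonic() - start


if __name__ == "__main__":
    print(f"health check latency: {health_check_latency():.2f}s")
```

In this model the health check latency tracks the duration of the large transfers, not the cost of the health check itself.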


It seems that with beast, incoming requests are being assigned to beast
threads, and possibly it is doing a sync call to RADOS, therefore
blocking requests behind it until the RADOS call is completed. I tried
looking through the code, but I'm not familiar with async in C++. I
could see two options that may resolve this. First, have a separate
thread pool for accessing RADOS objects, with a queue that beast
dispatches to and a callback on completion. The second option is
creating async RADOS calls so that a request can yield the event loop
to another RADOS task. I couldn't tell if either of these is being
done, but that should help small IO not get stuck behind large IO.
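
The second option can be sketched with Python's asyncio, purely as an illustration of the idea and not of how beast or librados work. The request kinds and the durations below are invented; asyncio.sleep stands in for a non-blocking RADOS call that hands control back to the event loop while it waits.

```python
import asyncio
import time

# Hypothetical service times in seconds for each kind of request.
DURATIONS = {"large_get": 0.3, "small_get": 0.01}


async def process_request(kind):
    """Handle one request; awaiting lets the event loop run other requests."""
    await asyncio.sleep(DURATIONS[kind])  # models a non-blocking rados call
    return kind


async def small_request_latency(large_requests=8):
    """Return the latency of a small request while large ones are in flight."""
    transfers = [
        asyncio.create_task(process_request("large_get"))
        for _ in range(large_requests)
    ]
    await asyncio.sleep(0)  # give the large transfers a chance to start
    start = time.monotonic()
    await process_request("small_get")
    latency = time.monotonic() - start
    await asyncio.gather(*transfers)  # let the large transfers finish
    return latency


if __name__ == "__main__":
    print(f"small request latency: {asyncio.run(small_request_latency()):.3f}s")
```

Here the small request finishes in roughly its own service time even though eight large requests are outstanding, because none of them holds a thread while waiting.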

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1