On Thu, Oct 17, 2019 at 2:50 AM Paul Emmerich <paul.emmerich@xxxxxxxx> wrote:
>
> On Thu, Oct 17, 2019 at 12:17 AM Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
> >
> > On Wed, Oct 16, 2019 at 2:50 PM Paul Emmerich <paul.emmerich@xxxxxxxx> wrote:
> > >
> > > On Wed, Oct 16, 2019 at 11:23 PM Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
> > > >
> > > > On Tue, Oct 15, 2019 at 8:05 AM Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
> > > > >
> > > > > On Mon, Oct 14, 2019 at 2:58 PM Paul Emmerich <paul.emmerich@xxxxxxxx> wrote:
> > > > > >
> > > > > > Could the 4 GB GET limit saturate the connection from rgw to Ceph?
> > > > > > Simple to test: just rate-limit the health check GET
> > > > >
> > > > > I don't think so, we have dual 25Gbp in a LAG, so Ceph to RGW has
> > > > > multiple paths, but we aren't balancing on port yet, so RGW to HAProxy
> > > > > is probably limited to one link.
> > > > >
> > > > > > Did you increase "objecter inflight ops" and "objecter inflight op bytes"?
> > > > > > You absolutely should adjust these settings for large RGW setups,
> > > > > > defaults of 1024 and 100 MB are way too low for many RGW setups, we
> > > > > > default to 8192 and 800MB
> > > >
> > > > On Nautilus the defaults already seem to be:
> > > > objecter_inflight_op_bytes 104857600
> > > > default
> > > = 100MiB
> > >
> > > > objecter_inflight_ops 24576
> > > > default
> > >
> > > not sure where you got this from, but the default is still 1024 even
> > > in master: https://github.com/ceph/ceph/blob/4774808cb2923f65f6919fe8be5f98917075cdd7/src/common/options.cc#L2288
> >
> > Looks like it is overridden in
> > https://github.com/ceph/ceph/blob/4774808cb2923f65f6919fe8be5f98917075cdd7/src/rgw/rgw_main.cc#L187
>
> you are right, this is new in Nautilus. Last time I had to play around
> with these settings was indeed on a Mimic deployment.
>
> > I'm just not
> > understanding how your suggestions would help, the problem doesn't
> > seem to be on the RADOS side (which it appears your tweaks target),
> > but on the HTTP side as an HTTP health check takes a long time to come
> > back when a big transfer is going on.
>
> I was guessing a bottleneck on the RADOS side because you mentioned
> that you tried both civetweb and beast, somewhat unlikely to run into
> the exact same problem with both

Looping in ceph-dev in case they have some insights into the inner
workings that may be helpful.

From what I understand, civetweb was not async and beast is, but if
beast is not coded exactly right, it could behave much like civetweb.
It seems that with beast, incoming requests are assigned to beast
threads, and a thread may be making a sync call to RADOS, which would
block the requests queued behind it until the RADOS call completes. I
tried looking through the code, but I'm not familiar with async in C++.

I can see two options that may resolve this. The first is a separate
thread pool for accessing RADOS objects, with a queue that beast
dispatches to and a callback on completion. The second is making the
RADOS calls async, so that a request waiting on RADOS yields the event
loop to another task. I couldn't tell whether either of these is
already being done, but either one should keep small IO from getting
stuck behind large IO.

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
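
To make the suspected behaviour in the message above concrete, here is a minimal sketch in Python's asyncio rather than in the actual RGW/beast C++ code. It only illustrates the general pattern and says nothing about how RGW is really implemented. Everything in it is an assumption for illustration: blocking_backend_read is a hypothetical stand-in for a synchronous storage call (it is not a librados API), the 0.3 s and 0.01 s durations are arbitrary, and a single event-loop thread stands in for a frontend thread. It times a small "health check" request that arrives just behind a large one, first with the blocking call made directly on the event loop, then with the call handed to a separate thread pool (the first option described).

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_backend_read(seconds: float) -> None:
    """Hypothetical stand-in for a synchronous storage call (not a real librados API)."""
    time.sleep(seconds)


async def handle_blocking(seconds: float) -> None:
    # The synchronous call runs on the event-loop thread itself, so nothing
    # else scheduled on this loop can make progress until it returns.
    blocking_backend_read(seconds)


async def handle_offloaded(seconds: float, pool: ThreadPoolExecutor) -> None:
    # Hand the synchronous call to a separate thread pool and resume this
    # coroutine when the worker finishes; the event loop stays free meanwhile.
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(pool, blocking_backend_read, seconds)


async def health_check_latency(offload: bool, big: float = 0.3, small: float = 0.01) -> float:
    """Queue one large request, then time a small request that arrives right behind it."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        start = time.monotonic()
        if offload:
            big_task = asyncio.create_task(handle_offloaded(big, pool))
        else:
            big_task = asyncio.create_task(handle_blocking(big))
        await asyncio.sleep(0)  # yield once so the large request starts first
        if offload:
            await handle_offloaded(small, pool)
        else:
            await handle_blocking(small)
        elapsed = time.monotonic() - start
        await big_task  # let the large request finish before the pool shuts down
    return elapsed


if __name__ == "__main__":
    print("sync call on event loop:", asyncio.run(health_check_latency(offload=False)))
    print("dispatched to thread pool:", asyncio.run(health_check_latency(offload=True)))
```

In the first run the small request cannot finish until the large one has released the loop, so its latency is at least the duration of the large call. In the second run it only waits for its own short call. The second option from the message (natively async backend calls) would have the same effect on the event loop without needing the extra pool.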