Hi Abhishek

> On 2 Nov 2020, at 14:54, Abhishek Lekshmanan <abhishek@xxxxxxxx> wrote:
>
> There isn't much in terms of code changes in the scheduler from
> v15.2.4->5. Does the perf dump (`ceph daemon perf dump <client.rgw-name>`)
> on RGW socket show any throttle counts?

I know. I was wondering whether this commit might somehow have an influence, but I’m likely wrong:

https://github.com/ceph/ceph/commit/c43f71056322e1a149a444735bf65d80fec7a7ae

As for the perf counters, I don’t see anything that stands out. I dumped the current state, though I’m not sure how useful it is:

https://gist.github.com/href/a42c30e001789f005e9aa748f6f858fc

At the moment we don’t see any errors, but I already count 135 incomplete requests in the current log (out of 3 million). This number is typical: on most days we see something like 150 such requests.

Our working theory is that, out of the throttler’s maximum of 1024 outstanding requests, ~150 are lost every day to those incomplete requests, until our need for up to 400 requests per instance can no longer be met (first a few requests will be over the watermark, then more, then all).

For those incomplete requests we know that the following line is executed, producing “starting new request”:

https://github.com/ceph/ceph/blob/8f393c0fc1886a369d213d5e5791c10cb1591828/src/rgw/rgw_process.cc#L187

However, execution never reaches “req done” in the same function:

https://github.com/ceph/ceph/blob/master/src/rgw/rgw_process.cc#L350

That entry and the “beast” entry are both missing for those few requests.
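
To put rough numbers on that theory, here is a back-of-the-envelope sketch in Python. It assumes a constant leak of 150 slots per day and that a leaked slot is never returned until the instance restarts; neither is confirmed. The function and parameter names are made up for illustration and are not Ceph options.

```python
def days_until_short(max_outstanding, peak_demand, leaked_per_day):
    """Days until the free throttle slots drop below peak demand.

    Assumes a constant leak rate and that leaked slots are never
    released (both are assumptions, not confirmed behaviour).
    """
    headroom = max_outstanding - peak_demand
    if headroom <= 0:
        return 0.0
    return headroom / leaked_per_day


def days_until_empty(max_outstanding, leaked_per_day):
    """Days until every throttle slot has been leaked."""
    return max_outstanding / leaked_per_day


# Numbers from this thread: 1024 slots, up to 400 needed, ~150 lost per day.
first_rejections = days_until_short(1024, 400, 150)
all_rejected = days_until_empty(1024, 150)
print(f"first requests over the watermark after ~{first_rejections:.1f} days")
print(f"all requests over the watermark after ~{all_rejected:.1f} days")
```

With these inputs the first rejections would be expected after roughly four days and a complete stall a little before day seven, so the theory can be checked against the time between a restart and the first errors.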

Cheers,
Denis