Re: mclock priority queue in radosgw

Yehuda Sadeh-Weinraub <ysadehwe@xxxxxxxxxx> · Thu, 22 Mar 2018 14:17:08 -0700

On Thu, Mar 22, 2018 at 12:09 PM, Casey Bodley <cbodley@xxxxxxxxxx> wrote:
> One of the benefits of the asynchronous beast frontend in radosgw is that it
> allows us to do things like request throttling and priority queuing that
> would otherwise block frontend threads - which are a scarce resource in
> civetweb's thread-per-connection model.
>
> The primary goal of this project is to prevent large object data workloads
> from starving out cheaper requests. After some discussion in the Ann Arbor
> office, our resident dmclock expert Eric Ivancich convinced us that mclock
> was a good fit. I've spent the week exploring a design for this, and wanted
> to get some early feedback:
>
> Each HTTP request would be assigned a request class (dmclock calls them
> clients) and a cost.
>
> The four initial request classes:
> - auth: requests for swift auth tokens, and eventually sts
> - admin: admin APIs for use by the dashboard and multisite sync
> - data: object io
> - metadata: everything else, such as bucket operations, object stat, etc.
>
> Calculating a cost is difficult, especially for the two major cases where
> we'd want it: object GET requests (because we have to check with RADOS
> before we know its actual size), and object PUT requests that use chunked
> transfer-encoding. I'd love to hear ideas for this, but for now I think it's
> good enough to assign everything a cost of 1 so that all of the units are in
> requests/sec. I believe this is what the osd is doing now as well?
>

That does sound like the simpler solution, and it should be a good enough
starting point. What if we could integrate it at a much lower layer,
e.g., in librados?
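
To make the quoted proposal concrete, here is a minimal sketch of the four
request classes with a unit cost. All names (RequestClass, OpKind, classify,
request_cost) are hypothetical and not existing radosgw code; OpKind is only
a stand-in for however the ops would actually be told apart.

```cpp
#include <cstdint>

// Hypothetical names throughout; this is a sketch, not radosgw code.
// The four request classes from the proposal (dmclock calls them clients).
enum class RequestClass { auth, admin, data, metadata };

// Illustrative operation kinds, assumed only for this example.
enum class OpKind {
  swift_auth,   // request for a swift auth token
  admin_api,    // admin API call (dashboard, multisite sync)
  object_get,   // object read
  object_put,   // object write
  bucket_list,  // bucket operation
  object_stat   // object stat
};

// Map an operation kind to its request class.
constexpr RequestClass classify(OpKind kind) {
  switch (kind) {
    case OpKind::swift_auth: return RequestClass::auth;
    case OpKind::admin_api:  return RequestClass::admin;
    case OpKind::object_get:
    case OpKind::object_put: return RequestClass::data;
    default:                 return RequestClass::metadata;  // everything else
  }
}

// Every request costs 1 for now, so all units are requests/sec.
constexpr std::uint32_t request_cost(OpKind) { return 1; }
```

With a flat cost, a large object GET and a bucket listing weigh the same in
the queue, and only the class decides how they are prioritized.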

> New virtual functions in class RGWOp seem like a good way for the derived
> Ops to return their request class and cost. Once we know those, we can add
> ourselves to the mclock priority queue and do an async wait until it's our
> turn to run.
>
> But where exactly does this step fit into the request processing pipeline?
> Does it happen before or after authentication/authorization? I'm leaning
> towards after, so that auth failures get filtered out before they enter the
> queue.

What about the situation where a bad actor floods the gateway with
requests that fail authentication?
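
Coming back to the RGWOp virtual functions and the queue in the quoted text,
a rough sketch of how they could fit together follows. It is heavily
simplified: it keeps only a weight (proportional share) tag in the style of
mclock, leaves out the reservation and limit tags, and does no async wait.
Op, GetObj, WeightedQueue and the weights are assumptions for illustration;
this is neither the dmclock library API nor radosgw code.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <vector>

enum class RequestClass { auth, admin, data, metadata };

// Hypothetical stand-in for RGWOp with the two proposed virtual functions.
struct Op {
  virtual ~Op() = default;
  virtual RequestClass request_class() const { return RequestClass::metadata; }
  virtual std::uint32_t request_cost() const { return 1; }
};

// A derived op overrides only what differs from the default.
struct GetObj : Op {
  RequestClass request_class() const override { return RequestClass::data; }
};

// Simplified weight-only queue: each class keeps a running tag that advances
// by cost / weight per request, and the smallest tag is served first.
class WeightedQueue {
  struct ClassInfo { double weight = 1.0; double last_tag = 0.0; };
  struct Entry { double tag; std::uint64_t seq; const Op* op; };

  std::map<RequestClass, ClassInfo> classes_;
  std::vector<Entry> entries_;
  std::uint64_t next_seq_ = 0;

 public:
  explicit WeightedQueue(const std::map<RequestClass, double>& weights) {
    for (const auto& kv : weights) classes_[kv.first].weight = kv.second;
  }

  // Tag the request and enqueue it; 'now' is the current time in seconds.
  // Throws std::out_of_range if the op's class has no configured weight.
  void push(const Op& op, double now) {
    ClassInfo& c = classes_.at(op.request_class());
    c.last_tag = std::max(c.last_tag + op.request_cost() / c.weight, now);
    entries_.push_back({c.last_tag, next_seq_++, &op});
  }

  // Remove and return the request with the smallest tag (ties go to the
  // earlier arrival), or nullptr if the queue is empty.
  const Op* pop() {
    if (entries_.empty()) return nullptr;
    auto it = std::min_element(
        entries_.begin(), entries_.end(),
        [](const Entry& a, const Entry& b) {
          return a.tag != b.tag ? a.tag < b.tag : a.seq < b.seq;
        });
    const Op* op = it->op;
    entries_.erase(it);
    return op;
  }
};
```

With a weight of 4 for metadata and 1 for data, metadata tags advance four
times more slowly, so queued metadata requests get ahead of data requests
that arrived earlier without locking the data class out entirely.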

>
> The priority queue can use perf counters for introspection, and a config
> observer to apply changes to the per-client mclock options.
>
> As future work, we could add some load balancer integration to:
> - enable custom scripts that look at incoming requests and assign their own
> request class/cost
> - track distributed client stats across gateways, and feed that info back
> into radosgw with each request (this is the d in dmclock)
>
> Thanks,
> Casey
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html