On 2018-03-22 22:17, Yehuda Sadeh-Weinraub wrote:
On Thu, Mar 22, 2018 at 12:09 PM, Casey Bodley <cbodley@xxxxxxxxxx> wrote:
One of the benefits of the asynchronous beast frontend in radosgw is
that it allows us to do things like request throttling and priority
queuing that would otherwise block frontend threads, which are a scarce
resource in civetweb's thread-per-connection model.
The primary goal of this project is to prevent large object data
workloads from starving out cheaper requests. After some discussion in
the Ann Arbor office, our resident dmclock expert Eric Ivancich
convinced us that mclock was a good fit. I've spent the week exploring a
design for this, and wanted to get some early feedback:
Each HTTP request would be assigned a request class (dmclock calls them
clients) and a cost.
The four initial request classes:
- auth: requests for swift auth tokens, and eventually sts
- admin: admin APIs for use by the dashboard and multisite sync
- data: object io
- metadata: everything else, such as bucket operations, object stat, etc.
Calculating a cost is difficult, especially for the two major cases
where we'd want it: object GET requests (because we have to check with
RADOS before we know its actual size), and object PUT requests that use
chunked transfer-encoding. I'd love to hear ideas for this, but for now
I think it's good enough to assign everything a cost of 1 so that all of
the units are in requests/sec. I believe this is what the osd is doing
now as well?
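To make the "cost of 1 means requests/sec" point concrete, here is a minimal sketch of how mclock-style tags could advance per request class. It is simplified: it ignores dmclock's distributed delta/rho inputs and idle-client adjustments, assumes all rates are positive, and every name in it (ClientInfo, Tags, next_tag, next_tags) is hypothetical rather than the dmclock library's API.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical per-class QoS parameters (not the dmclock library's types).
struct ClientInfo {
  double reservation;  // guaranteed rate; requests/sec when cost is 1
  double weight;       // proportional share of spare capacity
  double limit;        // maximum rate; requests/sec when cost is 1
};

// The three tags mclock keeps for each request.
struct Tags {
  double reservation;
  double proportion;
  double limit;
};

// Space consecutive requests cost/rate apart, but never behind 'now'.
inline double next_tag(double prev, double rate, double cost, double now) {
  assert(rate > 0.0);  // assumption: zero/unset rates are not handled here
  return std::max(prev + cost / rate, now);
}

// Compute the tags for the next request of one class.
inline Tags next_tags(const Tags& prev, const ClientInfo& info,
                      double cost, double now) {
  return Tags{next_tag(prev.reservation, info.reservation, cost, now),
              next_tag(prev.proportion, info.weight, cost, now),
              next_tag(prev.limit, info.limit, cost, now)};
}
```

With a cost of 1, a class whose reservation is 4 gets reservation tags spaced 0.25 seconds apart, i.e. 4 requests/sec. A larger cost for an expensive request would simply push that class's next tags further out.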
That does sound like the simpler solution, and it should be a good
enough starting point. What if we could integrate it in a much lower
layer, e.g., into librados?
New virtual functions in class RGWOp seem like a good way for the
derived Ops to return their request class and cost. Once we know those,
we can add ourselves to the mclock priority queue and do an async wait
until it's our turn to run.
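A rough sketch of the virtual-function idea, using stand-in classes rather than the real RGWOp hierarchy. The names Op, GetObjOp, SwiftAuthOp, CreateBucketOp, QueueEntry and classify are all hypothetical, and the async wait on the queue is left out:

```cpp
#include <cassert>

// The four initial request classes from the proposal.
enum class RequestClass { auth, admin, data, metadata };

// Hypothetical stand-in for the op base class; not the real RGWOp.
class Op {
 public:
  virtual ~Op() = default;
  // "Everything else" lands in the metadata class by default.
  virtual RequestClass request_class() const { return RequestClass::metadata; }
  // A flat cost of 1 keeps all units in requests/sec.
  virtual double cost() const { return 1.0; }
};

class GetObjOp : public Op {
 public:
  RequestClass request_class() const override { return RequestClass::data; }
};

class SwiftAuthOp : public Op {
 public:
  RequestClass request_class() const override { return RequestClass::auth; }
};

// Inherits the metadata default.
class CreateBucketOp : public Op {};

// What would be handed to the priority queue for one request.
struct QueueEntry {
  RequestClass klass;
  double cost;
};

inline QueueEntry classify(const Op& op) {
  const double cost = op.cost();
  assert(cost > 0.0);  // a non-positive cost would make scheduling tags meaningless
  return QueueEntry{op.request_class(), cost};
}
```

Keeping the defaults in the base class means only the ops that differ from "metadata, cost 1" need an override, so a later per-op cost estimate could be added one op at a time.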
But where exactly does this step fit into the request processing
pipeline? Does it happen before or after authentication/authorization?
I'm leaning towards after, so that auth failures get filtered out before
they enter the queue.
What about the situation where you have a bad actor flooding with
badly authenticated requests?
For non-admin requests, maybe we could use the user parameter to start
increasing the cost associated with the user as more requests start to
pile up (though this isn't strictly affected by before/after
authentication, as we populate the user info before that anyway).
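One possible shape for that per-user escalation, purely as a sketch: count each user's outstanding requests and charge more once the backlog passes a threshold. The class name UserCostTracker, the free_requests threshold and the linear penalty are all assumptions for illustration, not anything that exists in radosgw:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical tracker: cost grows linearly with a user's queued backlog.
class UserCostTracker {
 public:
  UserCostTracker(std::size_t free_requests, double penalty)
      : free_requests_(free_requests), penalty_(penalty) {
    assert(penalty >= 0.0);
  }

  // Record a newly queued request and return the cost to charge for it.
  double enqueue(const std::string& user) {
    const std::size_t queued = pending_[user]++;  // backlog before this one
    if (queued < free_requests_) {
      return 1.0;  // within the allowance: the flat cost of 1
    }
    const std::size_t excess = queued - free_requests_ + 1;
    return 1.0 + penalty_ * static_cast<double>(excess);
  }

  // Record that one of the user's requests has finished or failed.
  void complete(const std::string& user) {
    auto it = pending_.find(user);
    if (it == pending_.end()) {
      return;
    }
    if (--it->second == 0) {
      pending_.erase(it);  // forget users with no backlog
    }
  }

 private:
  std::size_t free_requests_;
  double penalty_;
  std::unordered_map<std::string, std::size_t> pending_;
};
```

Because the count is keyed on the user name taken from the request, it would also rise for a flood of badly authenticated requests that claim the same user, whichever side of authentication the queue ends up on.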
The priority queue can use perf counters for introspection, and a config
observer to apply changes to the per-client mclock options.
As future work, we could add some load balancer integration to:
- enable custom scripts that look at incoming requests and assign their
  own request class/cost
- track distributed client stats across gateways, and feed that info
  back into radosgw with each request (this is the d in dmclock)
Thanks,
Casey
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html