Thanks for your reply, Anthony.
On 10.01.25 15:27, Anthony D'Atri wrote:
> That link
> https://docs.ceph.com/en/reef/rbd/rbd-config-ref/#qos-settings
> does have a section that describes per-image (volume) settings, which
> you should be able to enforce on the OpenStack side. OpenStack /
> libvirt do have their own IOPS and throughput throttles you can find
> in their docs. This is important so that instances don’t DoS each
> other or the entire Nova node. Be sure to increase the system-wide
> file limit on those nodes, to something like 4 million: each RBD
> attachment needs two sockets to *each* OSD node.
I think I found that one already.
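In case it is useful for the archive, this is roughly how I would
expect to apply such a per-image limit from Python (untested sketch;
the pool name, image name and the 500 iops value are just
placeholders):

    import rados
    import rbd

    # Untested sketch: cap a single image at 500 iops, equivalent to
    #   rbd config image set volumes/volume-1234 rbd_qos_iops_limit 500
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx('volumes')              # placeholder pool
        try:
            with rbd.Image(ioctx, 'volume-1234') as image:  # placeholder image
                # per-image overrides are stored as image metadata
                # with a "conf_" prefix
                image.metadata_set('conf_rbd_qos_iops_limit', '500')
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()

As far as I understand, "rbd config image set" does the same thing
under the hood.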
> For RGW you can do rate limiting:
> https://docs.ceph.com/en/latest/radosgw/adminops/#rate-limit
> Which I *think* is per-RGW granularity.
Thanks, I missed that one.
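That one also looks scriptable; if I read the docs right, it would be
along these lines (untested; the uid and the numbers are just
placeholders):

    import subprocess

    # Untested sketch: cap an RGW user's request rate.
    uid = 'some-s3-user'   # placeholder
    subprocess.run(['radosgw-admin', 'ratelimit', 'set',
                    '--ratelimit-scope=user', '--uid=' + uid,
                    '--max-read-ops=1024', '--max-write-ops=256'],
                   check=True)
    subprocess.run(['radosgw-admin', 'ratelimit', 'enable',
                    '--ratelimit-scope=user', '--uid=' + uid],
                   check=True)

As far as I can tell, each RGW daemon tracks those counters
independently, which would match your per-RGW reading.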
>> Is there anything else that I have not found so far but which is
>> about balancing individual clients across all services?
> Remember that a CephFS mount, an RBD attachment, and an RGW session
> are three different clients from Ceph’s perspective. If you actually
> mean rate limiting
(it looks like your sentence got truncated). But I was calling it QoS
and not rate limiting, because I'm thinking of "balancing" or
"rationing" the available iops between all clients. It would be ok for a
single client to use all available iops of the whole cluster, as long as
this client is the only one. Once there are more clients, there needs to
be some limiting.
I am not 100% sure what we have in mind for QoS and/or rate limiting
measures; I suppose it greatly depends on the available facilities. But
the sort of thing we had in mind was, for example, to prioritize VM
disk iops (which would be RBD user nova) over glance or cinder iops
(image and volume creation and management), and probably to prioritize
both of those over RGW. But for that to be possible, there would need
to be some cluster-wide "overview" of all those different services and
their clients, which all generate iops for the cluster.
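The closest thing I can see with what exists today would be purely
client-side: giving images in the glance and cinder pools a lower
default QoS ceiling than images in the nova pool, roughly like this
(untested sketch; pool names and numbers are placeholders):

    import subprocess

    # Untested sketch: per-pool defaults for the per-image client-side
    # iops limit; 0 means "no limit".
    pool_iops_limit = {
        'vms': '0',         # nova / VM disks: leave unlimited
        'volumes': '2000',  # cinder
        'images': '500',    # glance
    }
    for pool, limit in pool_iops_limit.items():
        subprocess.run(['rbd', 'config', 'pool', 'set', pool,
                        'rbd_qos_iops_limit', limit],
                       check=True)

But that is a static throttle in librbd, per image, and not the
cluster-wide arbitration between services that I mean.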
The mechanism as described in
https://docs.ceph.com/en/reef/rados/configuration/osd-config-ref/#dmclock-qos
actually seems quite promising. Except that it appears to put *all*
client iops into a single bucket. For what I have in mind, it would
need to keep each client in a separate bucket, with separate
reservation/weight/limit values. But as far as I understand the text,
it doesn't work like that. I would love to be proven
wrong here :-)
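For reference, these are the existing options I was referring to
(untested sketch; they only take effect with the "custom" profile, and
the values are placeholders):

    import subprocess

    # Untested sketch: the dmclock/mclock reservation / weight / limit
    # settings, which apply to the whole "client" class at once.
    def osd_config_set(name, value):
        subprocess.run(['ceph', 'config', 'set', 'osd', name, value],
                       check=True)

    osd_config_set('osd_mclock_profile', 'custom')
    osd_config_set('osd_mclock_scheduler_client_res', '0.3')  # reservation
    osd_config_set('osd_mclock_scheduler_client_wgt', '2')    # weight
    osd_config_set('osd_mclock_scheduler_client_lim', '0.8')  # limit

So all external client iops still end up in that one "client" class,
next to the background recovery and best-effort classes, as far as I
can tell.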
--
Olaf Seibert
Site Reliability Engineer
SysEleven GmbH
Boxhagener Straße 80
10245 Berlin
T +49 30 233 2012 0
F +49 30 616 7555 0
https://www.syseleven.de
https://www.linkedin.com/company/syseleven-gmbh/
Current system status always at:
https://www.syseleven-status.net/
Company headquarters: Berlin
Registered court: AG Berlin Charlottenburg, HRB 108571 Berlin
Managing directors: Andreas Hermann, Jens Ihlenfeld, Norbert Müller,
Jens Plogsties
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx