I have been doing some further testing.
My RGW pool is placed on spinning disks.
I created a 2nd RGW data pool, placed on flash disks.
Benchmarking on HDD pool:
Client 1 -> 1 RGW Node: 150 obj/s
Client 1-5 -> 1 RGW Node: 150 obj/s (30 obj/s each client)
Client 1 -> HAProxy -> 3 RGW Nodes: 150 obj/s
Client 1-5 -> HAProxy -> 3 RGW Nodes: 150 obj/s (30 obj/s each client)
I did the same tests towards the RGW pool on flash disks: same results
So, it doesn't matter if my pool is hosted on HDD or SSD.
It doesn't matter if I am using 1 RGW or 3 RGW nodes.
It doesn't matter if I am using 1 client or 5 clients.
I am consistently capped at around 140-160 objects/s.
I see some TCP retransmissions on the RGW node, but maybe that's
'normal'.
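To quantify those retransmissions, you can sample the kernel counters before and after a benchmark run and compare the delta. Port 7480 below is the default radosgw frontend port, an assumption here; adjust it to your configured frontend port:

```shell
# total TCP segments retransmitted since boot; run twice a few seconds
# apart while benchmarking and diff the two values
netstat -s | grep -i 'retrans'

# per-connection retransmit counters for connections to the RGW frontend
# (7480 assumed as the radosgw listen port)
ss -ti 'sport = :7480'
```

A steadily climbing delta during the benchmark would point at the network rather than the OSDs.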
Any ideas/suggestions?
On 2024-06-11 22:08, Anthony D'Atri wrote:
I am not sure adding more RGWs will increase the performance.
That was a tangent.
To be clear, that means whatever.rgw.buckets.index ?
No, sorry my bad. .index is 32 and .data is 256.
Oh, yeah. Does `ceph osd df` show you at the far right like 4-5 PG
replicas on each OSD? You want (IMHO) to end up with 100-200,
keeping each pool's pg_num to a power of 2 ideally.
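That sizing rule can be sketched as follows. The OSD count of 60 and replication factor of 3 below are assumptions for illustration; substitute your own numbers:

```shell
#!/bin/sh
# Total pg_num budget across all pools, aiming for ~100 PG replicas
# per OSD (the low end of the 100-200 target), rounded up to a power of 2.
osds=60      # assumed OSD count -- replace with yours
target=100   # desired PG replicas per OSD
size=3       # assumed replication factor

raw=$(( osds * target / size ))
pow=1
while [ "$pow" -lt "$raw" ]; do pow=$(( pow * 2 )); done
echo "total pg_num budget across pools: $pow"
```

With 60 OSDs this yields 2048, which you would then split across pools (most of it to .data) while keeping each pool at a power of 2.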
No, my RBD pool is larger. My average PG count per OSD is around 60-70.
Ah. Aim for 100-200 with spinners.
Assuming all your pools span all OSDs, I suggest at a minimum 256 for
.index and 8192 for .data, assuming you have only RGW pools. And
would be inclined to try 512 / 8192. Assuming your other minor
pools are at 32, I'd bump .log and .non-ec to 128 or 256 as well.
If you have RBD or other pools colocated, those numbers would change.
^ the above assumes the autoscaler is disabled
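Concretely, that would look something like the following. The pool names assume the default zone (check `ceph osd pool ls` for yours), and on recent Ceph releases pgp_num follows pg_num automatically:

```shell
# disable the autoscaler for the pools being tuned
ceph osd pool set default.rgw.buckets.index pg_autoscale_mode off
ceph osd pool set default.rgw.buckets.data pg_autoscale_mode off

# then raise pg_num (pgp_num follows automatically since Nautilus)
ceph osd pool set default.rgw.buckets.index pg_num 256
ceph osd pool set default.rgw.buckets.data pg_num 8192
```

Expect backfill while the new PGs split and rebalance; benchmark again only once the cluster is back to HEALTH_OK.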
I bumped my .data pool from 256 to 1024 and .index from 32 to 128.
Your index pool still only benefits from half of your OSDs with a value
of 128.
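The arithmetic behind that point: a pool's data can land on at most pg_num x size distinct OSDs. Assuming 3x replication:

```shell
#!/bin/sh
pg_num=128   # index pool pg_num after the bump
size=3       # assumed replication factor

# at most pg_num * size OSDs can ever hold a replica of this pool
echo "max OSDs usable by index pool: $(( pg_num * size ))"
```

So with more than 384 OSDs (under these assumptions), an index pool at pg_num=128 leaves some OSDs doing no index work at all.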
Also doubled the .non-ec and .log pools. Performance-wise I don't see
any improvement. If I saw a 10-20% improvement, I would definitely
increase it to 512 / 8192.
With a 0.5 MB object size I am still limited to about 150-250
objects/s.
The disks aren't saturated. The wr await is mostly around 1ms and does
not get higher when benchmarking with S3.
Trust iostat about as far as you can throw it.
Any other suggestions, or does anyone else have ideas?
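For reference, the per-disk write latency mentioned above can be watched live while the S3 benchmark runs with something like:

```shell
# extended per-device stats every 2 seconds with timestamps;
# watch the w_await (ms) and %util columns during the benchmark
iostat -xmt 2
```

As noted, iostat's view of saturation can mislead on its own; cross-checking with `ceph osd perf` (commit/apply latency per OSD) gives a second opinion from Ceph's side.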
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx