Re: Performance issues RGW (S3)

On 2024-06-11 01:01, Anthony D'Atri wrote:
> To be clear, you don't need more nodes. You can add RGWs to the ones you already have. You have 12 OSD nodes - why not put an RGW on each?

That might be an option; I just don't like the idea of hosting multiple components on the same nodes. But I'll consider it.
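
If I do end up colocating them, I'd probably let cephadm place an RGW per OSD node with something like this (a sketch only; the "s3" service id and the "rgw" label are made up, and this assumes a cephadm-managed cluster):

    # label each OSD node that should also run an RGW
    ceph orch host label add osd-node-01 rgw
    # deploy an RGW on every labelled host
    ceph orch apply rgw s3 --placement="label:rgw" --port 8080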

> I really don't like mixing mon/mgr with other components because of coupled failure domains, and past experience with mon misbehavior, but many people do that. YMMV. With a bunch of RGWs, none of them needs to grow to consume significant resources, and it can be difficult to get a single RGW daemon to really use all of a dedicated node.

I am not sure adding more RGWs will increase performance.

I just tested with 1 and with 2 RGWs:

Client 1 -> RGW Node A = 150-250 objects/s
Client 1 -> RGW Node A = 60-120 objects/s, and simultaneously Client 2 -> RGW Node B = 60-120 objects/s. Together that makes 150-250 objects/s.

So performance-wise it does not matter whether I use 1 or 2 RGW nodes.

Client 1 -> HAProxy -> 3 RGWs = 150-250 objects/s.
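
To try to separate the RGW layer from RADOS itself, I may also run a raw benchmark against the data pool at a similar object size (a sketch; the pool name here is the default and may differ, and a write run cleans up its objects afterwards unless --no-cleanup is given):

    # 30s of 0.5 MB writes with 64 concurrent ops, straight to RADOS
    rados bench -p default.rgw.buckets.data 30 write -b 524288 -t 64

If that is also stuck around the same objects/s, the limit is below RGW; if it is much faster, the RGW/HTTP path is the suspect.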


>>>>> There are still serializations in the OSD and PG code. You have 240 OSDs, does your index pool have *at least* 256 PGs?

>>>> Index as the data pool has 256 PGs.

>>> To be clear, that means whatever.rgw.buckets.index?

>> No, sorry, my bad: .index is 32 and .data is 256.

> Oh, yeah. Does `ceph osd df` show you at the far right something like 4-5 PG replicas on each OSD? You want (IMHO) to end up with 100-200, ideally keeping each pool's pg_num to a power of 2.

No, my RBD pool is larger. My average PG count per OSD is around 60-70.
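
For anyone following along, this is how I'm checking those numbers (nothing exotic, just the standard commands):

    ceph osd pool ls detail    # pg_num / pgp_num per pool
    ceph osd df                # PGS column at the far right = PG replicas per OSD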

> Assuming all your pools span all OSDs, I suggest at a minimum 256 for .index and 8192 for .data, assuming you have only RGW pools. And I would be inclined to try 512 / 8192. Assuming your other minor pools are at 32, I'd bump .log and .non-ec to 128 or 256 as well.

> If you have RBD or other pools colocated, those numbers would change.



> ^ the above assumes disabling the autoscaler

I bumped my .data pool from 256 to 1024 PGs and .index from 32 to 128, and also doubled the .non-ec and .log pools. Performance-wise I don't see any improvement. If I had seen a 10-20% improvement, I would definitely increase them further to 512 / 8192. With a 0.5 MB object size I am still limited to about 150-250 objects/s.
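
For reference, the bump was roughly this (pool names here are the defaults; adjust to the actual zone names, and note the autoscaler will fight manual pg_num changes unless it's off for the pool):

    ceph osd pool set default.rgw.buckets.data pg_autoscale_mode off
    ceph osd pool set default.rgw.buckets.index pg_autoscale_mode off
    ceph osd pool set default.rgw.buckets.data pg_num 1024
    ceph osd pool set default.rgw.buckets.index pg_num 128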

The disks aren't saturated. The write await (w_await in iostat) is mostly around 1 ms and does not get higher while benchmarking over S3.
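
I'm watching that during the benchmark with the usual tools, roughly:

    iostat -x 1      # w_await and %util per device while the S3 load runs
    ceph osd perf    # per-OSD commit/apply latency as Ceph sees it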

Any other suggestions, or does anyone else have ideas?
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


