Re: librbd 4k read/write?

On 10/08/2023 22:04, Zakhar Kirpichenko wrote:

Hi,

You can use the following formula to roughly estimate the IOPS you can get from a cluster: (Drive_IOPS * Number_of_Drives * 0.75) / Cluster_Size.

For example, for 60 10K rpm SAS drives, each capable of ~200 4K IOPS, and a replicated pool with size 3: (~200 * 60 * 0.75) / 3 = ~3000 IOPS at a 4K block size.
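
As a quick sanity check, the same arithmetic in shell (the values are the illustrative ones above; 0.75 is written as *3/4 to stay in integer math):

echo $(( 200 * 60 * 3 / 4 / 3 ))    # prints 3000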

Good approximation, but a few comments:

- This applies to HDDs, not to SSD/NVMe.

- If you do not use an external SSD device for the WAL/DB, you will get lower results.

- You need to test with an I/O depth several times higher than your HDD count. In your case, with 60 HDDs, use 256 or so; an iodepth of 16, as you did, will not stress the HDDs to their maximum (see the example after this list).

- Test with deep scrubbing off, as it can impact performance, especially with HDDs.
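
For instance, something along these lines (the image path and size are placeholders; adjust to your setup):

fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4k -size=10G -iodepth=256 -rw=randwrite -filename=./test.img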

/maged


That's what the OP is getting, give or take.

/Z

On Thu, 10 Aug 2023 at 20:20, Anthony D'Atri <aad@xxxxxxxxxxxxxx> wrote:


Good afternoon everybody!

I have the following scenario:
Pool RBD replication x3
5 hosts with 12 SAS spinning disks each
Old hardware?  SAS is mostly dead.

I'm using exactly the following line with fio to test:
fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4M -size=10G -iodepth=16 -rw=write -filename=./test.img
On what kind of client?

If I increase the block size I can easily reach 1.5 GB/s or more.

But when I use a 4K block size I get a measly 12 megabytes per second, which is quite annoying. I get the same rate with rw=read.
If your client is a VM especially, check whether you have IOPS throttling. With small block sizes you'll hit an IOPS limit long before a bandwidth limit.
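
If the client is a libvirt/QEMU VM, one place such a limit can live is the per-disk iotune settings, which you can query with something like (domain and disk names here are placeholders):

virsh blkdeviotune guest1 vda

Limits can also be imposed higher up the stack, e.g. by a cloud platform's volume QoS.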

Note: I tested it on another, smaller cluster with 36 SAS disks and got the same result.
SAS has a price premium over SATA, and still requires an HBA. Many chassis vendors really want you to buy an anachronistic RoC HBA.

Eschewing SAS and the HBA helps close the gap to justify SSDs; the TCO just doesn't favor spinners.

Maybe the 5-host cluster is not saturated by your current fio test. Try running 2 or 4 fio jobs in parallel, as sketched below.
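
A quick way to approximate that from a single client is fio's numjobs option, for example (all parameters here are illustrative):

fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4k -size=10G -iodepth=64 -numjobs=4 -group_reporting -rw=randwrite -filename=./test.img

Separate fio instances on several client hosts, each against its own RBD image, spread the load further still.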

Agreed that Ceph is a scale-out solution, not DAS, but note the difference reported with a larger block size.

How is this related to 60 drives? His test only hits 3 drives at a time, doesn't it?

RBD volumes by and large will live on most or all OSDs in the pool.
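
A rough way to see this is to map a few of an image's objects to OSDs, for example (the pool, image, and the rbd_data prefix shown by rbd info are placeholders):

rbd info rbd/test-image
ceph osd map rbd rbd_data.10226b8b4567.0000000000000000
ceph osd map rbd rbd_data.10226b8b4567.0000000000000400

Different object indexes generally land in different placement groups and therefore on different OSD sets.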




I don't know exactly what to look for or what to configure to get any improvement.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


