Re: librbd 4k read/write?

Hi,
For a cluster where most pools use 4+2 erasure coding, what value would you use for Cluster_Size?

Giuseppe

On 10.08.23, 21:06, "Zakhar Kirpichenko" <zakhar@xxxxxxxxx> wrote:




Hi,


You can use the following formula to roughly calculate the IOPS you can get
from a cluster: (Drive_IOPS * Number_of_Drives * 0.75) / Cluster_Size.


For example, for 60 10K rpm SAS drives each capable of 200 4K IOPS and a
replicated pool with size 3: (~200 * 60 * 0.75) / 3 = ~3000 IOPS with block
size = 4K.
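
As a rough sketch, the formula above can be written in Python. The function and parameter names are hypothetical, chosen for illustration, and the 0.75 factor is just the rule-of-thumb derating from the formula:

```python
def estimate_cluster_iops(drive_iops, num_drives, cluster_size, efficiency=0.75):
    """Rough estimate: (Drive_IOPS * Number_of_Drives * 0.75) / Cluster_Size.

    cluster_size is the replicated pool size (3 in the example above).
    """
    if cluster_size <= 0:
        raise ValueError("cluster_size must be positive")
    return drive_iops * num_drives * efficiency / cluster_size


# 60 drives at ~200 4K IOPS each, replicated pool with size 3
print(estimate_cluster_iops(200, 60, 3))  # 3000.0
```

This is only an estimate; real results depend on the client, network and OSD configuration.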


That's what the OP is getting, give or take.
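
To sanity-check that, the throughput reported in the original post (about 12 megabytes per second at a 4K block size) can be converted to IOPS. This is a minimal sketch: the helper name is hypothetical, and since the post doesn't say whether the figure is in MB or MiB, both readings are shown:

```python
def throughput_to_iops(bytes_per_second, block_size_bytes):
    """Convert a throughput figure to operations per second for a fixed block size."""
    return bytes_per_second / block_size_bytes


BLOCK_4K = 4 * 1024  # 4 KiB block

# Reading "12 Megabytes per second" as MiB/s
print(throughput_to_iops(12 * 1024**2, BLOCK_4K))  # 3072.0

# Reading it as decimal MB/s
print(throughput_to_iops(12 * 1000**2, BLOCK_4K))  # 2929.6875
```

Either way the result is close to the ~3000 IOPS estimate.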


/Z


On Thu, 10 Aug 2023 at 20:20, Anthony D'Atri <aad@xxxxxxxxxxxxxx> wrote:


>
>
> >
> > Good afternoon everybody!
> >
> > I have the following scenario:
> > Pool RBD replication x3
> > 5 hosts with 12 SAS spinning disks each
>
> Old hardware? SAS is mostly dead.
>
> > I'm using exactly the following line with FIO to test:
> > fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4M -size=10G
> > -iodepth=16 -rw=write -filename=./test.img
>
> On what kind of client?
>
> > If I increase the blocksize I can easily reach 1.5 GBps or more.
> >
> > But when I use blocksize in 4K I get a measly 12 Megabytes per second,
> > which is quite annoying. I achieve the same rate if rw=read.
>
> Especially if your client is a VM, check whether IOPS throttling is in
> place. With small block sizes you'll hit an IOPS limit long before a
> bandwidth limit.
>
> > Note: I tested it on another smaller cluster, with 36 SAS disks and got
> > the same result.
>
> SAS has a price premium over SATA, and still requires an HBA. Many
> chassis vendors really want you to buy an anachronistic RoC HBA.
>
> Eschewing SAS and the HBA helps close the gap to justify SSDs; the TCO
> just doesn't favor spinners.
>
> > Maybe the 5 host cluster is not
> > saturated by your current fio test. Try running 2 or 4 in parallel.
>
>
> Agreed that Ceph is a scale out solution, not DAS, but note the difference
> reported with a larger block size.
>
> > How is this related to 60 drives? His test is only on 3 drives at a
> > time, no?
>
> RBD volumes by and large will live on most or all OSDs in the pool.
>
>
>
>
> >
> > I don't know exactly what to look for or configure to have any
> > improvement.
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx <mailto:ceph-users@xxxxxxx>
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx <mailto:ceph-users-leave@xxxxxxx>