Re: Wide variation in osd_mclock_max_capacity_iops_hdd

Hi all,

My experience with osd bench is not good either. Firstly, its results fluctuate widely, and secondly, it always returns unrealistically large numbers. I always get >1000 IOPS for HDDs that can do 100-150 in reality. My suspicion is that the internal benchmark uses an IO path that goes through system buffers.

The same holds for the built-in rbd bench. The only way I found to get realistic numbers is to use fio with direct IO.
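
For example, something along these lines (the device path is a placeholder, and note that a raw random-write test destroys any data on the device):

$ fio --name=iops-test --filename=/dev/sdX --direct=1 --rw=randwrite \
      --bs=4k --iodepth=16 --runtime=60 --time_based --ioengine=libaio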

I think a long, hard look at the implementation of the internal benchmarks is required here, or they should simply be replaced with appropriate fio invocations.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Anthony D'Atri <aad@xxxxxxxxxxxxxx>
Sent: 07 September 2022 20:36:39
To: Sridhar Seshasayee
Cc: David Orman; ceph-users
Subject:  Re: Wide variation in osd_mclock_max_capacity_iops_hdd

SATA/SAS volatile drive cache comes up often - have we considered having the OSD startup process disable it by default? Or would that violate the principle of least astonishment, especially since drive behavior varies considerably?
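
The OSD could presumably shell out to the existing userspace tools for this; on Linux something like the following should do it (device paths are placeholders):

$ hdparm -W 0 /dev/sdX          # SATA: disable the volatile write cache
$ sdparm --clear WCE /dev/sdX   # SAS: clear the Write Cache Enable bit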

> On Sep 7, 2022, at 05:58, Sridhar Seshasayee <sseshasa@xxxxxxxxxx> wrote:
>
> Hi Vladimir,
>
>> Yes, device caching is disabled. The OSDs in question use a
>> separate DB/WAL device on flash. Could that be the cause of
>> IOPS over-estimation?
>>
>
> 2000 IOPS definitely looks high for an HDD. The built-in osd bench tool
> writes to the backing device of the OSD to determine the IOPS capacity.
> With a DB/WAL configured, I wouldn't expect a wide variation.
>
> To confirm the IOPS measurement, you can run the following commands:
> $ ceph tell osd.N cache drop
>
> followed by,
>
> $ ceph tell osd.N bench 12288000 4096 4194304 100
>
> Where N is the osd id.
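>
> For reference, the arguments are, in order, the total bytes to write, the
> size of each write, the object size, and the number of objects, so the
> command above issues 3000 writes of 4 KiB each.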
>
> You can run the above sequence a few times to confirm whether there are
> any wide variations; if so, it probably warrants more investigation into
> why that's the case.
>
> Also, may I know the following:
> - what kind of HDDs are you using in your environment?
> - Ceph version
>
>
>> Is it better for me to manually set
>> osd_mclock_max_capacity_iops_hdd to a reasonable number or
>> will things perform OK as long as IOPS are over-estimated
>> consistently everywhere?
>>
>
> I think it's better to figure out why the IOPS are on the higher side first.
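>
> Should you decide to override the value manually in the meantime, it can
> be set per OSD with something like the following (150 is only a
> placeholder, not a recommendation):
>
> $ ceph config set osd.N osd_mclock_max_capacity_iops_hdd 150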
>
> -Sridhar
>

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



