Hi all,

> My experience with osd bench is not good either. Firstly, its results
> fluctuate widely, and secondly it always returns unrealistically large
> numbers. I always get >1000 IOPS for HDDs that can do 100-150 in
> reality. My suspicion is that the internal benchmark is using an IO
> path that includes system buffers.

In our testing with HDDs, we didn't observe such a wide variation using
osd bench. The difference between osd bench and Fio was a few tens of
IOPS with HDDs (with osd bench showing higher numbers, since it queues
I/O directly at the objectstore level). That's why osd bench is used to
get a rough estimate of the OSD's maximum IOPS capacity on OSD boot-up.
The osd bench actually performs the test on the OSD's small metadata
partition, which is set up on the backing device itself, so I suspect
something else is at play, as others have pointed out.

> Same holds for the built-in rbd bench. The only way I found to get
> realistic numbers is to use fio with direct IO.

In this case, I too would suggest running Fio with direct IO and
overriding the osd_mclock_max_capacity_iops_hdd option with the more
realistic number.

> I think a very long and hard look at the implementation of the
> internal benchmarks is required here, or just replace them with
> appropriate fio executions.

We will take another look at this. In the interim, if there's a wide
variation even with the drive cache disabled, it would make sense to use
the IOPS obtained with Fio and override the
osd_mclock_max_capacity_iops_[hdd|ssd] option.

Thanks,
-Sridhar
<https://www.redhat.com>
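
P.S. For reference, the Fio run and the override could look something
like the commands below. The device path, the OSD id and the 150 IOPS
figure are only placeholder examples for your environment; note that
direct writes to the backing device are destructive, so run Fio against
a spare device or before the OSD is deployed.

  # 4K random-write test with direct IO, bypassing the page cache
  fio --name=iops_test --filename=/dev/sdX --direct=1 --rw=randwrite \
      --bs=4k --ioengine=libaio --iodepth=16 --numjobs=1 \
      --runtime=60 --time_based --group_reporting

  # override the measured capacity for a specific OSD
  ceph config set osd.0 osd_mclock_max_capacity_iops_hdd 150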