Yes, device caching is disabled. The OSDs in question use a
separate DB/WAL device on flash. Could that be the cause of
the IOPS over-estimation?
Is it better for me to manually set
osd_mclock_max_capacity_iops_hdd to a reasonable number, or
will things perform OK as long as IOPS are over-estimated
consistently everywhere?
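(If it does come to setting it manually, a minimal sketch of
what that could look like, with osd.N as a placeholder and
350 as a purely illustrative number:

    # override the benchmark-derived value for a single OSD
    ceph config set osd.N osd_mclock_max_capacity_iops_hdd 350

    # or apply one value to all OSDs
    ceph config set osd osd_mclock_max_capacity_iops_hdd 350

This only illustrates the config interface, not what number
to pick.)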
Vlad
On 9/6/22 11:50, David Orman wrote:
Yes. Rotational drives can generally do 100-200 IOPS (some
outliers, of course). Do you have all forms of caching
disabled on your storage controllers/disks?
On Tue, Sep 6, 2022 at 11:32 AM Vladimir Brik
<vladimir.brik@xxxxxxxxxxxxxxxx> wrote:
Setting osd_mclock_force_run_benchmark_on_init to true and
restarting OSDs fixed the problem of high variability.
However, osd_mclock_max_capacity_iops_hdd is now over 2000.
That's way too much for spinning disks, isn't it?
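(One rough way to compare what the boot-time benchmark
recorded for each OSD, assuming the values ended up in the
mon config database as usual; osd.N is a placeholder:

    # list the per-OSD values recorded at boot
    ceph config dump | grep osd_mclock_max_capacity_iops

    # or check the value one OSD is actually running with
    ceph config show osd.N osd_mclock_max_capacity_iops_hdd

That should make it easy to spot which OSDs report
implausible numbers.)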
Vlad
On 9/1/22 03:54, Sridhar Seshasayee wrote:
> Hello Vladimir,
>
>> I have noticed that our osd_mclock_max_capacity_iops_hdd
>> varies widely for OSDs on identical drives in identical
>> machines (from ~600 to ~2800).
>
> The IOPS shouldn't vary widely if the drives are of similar
> age and running the same workloads. The
> osd_mclock_max_capacity_iops_hdd is determined using Ceph's
> osd bench during osd boot-up. From our testing on HDDs, the
> tool shows fairly consistent numbers (+/- a few 10s of IOPS)
> for a given HDD. For more details please see:
> https://docs.ceph.com/en/latest/rados/configuration/mclock-config-ref/#osd-capacity-determination-automated
>
> In your case, it would be good to take another look at the
> subset of HDDs showing degraded/lower IOPS performance and
> check if there are any underlying issues. You could run the
> osd bench against the affected osds as described in the link
> below, or your preferred tool, to get a better understanding:
>
> https://docs.ceph.com/en/latest/rados/configuration/mclock-config-ref/#benchmarking-test-steps-using-osd-bench
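(A rough sketch of the osd bench run that link describes,
assuming osd.N stands in for one of the affected OSDs and
with parameters along the lines of what the doc suggests
(total bytes, block size, object size, object count):

    # drop the OSD's caches first so cached data does not
    # inflate the result
    ceph tell osd.N cache drop

    # write ~12 MB in 4 KiB IOs across 100 x 4 MiB objects
    ceph tell osd.N bench 12288000 4096 4194304 100

The reported IOPS can then be compared with the value the
boot-time benchmark stored for that OSD.)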
>
>> Is such great variation a problem? What effect on
>> performance does this have?
>
> The mClock profiles use the IOPS capacity of each osd to
> allocate IOPS resources to ops like client, recovery, scrubs
> and so on. Therefore, a lower IOPS capacity will have an
> impact on these ops, and so it would make sense to check the
> health of the HDDs that are showing lower than expected IOPS
> numbers.
> -Sridhar
>
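(To see which mClock profile an OSD is running with, and
hence how that IOPS capacity gets divided between client,
recovery and scrub ops, something like this should work, with
osd.N again a placeholder:

    # show the active mClock profile for one OSD
    ceph config show osd.N osd_mclock_profile

That at least shows which allocation scheme the capacity
number feeds into.)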
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx