Hi all,

My experience with osd bench is not good either. Firstly, its results fluctuate wildly, and secondly it always returns unrealistically large numbers: I consistently get >1000 IOPS for HDDs that can do 100-150 in reality. My suspicion is that the internal benchmark uses an IO path that goes through system buffers. The same holds for the built-in rbd bench. The only way I have found to get realistic numbers is to use fio with direct IO. I think a very long and hard look at the implementation of the internal benchmarks is required here, or they should just be replaced with appropriate fio invocations.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Anthony D'Atri <aad@xxxxxxxxxxxxxx>
Sent: 07 September 2022 20:36:39
To: Sridhar Seshasayee
Cc: David Orman; ceph-users
Subject: Re: Wide variation in osd_mclock_max_capacity_iops_hdd

SATA/SAS volatile drive cache comes up often - have we considered having the OSD startup process disable it by default? Or would that violate the principle of least astonishment, especially since drive behavior varies considerably?

> On Sep 7, 2022, at 05:58, Sridhar Seshasayee <sseshasa@xxxxxxxxxx> wrote:
>
> Hi Vladimir,
>
>> Yes, device caching is disabled. The OSDs in question use a
>> separate DB/WAL device on flash. Could that be the cause of
>> IOPS over-estimation?
>
> 2000 IOPS definitely looks high for an HDD. The in-built osd bench tool
> writes to the backing device of the osd to determine the IOPS capacity.
> With DB/WAL configured, I wouldn't expect a wide variation.
>
> To confirm the IOPS measurement, you can run the following commands:
>
> $ ceph tell osd.N cache drop
>
> followed by,
>
> $ ceph tell osd.N bench 12288000 4096 4194304 100
>
> where N is the osd id.
>
> You can run the above sequence a few times to confirm whether there are
> any wide variations; if so, it probably warrants more investigation into
> why that's the case.
>
> Also, may I know the following:
> - What kind of HDDs are you using in your environment?
> - Which Ceph version?
>
>> Is it better for me to manually set
>> osd_mclock_max_capacity_iops_hdd to a reasonable number or
>> will things perform OK as long as IOPS are over-estimated
>> consistently everywhere?
>
> I think it's better to figure out why the IOPS are on the higher side first.
>
> -Sridhar
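
For reference, a minimal fio invocation along the lines Frank describes (direct IO, 4k random writes, queue depth 1) might look like the following. The device path /dev/sdX is a placeholder, and a raw write test like this destroys data on the target, so run it only against a scratch device or partition:

$ # WARNING: destructive write test -- /dev/sdX must be a scratch device
$ fio --name=hdd-iops --filename=/dev/sdX --rw=randwrite --bs=4k \
      --direct=1 --ioengine=libaio --iodepth=1 --numjobs=1 \
      --runtime=60 --time_based --group_reporting

With --direct=1, fio bypasses the page cache, which is what keeps the reported IOPS close to what the spindle can actually deliver rather than what the buffers can absorb.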
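
On Anthony's point about volatile drive cache: assuming standard Linux tooling, the cache can be toggled per device with hdparm (SATA) or sdparm (SAS). The device path is again a placeholder, and note that the hdparm setting is typically not persistent across power cycles:

$ hdparm -W 0 /dev/sdX               # SATA: disable the volatile write cache
$ sdparm --clear WCE --save /dev/sdX # SAS: clear Write Cache Enable, saved across resets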
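
To repeat Sridhar's cache-drop-plus-bench sequence a few times in a row and eyeball the variation, a simple shell loop over the exact commands from his message works; N is the osd id as above:

$ for run in 1 2 3 4 5; do
      ceph tell osd.N cache drop
      ceph tell osd.N bench 12288000 4096 4194304 100
  done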
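
And if the measured value does turn out to be bogus and a manual override is preferred, the mclock capacity option can be pinned per OSD with ceph config; the value 150 here is only an illustrative figure in the range Frank quotes for real HDDs:

$ ceph config set osd.N osd_mclock_max_capacity_iops_hdd 150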