> > My experience with osd bench is not good either
>
> it seems it was recently "fixed" by writing "a"'s instead of zeroes:

Thanks for pointing that out. Writing binary zeros is really bad, as a lot of controllers interpret this as a trim or a similarly cheap operation and nothing happens physically on disk (writing zeros is optimised: the sector is just marked as wiped instead of being zeroed out). Writing "a"'s might be better, but a constant sequence of characters will still not give realistic baseline results, because controllers are likely to have other optimisation paths for it. Experiments I made indicate that only random data, for example collected from /dev/urandom into memory a priori, will properly benchmark the write path all the way to the disk, as it is incompressible data that does not trigger any short-cuts in the system.

Another point is the amount of data to write to get realistic estimates. A 30s burst of benchmark IO will usually not do. I found 5 minutes to be the bare minimum to get stable results, and even these change as the disk ages and/or fills up. However, 5 minutes is way too long for OSD startup. I think this might be a point where estimates of an OSD's performance cannot be deduced from a quick and dirty benchmark, but should rather come from actual IO stats, for example commit latencies depending on IO size. This is the way ioping does it, but using the data from actual user IO for the disk ping. This would be really good: no overhead, and it would also adjust at run time to disk ageing and usage effects.

For a number that has such a strong influence on the proper functioning of operations scheduling, I would not go with any test method that cannot be validated with another. If the osd bench test and fio give largely different results, the assumption that osd bench is not doing it right is the safe one. Fio and ioping are really good tools that provide comparable numbers by completely different methods.
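
The compressibility argument in this message can be checked with a small experiment. This is only a minimal sketch and not what osd bench or fio do internally: zlib stands in for whatever shortcut a controller might take, and the 4 MiB buffer size and the helper name `compression_ratio` are assumptions chosen for illustration.

```python
import os
import zlib

BUF_SIZE = 4 * 1024 * 1024  # assumed buffer size, for illustration only


def compression_ratio(buf: bytes) -> float:
    """Return compressed size / original size (close to 1.0 means incompressible)."""
    return len(zlib.compress(buf)) / len(buf)


# Three candidate benchmark payloads, all prepared in memory before any IO.
payloads = {
    "zeros": bytes(BUF_SIZE),          # binary zeros
    "a's": b"a" * BUF_SIZE,            # constant character sequence
    "urandom": os.urandom(BUF_SIZE),   # random data from the OS
}

ratios = {name: compression_ratio(buf) for name, buf in payloads.items()}

for name, ratio in ratios.items():
    print(f"{name:8s} compressed to {ratio:.4%} of original size")
```

Both constant payloads shrink to a tiny fraction of their size, while the urandom buffer does not shrink at all. That is consistent with the observation that only random data avoids short-cuts, although it says nothing about what a particular controller actually does.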
I would probably adopt fio or ioping rather than trying to get a third tool right.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Sven Kieske <S.Kieske@xxxxxxxxxxx>
Sent: 08 September 2022 14:04:05
To: ceph-users@xxxxxxx; Frank Schilder
Cc: aad@xxxxxxxxxxxxxx; ormandj@xxxxxxxxxxxx; sseshasa@xxxxxxxxxx
Subject: Re: Re: Wide variation in osd_mclock_max_capacity_iops_hdd

On Thu, 2022-09-08 at 08:22 +0000, Frank Schilder wrote:
> My experience with osd bench is not good either

it seems it was recently "fixed" by writing "a"'s instead of zeroes:

https://github.com/ceph/ceph/commit/db045e005fab218f2bb270b7cb60b62abbbe3619

tongue in cheek: not sure that this is a good benchmark though, even after the change.

writing good benchmark IO patterns is hard, and they are highly workload dependent.

so I guess the usual answer still applies: write your own fio based benchmark for your usecase.

it would be cool to compile a sort of "standard" ceph benchmark fio testsuite on github or some other public git host, if someone is interested in this kind of stuff?

--
Regards

Sven Kieske
Systems engineer
Mittwald CM Service GmbH & Co. KG
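
As a starting point for the fio-based benchmark suggested in this thread, a job file could be generated along the following lines. This is a hedged sketch, not an agreed "standard" Ceph test: the function name `build_fio_job`, the job name `hdd-baseline`, the placeholder device `/dev/sdX`, and the choices of 4k blocks and queue depth 1 are all assumptions. The 300 s runtime reflects the five-minute minimum mentioned in the thread, and `refill_buffers` asks fio to refill its IO buffers on every submit so that the payload is not one repeated pattern.

```python
def build_fio_job(device: str, runtime_s: int = 300, block_size: str = "4k") -> str:
    """Render a fio job file (as text) for a sustained random-write test.

    WARNING: running the resulting job against a raw device destroys its data.
    """
    options = {
        "filename": device,        # target device or file (placeholder here)
        "rw": "randwrite",         # random writes
        "bs": block_size,          # assumed block size
        "direct": 1,               # bypass the page cache
        "ioengine": "libaio",      # Linux native async IO
        "iodepth": 1,              # assumed queue depth
        "runtime": runtime_s,      # seconds; 300 s = 5 minutes
        "refill_buffers": None,    # flag: refill IO buffers on every submit
        "time_based": None,        # flag: run for the full runtime
    }
    lines = ["[hdd-baseline]"]     # hypothetical job name
    for key, value in options.items():
        lines.append(key if value is None else f"{key}={value}")
    return "\n".join(lines) + "\n"


job_text = build_fio_job("/dev/sdX")
print(job_text)
```

The printed text can be saved to a file and passed to fio as its job file. Block size, queue depth and read/write mix would need to be adapted to the actual workload, since the right IO pattern is workload dependent.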