I do not have *direct* experience with that model, but can offer some speculation: that is a *consumer* model, known as "client" in the SSD industry. It is also QLC, optimized for PB/$ rather than for sustained performance. I suspect at least one of the following is going on:

* Cliffing: client SSDs are architected for an intermittent duty cycle, not 24x7 operation, and for various reasons sustained performance can drop sharply after a certain amount of continuous time or data written.

* How full are the OSDs? (`ceph osd df`) Drives like this appear to allocate an SLC cache/staging area that shrinks as the drive fills up, a design decision that I've never fully understood. I suspect that is at play here. This article touches on the phenomenon:
https://www.tweaktown.com/articles/8819/samsung-860-qvo-ssd-review-rip-consumer-hdds/index.html

* Are you using a RAID HBA? If so, do you have a VD wrapped around each drive, or is HBA caching enabled?

* This drive also likely does not feature power loss protection (PLP), so the volatile drive cache may be hurting you:
https://www.mail-archive.com/ceph-users@xxxxxxx/msg04704.html
https://yourcmc.ru/wiki/index.php?title=Ceph_performance&mobileaction=toggle_view_desktop#Drive_cache_is_slowing_you_down

* Have you checked SMART counters for projected lifetime remaining, used vs. remaining spare blocks, etc.? Please feel free to collect the output of `smartctl -a` for all drives, post the text file somewhere, and send a link to it (one way to collect this is sketched at the end of this message).

> On Dec 27, 2022, at 4:40 AM, hosseinz8050@xxxxxxxxx wrote:
>
> Thanks Anthony.
> I have a cluster with QLC SSDs (Samsung QVO 860). The cluster has been running for 2 years, and now every OSD returns 12 IOPS from tell bench, which is very slow. I bought new QVO disks yesterday and added one to the cluster as a new OSD. For the first hour I got 100 IOPS from this new OSD, but after one hour it dropped back to 12 IOPS, the same as the old OSDs.
> I cannot imagine what is happening?!
>
> On Tuesday, December 27, 2022 at 12:18:07 AM GMT+3:30, Anthony D'Atri <aad@xxxxxxxxxxxxxx> wrote:
>
> > My understanding is that when you ask an OSD to bench (via the admin socket), only that OSD executes; there is no replication. Replication is a function of PGs.
> > Thus, this is a narrowly focused tool with both unique advantages and disadvantages.
> >
> > > On Dec 26, 2022, at 12:47 PM, hosseinz8050@xxxxxxxxx wrote:
> > >
> > > Hi experts, I want to know: when I execute the ceph tell osd.x bench command, is replica 3 considered in the bench or not? For example, with replica 3, does replica 1 of the bench data get written to osd.x, replica 2 to osd.y, and replica 3 to osd.z? If so, it means I cannot benchmark just one OSD in the cluster, because the IOPS and throughput of the two other (possibly slow) OSDs would affect the tell bench result for my target OSD. Is that true?
> > > Thanks in advance.
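
For reference, here is a minimal sketch of the kind of collection script I have in mind. It assumes root access on each OSD host and SATA drives visible directly to the OS (not hidden behind a RAID VD); the output directory name is just an example, and NVMe or HBA-attached drives would need different smartctl device options.

#!/bin/bash
# Minimal sketch (run as root on each OSD host): collect `smartctl -a` output
# and the volatile write-cache state for every local disk. The output path is
# an example; drives behind a RAID HBA may need e.g. `smartctl -d megaraid,N`.

outdir="/tmp/smart-$(hostname)"
mkdir -p "$outdir"

# Enumerate whole disks only (no partitions).
for name in $(lsblk -dno NAME,TYPE | awk '$2 == "disk" {print $1}'); do
    dev="/dev/$name"
    out="$outdir/$name.txt"

    # Full SMART dump: lifetime-used / wear and remaining-spare attributes
    # live here, along with media and CRC error counters.
    smartctl -a "$dev" > "$out" 2>&1

    # Volatile write-cache state; without PLP this cache can hurt
    # sync-write latency (see the links above).
    hdparm -W "$dev" >> "$out" 2>&1
done

echo "SMART output collected under $outdir"

Comparing the wear and remaining-spare attributes of the two-year-old drives against the brand-new one should show quickly whether wear is a factor, or whether this is purely the SLC staging cache filling up under sustained load.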
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx