I'm watching this thread with interest, for a few reasons. We have benefited a lot(!) from advice from various people in it over the years, there are some similarities between our setup and STFC's, and we haven't been bothered by this issue so far. So, I'm intrigued.

On 8/1/25 23:10, Thomas Byrne - STFC UKRI wrote:
> Storage nodes are all some derivative of a 24 bay, 2U chassis (e.g 760XD2)
> Single 25Gig connection, no jumboframes
> HDDs range from 12-20TB SAS HDDs depending on year purchased, with collocated WAL/DBs on the HDDs.
> All BlueStore OSDs
> Mons have dedicated flash devices for their stores

Our nodes are 740XD2 with 24x 16TB HDDs plus an SSD for RocksDB, and 100Gb networking with jumbo frames (9000 MTU). Quincy deb packages on Ubuntu 20.04, so we're planning some necessary upgrade work.

We have target_max_misplaced_ratio set to 0.3%, because the last time we had a lot of backfilling (aided by upmap tools) that value left us with plenty of performance capacity for users. We're at 75-80% full with some capacity for growth, and the cluster is operating fine. Sometimes starting an OSD can take up to 20 minutes, so there may be some shared experience there. However, apart from a harrowing period last year[1], we live in HEALTH_OK most of the time. We also never schedule the balancer to be off, because it is often pretty quiet anyway.

Typical output from our two clusters:

    pgs: 32591 active+clean
           588 active+clean+scrubbing+deep
            21 active+clean+scrubbing

The big one has just shy of 3000 OSDs, which is half of Thomas' cluster. Perhaps that is a key difference. Our hardware failure rate, though, is markedly less than half of theirs: I think we had one failure in December, and none this year so far.

[1] https://ceph2024.sched.com/event/1ktWK/get-that-cluster-back-online-but-hurry-slowly-gregory-orange-pawsey-supercomputing-centre

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx