Hi Frank,

I don't have an operational workaround; the patch
https://github.com/ceph/ceph/pull/46911/commits/f43f596aac97200a70db7a70a230eb9343018159
is simple and can be applied cleanly.

Yes, restarting the OSD will clear the pool entries. You can restart it when the
bluestore_onode item count is very low (e.g. fewer than 10) if that really helps,
but you will need to tune and monitor performance until you find a threshold that
suits your cluster (a rough sketch of such a monitoring loop is appended at the
end of this message). It won't help with the crash, though, since the crash
itself is essentially just a restart anyway.

Regards,
Dongdong

On Tue, Jan 10, 2023 at 8:21 PM Serkan Çoban <cobanserkan@xxxxxxxxx> wrote:
> Is slot 19 inside the chassis? Did you check the chassis temperature? I
> sometimes see a higher failure rate for the HDDs inside the chassis than
> for the ones at the front of the chassis. In our case it was related to
> the temperature difference.
>
> On Tue, Jan 10, 2023 at 1:28 PM Frank Schilder <frans@xxxxxx> wrote:
> >
> > Following up on my previous post: we have identical OSD hosts. The very
> > strange observation now is that all outlier OSDs are in exactly the same
> > disk slot on these hosts. We have 5 problematic OSDs, and they are all in
> > slot 19 on 5 different hosts. This is an extremely strange and unlikely
> > coincidence.
> >
> > Are there any specific conditions for this problem to be present or
> > amplified that could be related to hardware?
> >
> > Best regards,
> > =================
> > Frank Schilder
> > AIT Risø Campus
> > Bygning 109, rum S14
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
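
For reference, a minimal sketch (an illustration under stated assumptions, not a
tested tool) of the "restart when bluestore_onode items are low" approach
described above. It reads the onode item count from the OSD's local admin socket
with 'ceph daemon osd.<id> dump_mempools' and restarts the daemon via systemd.
The OSD id, threshold, poll interval, and the ceph-osd@<id> unit name are all
placeholders, and a package-based (non-cephadm) deployment is assumed:

#!/usr/bin/env python3
# Rough sketch only -- run on the OSD host itself (the admin socket is local).
# Polls one OSD's bluestore_onode mempool item count and restarts the daemon
# once the count drops below a threshold.
import json
import subprocess
import time

OSD_ID = 19        # placeholder OSD id -- pick the OSD you want to bounce
THRESHOLD = 10     # "very low" onode item count; tune for your cluster
POLL_SECONDS = 60  # how often to check

def onode_items(osd_id):
    # 'ceph daemon osd.<id> dump_mempools' prints mempool stats as JSON;
    # some releases wrap the output in a top-level "mempool" key.
    out = subprocess.check_output(
        ["ceph", "daemon", f"osd.{osd_id}", "dump_mempools"])
    data = json.loads(out)
    pools = data.get("mempool", data)["by_pool"]
    return pools["bluestore_onode"]["items"]

while True:
    if onode_items(OSD_ID) < THRESHOLD:
        # systemd unit name assumed for a non-cephadm deployment
        subprocess.run(["systemctl", "restart", f"ceph-osd@{OSD_ID}"],
                       check=True)
        break
    time.sleep(POLL_SECONDS)

Only restart one OSD at a time, and watch recovery and client performance while
tuning the threshold, as Dongdong suggests above.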