* Monitors now have a config option ``mon_osd_warn_num_repaired``, 10 by default. If any OSD has repaired more than this many I/O errors in stored data, an ``OSD_TOO_MANY_REPAIRS`` health warning is generated. Look at `dmesg` and the underlying drive's SMART counters. You almost certainly have a drive that is failing and should be replaced. In releases prior to Nautilus, an unrecovered read error would often cause the OSD to crash, e.g. from a drive slipping a bad block.

— aad

> On Oct 9, 2020, at 4:58 PM, Tecnología CHARNE.NET <tecno@xxxxxxxxxx> wrote:
>
> Hello!
>
> Today I started the morning with a WARNING status on our Ceph cluster:
>
> # ceph health detail
> HEALTH_WARN Too many repaired reads on 1 OSDs
> [WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 1 OSDs
>     osd.67 had 399911 reads repaired
>
> I ran "ceph osd out 67" and the PGs were migrated to other OSDs.
> I stopped the osd.67 daemon, inspected the logs, etc.
> Then I started the daemon again and ran "ceph osd in 67".
> The OSD started backfilling some PGs, and no other error appeared for the rest of the day, but the warning status still remains.
>
> Can I clear it? Should I remove the OSD and start over with a new one?
>
> Thanks in advance for your time!
>
> Javier.-
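
For reference, a minimal sketch of the drive checks suggested above. The device path /dev/sdX is a placeholder for the actual device behind osd.67, and this assumes ceph-volume and smartmontools are installed on the OSD host:

    # On the OSD host: list which devices back which OSDs
    ceph-volume lvm list

    # Kernel log, filtered for the suspect device (sdX is a placeholder)
    dmesg | grep -i sdX

    # SMART health status and error counters for the underlying drive
    smartctl -a /dev/sdX

Reallocated, pending, or uncorrectable sector counts climbing in the SMART output would back up the "failing drive" diagnosis.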
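The warning threshold itself is tunable. A sketch assuming a cluster with the centralized config database (Mimic and later); the value 20 is only an example, and raising it is a stopgap while a replacement drive is pending, not a fix:

    # Show the current threshold (default 10)
    ceph config get mon mon_osd_warn_num_repaired

    # Raise it temporarily, e.g. while waiting for a replacement drive
    ceph config set mon mon_osd_warn_num_repaired 20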
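On Javier's question about clearing the warning: later Ceph releases document a per-OSD command to reset the repaired-reads counter. Whether it is available depends on the running version, so treat this as a pointer to check rather than a guarantee:

    # Reset osd.67's repaired-reads counter; confirm the command exists
    # on your release first with: ceph tell osd.67 help
    ceph tell osd.67 clear_shards_repaired

If the command is not available on the running release, the counter persists with the OSD, so replacing the failing drive and redeploying the OSD is the reliable way to make the warning go away.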