This sounds like you have widespread inconsistencies that are surfaced by scrubs, not caused by them. Frequent causes:

* A RAID HBA with firmware bugs (all of them, in my experience), broken preserved-cache replay, a forced writeback cache without a BBU, etc.
* A power outage that lasted longer than the BBUs could bridge
* Volatile write cache enabled on the HDDs
* Client-grade SSDs without PLP (power-loss protection)

Since the VMs are still off, you can narrow this down per PG; a small diagnostic sketch follows after the quoted thread below.

> On Mar 11, 2025, at 6:57 AM, Martin Konold <martin.konold@xxxxxxxxxx> wrote:
>
> Hi,
>
> I suspect a hardware issue. Please check the networks.
>
> Regards
> --martin
>
> On 11.03.2025 at 11:24, Marianne Spiller <marianne@xxxxxxxxxx> wrote:
> Dear list,
>
> I'm currently maintaining several Ceph (prod) installations. One of them consists of 3 MON hosts and 6 OSD hosts with 40 OSDs in total. There are also 5 separate Proxmox hosts - they only run the VMs and use the storage provided by Ceph, but they are not part of Ceph.
>
> The worst case happened: due to an outage, all of these hosts crashed at nearly the same time.
>
> Last week, I began to restart (only the Ceph hosts; the Proxmox servers are still down). Ceph was very unhappy with the situation as a whole - one OSD host (and its 6 OSDs) is completely gone, there are some hardware issues (33 OSDs left, networking, PSU, I'm working on it), and 73 out of 129 PGs were inconsistent.
>
> Meanwhile, the overall status of the cluster is "HEALTHY" again.
> But nearly every day, one or two PGs get damaged - never on the same OSDs. And there is no traffic on the storage, as the virtualization hosts are not running. I see no further reason in the logs: everything is fine, a scrub starts and leaves one or more PGs damaged. Repairing them succeeds, but maybe the next night another PG gets stuck.
>
> Do you have hints to investigate this any further? I would love to understand more before starting the Proxmox cluster again. We are using Ceph 18.2.4 (Proxmox packages).
>
> Thanks a lot,
> Marianne
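Here is a minimal sketch (my addition, not from the thread) of how I would track which OSD shards actually carry the scrub errors. It only wraps the stock `rados list-inconsistent-pg` / `rados list-inconsistent-obj` commands and assumes the ceph/rados CLIs plus an admin keyring are available on the node where it runs; the pool name "rbd" is a placeholder you would replace with your own pools.

#!/usr/bin/env python3
"""Sketch: list inconsistent PGs per pool and show which OSD shards have errors.

Assumptions: rados CLI on PATH, readable admin keyring, pool names adjusted below.
"""
import json
import subprocess


def run_json(cmd):
    """Run a CLI command and parse its JSON output."""
    return json.loads(subprocess.check_output(cmd))


def inconsistent_pgs(pool):
    """PG IDs in `pool` that the last scrub flagged as inconsistent."""
    return run_json(["rados", "list-inconsistent-pg", pool])


def inconsistent_objects(pgid):
    """Per-object scrub error report for one PG (needs a completed scrub)."""
    return run_json(["rados", "list-inconsistent-obj", pgid, "--format=json"])


if __name__ == "__main__":
    for pool in ["rbd"]:  # placeholder pool list - adjust for your cluster
        for pgid in inconsistent_pgs(pool):
            report = inconsistent_objects(pgid)
            for obj in report.get("inconsistents", []):
                name = obj["object"]["name"]
                per_shard = [(s.get("osd"), s.get("errors")) for s in obj.get("shards", [])]
                print(f"{pgid} {name}: errors={obj.get('errors')} shards={per_shard}")

If the same OSD IDs keep showing up in the shard errors across different PGs, `ceph osd metadata <id>` will tell you which host, controller, and physical devices they sit on - that is where I would look for the volatile-cache, BBU, or HBA issues listed at the top of this message (e.g. check the drives' write cache with `hdparm -W /dev/sdX`).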