My PGs are healthy now, but the underlying problem itself is not fixed. I was asking whether someone knew a fast fix to get the PGs complete right away.

The down OSDs were shut down a long time ago and are sitting in a different crush root. The OSD this thread is about is one OSD in an HDD pool that I'm re-organising right now, and it was only temporarily down (1 out of the 275).

I should have mentioned that I know that a long-standing bug in ceph is the reason for this partial data loss (https://tracker.ceph.com/issues/46847). I thought I had a fully functional workaround, but it turned out that I was wrong: my workaround fixes all incomplete PGs except those that are in the state "backfilling" at the time of the OSD restart. I will file a new tracker item, as this looks like a catastrophic bug.

Any cluster that is rebalancing, whether after adding disks, after increasing pg[p]_num on a pool, or after similar operations, is in danger. You will find many threads related to this problem, but the actual underlying bug has never been addressed completely. Some people actually lost data because of it; in particular, EC pools can become damaged beyond repair. From all the threads I found, this seems to be the one and only long-standing bug in ceph/rados that can cause data loss. A lot of clusters are affected; most people have simply been lucky. Reports date back to Luminous and go all the way up to Nautilus.
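For anyone who finds this thread later: the usual way to give a stuck PG a kick is to make it go through peering again. A minimal sketch, assuming the restarted OSD is up and in again (the PG and OSD IDs are the ones from the example further down, and this is an illustration only, not the workaround I mentioned above):

    ceph pg 11.52 query            # show the PG's state and its up/acting sets
    ceph pg dump_stuck undersized  # list all PGs stuck in the undersized state
    ceph pg repeer 11.52           # ask this one PG to go through peering again (newer releases only)
    ceph osd down 1                # mark osd.1 down in the osdmap; a running daemon re-asserts itself and its PGs re-peer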
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Amudhan P <amudhan83@xxxxxxxxx>
Sent: 09 November 2020 03:16:40
To: Frank Schilder
Cc: ceph-users
Subject: Re: pg xyz is stuck undersized for long time

Hi Frank,

You said that only one OSD is down, but ceph status shows more than 20 OSDs down.

Regards,
Amudhan

On Sun 8 Nov, 2020, 12:13 AM Frank Schilder, <frans@xxxxxx> wrote:

Hi all,

I moved the crush location of 8 OSDs and rebalancing went on happily (misplaced objects only). Today, osd.1 crashed, restarted and rejoined the cluster. However, it seems not to re-join some PGs it was a member of. I now have undersized PGs for no real reason, as far as I can tell:

PG_DEGRADED Degraded data redundancy: 52173/2268789087 objects degraded (0.002%), 2 pgs degraded, 7 pgs undersized
    pg 11.52 is stuck undersized for 663.929664, current state active+undersized+remapped+backfilling, last acting [237,60,2147483647,74,233,232,292,86]

The up and acting sets are (2147483647 is the placeholder CRUSH uses when no OSD is mapped to that shard):

    "up": [237, 2, 74, 289, 233, 232, 292, 86],
    "acting": [237, 60, 2147483647, 74, 233, 232, 292, 86],

How can I get the PG to complete peering and osd.1 to join? I have an unreasonable number of degraded objects where the missing part is on this OSD.

For completeness, here is the cluster status:

# ceph status
  cluster:
    id:     ...
    health: HEALTH_ERR
            noout,norebalance flag(s) set
            1 large omap objects
            35815902/2268938858 objects misplaced (1.579%)
            Degraded data redundancy: 46122/2268938858 objects degraded (0.002%), 2 pgs degraded, 7 pgs undersized
            Degraded data redundancy (low space): 28 pgs backfill_toofull

  services:
    mon: 3 daemons, quorum ceph-01,ceph-02,ceph-03
    mgr: ceph-01(active), standbys: ceph-03, ceph-02
    mds: con-fs2-1/1/1 up {0=ceph-08=up:active}, 1 up:standby-replay
    osd: 299 osds: 275 up, 275 in; 301 remapped pgs
         flags noout,norebalance

  data:
    pools:   11 pools, 3215 pgs
    objects: 268.8 M objects, 675 TiB
    usage:   854 TiB used, 1.1 PiB / 1.9 PiB avail
    pgs:     46122/2268938858 objects degraded (0.002%)
             35815902/2268938858 objects misplaced (1.579%)
             2907 active+clean
             219  active+remapped+backfill_wait
             47   active+remapped+backfilling
             28   active+remapped+backfill_wait+backfill_toofull
             6    active+clean+scrubbing+deep
             5    active+undersized+remapped+backfilling
             2    active+undersized+degraded+remapped+backfilling
             1    active+clean+scrubbing

  io:
    client:   13 MiB/s rd, 196 MiB/s wr, 2.82 kop/s rd, 1.81 kop/s wr
    recovery: 57 MiB/s, 14 objects/s

Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx