I'm in a similar but not identical situation.

I was in the middle of a rebalance on a small test cluster, with about 1% of
pgs degraded, and shut the cluster entirely down for maintenance.

On startup, many pgs are entirely unknown, and most stale. In fact most pgs
can't be queried! No mon failures. No obvious signs of OSD failure (and the
problem is too widespread for that).

Is there a specific way to force OSDs to rescan and re-advertise their pgs?
Is there a specific startup order that fixes this, i.e., start all OSDs
first and then start mons?

I'm baffled,
Jeremy

On Mon, Feb 1, 2021 at 10:43 PM Wido den Hollander <wido@xxxxxxxx> wrote:

>
>
> On 01/02/2021 22:48, Tony Liu wrote:
> > Hi,
> >
> > With 3 replicas, a pg has 3 osds. If all those 3 osds are down,
> > the pg becomes unknown. Is that right?
> >
>
> Yes. As no OSD can report the status to the MONs.
>
> > If those 3 osds are replaced and in and on, is that pg going to
> > be eventually back to active? Or anything else has to be done
> > to fix it?
> >
>
> If you can bring back the OSDs without wiping them: Yes
>
> As you mention the word 'replaced' I was wondering what you mean by
> that. If you replace the disks without data recovery the PGs will be lost.
>
> So you need to bring back the OSDs with their data intact for the PG to
> come back online.
>
> Wido
>
> >
> > Thanks!
> > Tony
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >

--
Jeremy Austin
jhaustin@xxxxxxxxx
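
To make the rule from Wido's reply concrete (a pg whose OSDs are all down shows as unknown, because no OSD is left to report its status to the MONs), here is a toy sketch in Python. It is only an illustration of that reasoning, not Ceph's actual peering or state logic. The function name `classify_pgs`, the pg-to-OSD mapping, and the simplified labels "unknown", "degraded" and "clean" are all hypothetical and chosen for the example.

```python
from typing import Dict, List, Set


def classify_pgs(pg_to_osds: Dict[str, List[int]], up_osds: Set[int]) -> Dict[str, str]:
    """Toy model: label each pg by how many of its OSDs are up and can report.

    The labels are simplified stand-ins, not real Ceph state strings.
    """
    result: Dict[str, str] = {}
    for pgid, osds in pg_to_osds.items():
        # OSDs in this pg's set that are currently up and able to report.
        reporting = [osd for osd in osds if osd in up_osds]
        if not reporting:
            # No OSD left to report this pg's status to the MONs.
            result[pgid] = "unknown"
        elif len(reporting) < len(osds):
            # At least one replica is reporting, but not all of them.
            result[pgid] = "degraded"
        else:
            result[pgid] = "clean"
    return result


if __name__ == "__main__":
    # Hypothetical 3-replica layout; OSDs 0, 1 and 2 are down.
    layout = {"1.0": [0, 1, 2], "1.1": [2, 3, 4], "1.2": [3, 4, 5]}
    print(classify_pgs(layout, up_osds={3, 4, 5}))
```

In this model a pg leaves "unknown" as soon as any one of its original OSDs comes back up with its data, which matches the point that the OSDs have to return with their data intact rather than be replaced by empty disks.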