You removed OSDs from 3 different hosts? I'm surprised the cluster was healthy, as purging a stopped OSD is only going to clean it out of the crushmap. Is there any recovery in progress?

- 
Etienne Menguy
etienne.menguy@xxxxxxxx

> On 17 Jan 2022, at 15:41, Rafael Diaz Maurin <Rafael.DiazMaurin@xxxxxxxxxxxxxxx> wrote:
> 
> Hello,
> 
> On 17/01/2022 at 15:31, Etienne Menguy wrote:
>> Was your cluster healthy before purging the OSDs?
>> How much time did you wait between stopping the OSDs and purging them?
> 
> Yes, my cluster was healthy.
> 
> I waited a few minutes.
> 
> Rafael
> 
>> 
>> Étienne
>> 
>>> On 17 Jan 2022, at 15:24, Rafael Diaz Maurin <Rafael.DiazMaurin@xxxxxxxxxxxxxxx> wrote:
>>> 
>>> Hello,
>>> 
>>> All the pools on my cluster are replicated (x3).
>>> 
>>> I purged some OSDs (after stopping them) and removed the disks from the servers, and now I have 4 PGs in stale+undersized+degraded+peered:
>>> 
>>> Reduced data availability: 4 pgs inactive, 4 pgs stale
>>> 
>>> pg 1.561 is stuck stale for 39m, current state stale+undersized+degraded+peered, last acting [64]
>>> pg 1.af2 is stuck stale for 39m, current state stale+undersized+degraded+peered, last acting [63]
>>> pg 3.3 is stuck stale for 39m, current state stale+undersized+degraded+peered, last acting [48]
>>> pg 9.5ca is stuck stale for 38m, current state stale+undersized+degraded+peered, last acting [49]
>>> 
>>> Those 4 OSDs have been purged, so they are no longer in the crushmap.
>>> 
>>> I tried a pg repair:
>>> ceph pg repair 1.561
>>> ceph pg repair 1.af2
>>> ceph pg repair 3.3
>>> ceph pg repair 9.5ca
>>> 
>>> The PGs are remapped, but none of the degraded objects have been repaired:
>>> 
>>> ceph pg map 9.5ca
>>> osdmap e355782 pg 9.5ca (9.5ca) -> up [54,75,82] acting [54,75,82]
>>> ceph pg map 3.3
>>> osdmap e355782 pg 3.3 (3.3) -> up [179,180,107] acting [179,180,107]
>>> ceph pg map 1.561
>>> osdmap e355785 pg 1.561 (1.561) -> up [70,188,87] acting [70,188,87]
>>> ceph pg map 1.af2
>>> osdmap e355789 pg 1.af2 (1.af2) -> up [189,74,184] acting [189,74,184]
>>> 
>>> How can I repair these 4 PGs?
>>> 
>>> This affects the cephfs-metadata pool, and the filesystem is degraded because the rank 0 MDS is stuck in the rejoin state.
>>> 
>>> Thank you.
>>> 
>>> Rafael
>>> 
>>> 
>>> -- 
>>> Rafael Diaz Maurin
>>> DSI de l'Université de Rennes 1
>>> Pôle Infrastructures, équipe Systèmes
>>> 02 23 23 71 57
>>> 
>>> 
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
> 
> 
> -- 
> Rafael Diaz Maurin
> DSI de l'Université de Rennes 1
> Pôle Infrastructures, équipe Systèmes
> 02 23 23 71 57

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
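
For reference, a minimal sketch of the kind of checks Etienne is alluding to, using standard ceph CLI commands (the PG and OSD ids are simply the ones quoted in the report above, used as examples; safe-to-destroy assumes Luminous or later):

    ceph -s                                # overall health, plus any recovery/backfill activity and degraded object counts
    ceph health detail                     # lists the stale/undersized PGs by id
    ceph pg dump_stuck stale               # shows only the PGs stuck in the stale state
    ceph pg 1.561 query                    # peering details for one PG (may not answer while the PG has no live OSD)
    ceph osd safe-to-destroy 48 49 63 64   # before purging: confirms no PG still depends on these OSDs

Running safe-to-destroy (or waiting for recovery to finish after marking OSDs out) before purging them is what avoids ending up with PGs whose only copies lived on the removed disks.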