On 1/25/19 8:33 AM, Gregory Farnum wrote:
> This doesn't look familiar to me. Is the cluster still doing recovery so
> we can at least expect them to make progress when the "out" OSDs get
> removed from the set?

The recovery has already finished. It resolves itself eventually, but in the
meantime I saw many PGs in the backfill_toofull state for a long time. This
is new since Mimic.

Wido

> On Tue, Jan 22, 2019 at 2:44 PM Wido den Hollander <wido@xxxxxxxx> wrote:
>
> Hi,
>
> I've got a couple of PGs which are stuck in backfill_toofull, but none
> of them are actually full.
>
>     "up": [
>         999,
>         1900,
>         145
>     ],
>     "acting": [
>         701,
>         1146,
>         1880
>     ],
>     "backfill_targets": [
>         "145",
>         "999",
>         "1900"
>     ],
>     "acting_recovery_backfill": [
>         "145",
>         "701",
>         "999",
>         "1146",
>         "1880",
>         "1900"
>     ],
>
> I checked all these OSDs, but they are all below 75% utilization.
>
>     full_ratio 0.95
>     backfillfull_ratio 0.9
>     nearfull_ratio 0.9
>
> So I started checking all the PGs and I noticed that each of these
> PGs has one OSD in the 'acting_recovery_backfill' set which is marked as
> out.
>
> In this case osd.1880 is marked as out and thus its capacity is shown
> as zero.
>
>     [ceph@ceph-mgr ~]$ ceph osd df | grep 1880
>     1880   hdd 4.54599        0     0 B     0 B     0 B     0    0   27
>     [ceph@ceph-mgr ~]$
>
> This is on a Mimic 13.2.4 cluster. Is this expected or is this an unknown
> side-effect of one of the OSDs being marked as out?
>
> Thanks,
>
> Wido

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
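
For reference, a quick way to spot PGs affected like this is to cross-check
'ceph pg dump' against the OSDs that 'ceph osd dump' reports as out (an out
OSD shows 0 B capacity in 'ceph osd df', as with osd.1880 above). Below is a
minimal Python sketch, not something from the thread itself; the JSON field
names ("in", "pg_stats", "state", "up", "acting") are assumed to match what
Mimic emits and may differ slightly between releases:

#!/usr/bin/env python3
# Sketch: list PGs stuck in backfill_toofull that involve an OSD marked out.
# Field names are assumptions based on Mimic-era JSON output; verify against
# your own release before relying on this.
import json
import subprocess

def ceph_json(*args):
    # Run a ceph CLI subcommand and parse its JSON output.
    out = subprocess.check_output(("ceph",) + args + ("--format=json",))
    return json.loads(out)

# An OSD with "in" == 0 in the osd dump is marked out; its capacity shows
# as 0 B in 'ceph osd df'.
osd_dump = ceph_json("osd", "dump")
out_osds = {o["osd"] for o in osd_dump["osds"] if o.get("in", 1) == 0}

# Walk all PGs and flag the ones in backfill_toofull that involve an out OSD.
pg_dump = ceph_json("pg", "dump")
pg_stats = pg_dump.get("pg_map", pg_dump)["pg_stats"]  # nesting differs by release
for pg in pg_stats:
    if "backfill_toofull" not in pg["state"]:
        continue
    involved = set(pg["up"]) | set(pg["acting"])
    stuck_on = sorted(involved & out_osds)
    if stuck_on:
        print("%s: backfill_toofull, involves out OSD(s) %s" % (pg["pgid"], stuck_on))

Run from a node with an admin keyring; it only reads cluster state and
changes nothing.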