Hmm... I ran into a similar issue. IMHO there are two ways to work around
the problem until the new disk is in place:

1. Change the backfillfull threshold (I use these commands:
   https://www.suse.com/support/kb/doc/?id=000019724).
2. Reweight the backfillfull OSDs just a little bit, so they move data to
   disks that have enough free space (i.e. `ceph osd reweight osd.60 0.9`) -
   if you have enough capacity in the cluster (577+ OSDs should be able to
   take that :) ).

(A rough command sketch for both options is appended below.)

Cheers
Boris

On Mon, 16 Jan 2023 at 15:01, Iztok Gregori <iztok.gregori@xxxxxxxxxx> wrote:

> Hi to all!
>
> We are in a situation where we have 3 PGs in
> "active+remapped+backfill_toofull". It happened when we executed a
> "gentle-reweight" to zero of one OSD (osd.77) to swap it with a new one
> (the current one registered some read errors and is to be replaced just
> in case).
>
> # ceph health detail:
>
> [WRN] PG_BACKFILL_FULL: Low space hindering backfill (add storage if
> this doesn't resolve itself): 3 pgs backfill_toofull
>     pg 10.46c is active+remapped+backfill_toofull, acting [77,607,96]
>     pg 10.8ad is active+remapped+backfill_toofull, acting [577,152,77]
>     pg 10.b15 is active+remapped+backfill_toofull, acting [483,348,77]
>
> Our cluster is a little unbalanced and we have 7 OSDs nearfull, though
> not by much (less than 88% used). I think the imbalance comes from
> having 4 nodes with 6 TB disks while the other 19 nodes have 10 TB
> disks, but that should be unrelated to the backfill_toofull (it only
> explains why the cluster is unbalanced). I'm not too worried about it:
> we will add new storage this month (if the servers arrive) and get rid
> of the old 6 TB servers.
>
> If I dump the PGs I see, if I'm not mistaken, that osd.77 will be
> "replaced" by osd.60, which is one of the nearfull OSDs (the top one,
> with 87.53% used).
>
> # ceph pg dump:
>
> 10.b15  37236  0  0  37236  0  155249620992  0  0  5265  5265
>     active+remapped+backfill_toofull  2023-01-16T14:45:46.155801+0100
>     305742'144106  305742:901513  [483,348,60]  483  [483,348,77]  483
>     305211'144056  2023-01-11T10:20:56.600135+0100
>     305211'144056  2023-01-11T10:20:56.600135+0100  0
>
> 10.8ad  37518  0  0  37518  0  156345024512  0  0  5517  5517
>     active+remapped+backfill_toofull  2023-01-16T14:45:38.510038+0100
>     305213'142117  305742:937228  [577,60,152]  577  [577,152,77]  577
>     303828'142043  2023-01-06T17:52:02.523104+0100
>     303334'141645  2022-12-20T17:39:22.668083+0100  0
>
> 10.46c  36710  0  0  36710  0  153023443456  0  0  8172  8172
>     active+remapped+backfill_toofull  2023-01-16T14:45:29.284223+0100
>     305298'141437  305741:877331  [60,607,96]  60  [77,607,96]  77
>     304802'141358  2023-01-08T21:39:23.469198+0100
>     304363'141349  2023-01-01T18:13:45.645494+0100  0
>
> # ceph osd df:
>
> 60  hdd  5.45999  1.00000  5.5 TiB  4.8 TiB  697 GiB  128 MiB  0 B
>     697 GiB  87.53  1.29  37  up
>
> In this situation, what is the correct way to address the problem?
> Should I reweight-by-utilization osd.60 to free up space (the OSD is a
> 6 TB disk, and other OSDs on the same host are nearfull)? Is there
> another way to manually map a PG to a different OSD?
>
> Thank you for your attention
>
> Iztok Gregori

--
The "UTF-8 problems" self-help group will, as an exception, meet in the big hall this time.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
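
For reference, a minimal sketch of the two workarounds above, plus one possible
answer to the "manually map a PG to a different OSD" question. It assumes a
Luminous-or-later cluster where `ceph osd set-backfillfull-ratio` and pg-upmap
are available; the ratio 0.92 and the target osd.123 are illustrative
placeholders, not recommendations:

    # show the current thresholds (defaults: nearfull 0.85, backfillfull 0.90, full 0.95)
    ceph osd dump | grep ratio

    # option 1: temporarily raise the backfillfull threshold so the stuck PGs can backfill
    ceph osd set-backfillfull-ratio 0.92

    # option 2: slightly lower the reweight of the nearfull backfill target (osd.60 here)
    # so data drains to emptier OSDs
    ceph osd reweight osd.60 0.9

    # one way to manually remap a PG away from a full OSD: pg-upmap
    # (requires "ceph osd set-require-min-compat-client luminous");
    # osd.123 stands for any OSD with enough free space
    ceph osd pg-upmap-items 10.b15 60 123

    # once the replacement disk is in place and backfill has finished, restore the default
    ceph osd set-backfillfull-ratio 0.90

The point of raising the ratio is only to unblock backfill temporarily; it
should go back to the default once the replacement disk has been filled.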