Re: Ceph OSD purge doesn't work while rebalancing

I agree that it would be better if it were less sensitive to unrelated
backfill. I've noticed this recently too, especially when purging
multiple OSDs (like a whole host): the first one succeeds, but the next
one fails even though I have norebalance set and the OSD was already
out.
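
Roughly the sequence I mean, with made-up OSD ids:

   # ceph osd set norebalance
   # ceph osd out 101 102
   (Stop both OSD daemons.)
   # ceph osd purge 101
   (The purge removes 101 from the CRUSH map, so some PGs remap and,
   with norebalance set, sit in backfill_wait.)
   # ceph osd purge 102
   (This one fails with the same EAGAIN error quoted below.)
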
I guess if my process were to remove the OSD from the CRUSH map and
let the cluster rebalance (as compared to just setting it out), then
there would be no rebalancing going on when it was purged.
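
Something like this, I think (untested sketch, reusing the example OSD
id from the quote below):

   # ceph osd crush remove osd.559
   (Wait for the cluster to rebalance.)
   # ceph osd out 559
   # ceph osd safe-to-destroy 559
   (Stop the OSD daemon.)
   # ceph osd purge 559
   (559 is already gone from the CRUSH map at this point, so the purge
   shouldn't trigger any new remapping.)
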
This example doesn't cover your case where there's pre-existing backfill, though.

Rich

On Tue, 26 Apr 2022 at 20:09, Benoît Knecht <bknecht@xxxxxxxxxxxxx> wrote:
>
> Hi Stefan,
>
> On Fri, Apr 22, 2022 at 11:13:36AM +0200, Stefan Kooman wrote:
> > On 4/22/22 09:25, Benoît Knecht wrote:
> > > We use the following procedure to remove an OSD from a Ceph cluster (to replace
> > > a defective disk for instance):
> > >
> > >    # ceph osd crush reweight 559 0
> > >    (Wait for the cluster to rebalance.)
> > >    # ceph osd out 559
> > >    # ceph osd ok-to-stop 559
> > >    # ceph osd safe-to-destroy 559
> > >    (Stop the OSD daemon.)
> > >    # ceph osd purge 559
> > >
> > > This works great when there's no rebalancing happening on the cluster, but if
> > > there is, the last step (ceph osd purge 559) fails with
> > >
> > >    # ceph osd purge 559
> > >    Error EAGAIN: OSD(s) 559 have no reported stats, and not all PGs are active+clean; we cannot draw any conclusions.
> > >    You can proceed by passing --force, but be warned that this will likely mean real, permanent data loss.
> > >
> > > But none of the PGs are degraded, so it isn't clear to me why Ceph thinks this
> > > is a risky operation. The only PGs that are not active+clean are
> > > active+remapped+backfill_wait or active+remapped+backfilling.
> > >
> > > Is the ceph osd purge command overly cautious, or am I overlooking an edge-case
> > > that could lead to data loss? I know I could use --force, but I don't want to
> > > override these safety checks if they're legitimate.
> >
> > To me this looks like Ceph being overly cautious. It appears to only
> > accept PGs in the active+clean state. When you have not set "norebalance",
> > "norecover", or "nobackfill", an out OSD should not have PGs mapped to it.
> >
> > Instead of purge you can do "ceph osd rm $id", "ceph auth rm osd.$id"
> > and "ceph osd crush rm osd.$id" ... but that's probably the same as
> > using "--force" with the purge command.
>
> Thanks for your feedback! I think I'll try to submit a patch for `ceph osd
> safe-to-destroy` to be a bit more permissive about acceptable PG states, as it
> would be quite convenient to be able to purge OSDs even if the cluster is
> rebalancing.
>
> Cheers,
>
> --
> Ben
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



