Re: Schrödinger's Server


 




----- On Feb 26, 2025, at 16:40, Tim Holloway timh@xxxxxxxxxxxxx wrote:

> Thanks. I did resolve that problem, though I haven't had a chance to
> update until now.
> 
> I had already attempted to use ceph orch to remove the daemons, but
> those attempts didn't succeed.
> 
> Fortunately, I was able to bring the host online, which allowed the
> scheduled removals to complete. I confirmed everything was drained,
> removed the host from the inventory again, and powered it down.
> 
> Still got complaints from cephadm about the decommissioned host.
> 
> I took a break - impatience and ceph don't mix - and came back to
> address the next problem, which was lots of stuck PGs. Either because
> cephadm timed out or because something kicked in when I started
> randomly rebooting OSDs, the host complaint finally disappeared. End
> of story.
> 
> Now for what sent me down that path.
> 
> I had 2 OSDs on one server and felt that was probably not a good
> idea, so I marked one for deletion. 4 days later it was still in
> "destroying" state. More concerning, all signs indicated that despite
> having been reweighted to 0, the "destroying" OSD was still an
> essential participant, and there was no indication that its PGs were
> being relocated to active servers. Shutting down the "destroying" OSD
> would immediately trigger a re-allocation panic, but that didn't clean
> anything up. The re-allocation would proceed at a furious pace, then
> slowly stall out and hang, leaving the system degraded. Restarting the
> OSD brought the PG inventory back up, but stuff still wasn't moving
> off the OSD.
> 
> Right about that time I decommissioned the questionable host.
> 
> Finally, I did a "ceph orch rm osd.x" and terminated the "destroying"
> OSD permanently, making it finally disappear from the OSD tree list.
> 
> I also deleted a number of OSD pools that are (hopefully) not going to
> be missed.
> 
> Repeatedly kicking and randomly rebooting the other OSDs finally
> cleared all the stuck PGs, some of which hadn't resolved in over 2
> days.
> 
> So at the moment, it's either rebalancing the cleaned-up OSDs or in a
> loop thinking that it is. 

Since you deleted some pools, it's probably the upmap balancer rebalancing PGs across the OSDs.
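
One quick way to check (just a sketch; it assumes the balancer mgr
module is enabled on your cluster):

    ceph balancer status   # shows the active mode (e.g. upmap) and any running plan
    ceph -s                # misplaced/backfilling PG counts should shrink over time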

> And the PG-per-OSD count seems way too high,

How high is it right now, and on what hardware?

> but the autoscaler doesn't seem to want to do anything about that.

If the PG autoscaler is enabled, you could try adjusting the per-pool settings [1] and see if the number of PGs decreases.
If it's disabled, you could manually reduce the number of PGs on the remaining pools to lower the PG/OSD ratio.
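
For example, something along these lines (pool name and numbers are
placeholders, not recommendations for your cluster):

    ceph osd pool autoscale-status                   # current vs. suggested PG count per pool
    ceph osd pool set <pool> pg_autoscale_mode on    # or 'warn' to only get a health warning
    ceph osd pool set <pool> target_size_ratio 0.2   # hint about the pool's expected share of capacity
    ceph osd pool set <pool> pg_num 64               # manual reduction if the autoscaler stays off

Lowering pg_num (PG merging) works on Pacific, but do it gradually and
let the cluster settle between changes.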

Regards,
Frédéric.

> 
> Of course, the whole shebang has been unavailable to clients this whole
> week because of that.
> 
> I've been considering upgrading to Reef, but recent posts regarding
> issues resembling what I've been going through are making me pause.
> 
>  Again, thanks!
>    Tim
> 
> On Wed, 2025-02-26 at 13:57 +0100, Frédéric Nass wrote:
>> Hi Tim,
>> 
>> If you can't bring the host back online so that cephadm can remove
>> these services itself, I guess you'll have to clean up the mess by:
>> 
>> - removing these services from the cluster (for example with a 'ceph
>> mon remove {mon-id}' for the monitor)
>> - forcing their removal from the orchestrator with the --force option
>> on the commands 'ceph orch daemon rm <names>' and 'ceph orch host rm
>> <hostname>'. If the --force option doesn't help, then looking
>> into/editing/removing ceph-config keys like 'mgr/cephadm/inventory'
>> and 'mgr/cephadm/host.ceph07.internal.mousetech.com' that 'ceph
>> config-key dump' output shows might help.
>> 
>> Regards,
>> Frédéric.
>> 
>> ----- On Feb 25, 2025, at 16:42, Tim Holloway timh@xxxxxxxxxxxxx wrote:
>> 
>> > Ack. Another fine mess.
>> > 
>> > I was trying to clean things up, and the process of tossing around
>> > OSDs kept getting me reports of slow responses and hanging PG
>> > operations.
>> > 
>> > This is Ceph Pacific, by the way.
>> > 
>> > I found a deprecated server that claimed to have an OSD even though
>> > it didn't show in either "ceph osd tree" or the dashboard OSD list.
>> > I suspect that a lot of the grief came from it attempting to use
>> > resources that weren't always seen as resources.
>> > 
>> > I shut down the server's OSD (removed the daemon using ceph orch),
>> > then foolishly deleted the server from the inventory without doing
>> > a drain first.
>> > 
>> > Now cephadm hates me (key not found), and there are still an MDS
>> > and a MON listed as daemons in "ceph orch ls" even after I powered
>> > the host off.
>> > 
>> > I cannot do a ceph orch daemon delete because there's no longer an
>> > IP address available for the delete to use, and I cannot clear the
>> > cephadm queue:
>> > 
>> > [ERR] MGR_MODULE_ERROR: Module 'cephadm' has failed:
>> > 'ceph07.internal.mousetech.com'
>> > 
>> > Any suggestions?
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



