----- On 27 Feb 2025, at 8:28, Frédéric Nass frederic.nass@xxxxxxxxxxxxxxxx wrote:

> ----- On 26 Feb 2025, at 16:40, Tim Holloway timh@xxxxxxxxxxxxx wrote:
>
>> Thanks. I did resolve that problem, though I haven't had a chance to
>> update until now.
>>
>> I had already attempted to use ceph orch to remove the daemons, but
>> they didn't succeed.
>>
>> Fortunately, I was able to bring the host online, which allowed the
>> scheduled removals to complete. I confirmed everything was drained,
>> removed the host from inventory again, and powered it down.
>>
>> Still got complaints from cephadm about the decommissioned host.
>>
>> I took a break - impatience and ceph don't mix - and came back to
>> address the next problem, which was lots of stuck PGs. Either because
>> cephadm timed out or because something kicked in when I started randomly
>> rebooting OSDs, the host complaint finally disappeared. End of story.
>>
>> Now for what sent me down that path.
>>
>> I had 2 OSDs on one server and felt that that was probably not a good
>> idea, so I marked one for deletion. 4 days later it was still in
>> "destroying" state. More concerning, all signs indicated that despite
>> having been reweighted to 0, the "destroying" OSD was still an
>> essential participant, with no indication that its PGs were being
>> relocated to active servers. Shutting down the "destroying" OSD would
>> immediately trigger a re-allocation panic, but that didn't clean
>> anything. The re-allocation would proceed at a furious pace, then
>> slowly stall out and hang, and the system was degraded. Restarting the
>> OSD brought the PG inventory back up, but stuff still wasn't moving off
>> the OSD.
>>
>> Right about that time I decommissioned the questionable host.
>>
>> Finally, I did a "ceph orch rm osd.x" and terminated the "destroying"
>> OSD permanently, making it finally disappear from the OSD tree list.
>>
>> I also deleted a number of OSD pools that are (hopefully) not going to
>> be missed.
>>
>> Kicking and repeatedly rebooting the other OSDs at random finally
>> cleared all the stuck PGs, some of which hadn't resolved in over 2
>> days.
>>
>> So at the moment, it's either rebalancing the cleaned-up OSDs or in a
>> loop thinking that it is.
>
> Since you deleted some pools, it's probably the upmap balancer rebalancing PGs
> across the OSDs.
>
>> And the PG-per-OSD count seems way too high,
>
> How high is it right now? On what hardware?
>
>> but the autoscaler doesn't seem to want to do anything about that.
>
> If the PG autoscaler is enabled, you could try adjusting the per-pool settings [1]
> and see if the number of PGs decreases.
> If it's disabled, you could manually reduce the number of PGs on the remaining
> pools to lower the PG/OSD ratio.
>
> Regards,
> Frédéric.

[1] https://docs.ceph.com/en/latest/rados/operations/placement-groups/

Regards,
Frédéric.

>
>> Of course, the whole shebang has been unavailable to clients this whole
>> week because of that.
>>
>> I've been considering upgrading to Reef, but recent posts regarding
>> issues resembling what I've been going through are making me pause.
>>
>> Again, thanks!
>> Tim
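To put numbers on the PG/OSD ratio and the autoscaler behaviour discussed above, the commands would look something like this (the pool name 'mypool' and the pg_num value are placeholders only; adjust them to your own pools and hardware):

  ceph osd df tree                                  # PG count and usage per OSD
  ceph balancer status                              # is the upmap balancer still moving PGs?
  ceph osd pool autoscale-status                    # what the autoscaler thinks each pool should have
  ceph osd pool set mypool pg_autoscale_mode on     # let the autoscaler resize this pool
  ceph osd pool set mypool pg_num 64                # or set pg_num by hand if the autoscaler is off

A commonly cited rule of thumb is roughly 100 PGs per OSD, and pg_num reductions are applied gradually through PG merging, so the count will take a while to converge.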
>>
>> On Wed, 2025-02-26 at 13:57 +0100, Frédéric Nass wrote:
>>> Hi Tim,
>>>
>>> If you can't bring the host back online so that cephadm can remove
>>> these services itself, I guess you'll have to clean up the mess by:
>>>
>>> - removing these services from the cluster (for example with a 'ceph
>>> mon remove {mon-id}' for the monitor)
>>> - forcing their removal from the orchestrator with the --force option
>>> on the commands 'ceph orch daemon rm <names>' and 'ceph orch host rm
>>> <hostname>'. If the --force option doesn't help, then looking
>>> into/editing/removing config-keys like 'mgr/cephadm/inventory'
>>> and 'mgr/cephadm/host.ceph07.internal.mousetech.com' that the 'ceph
>>> config-key dump' output shows might help.
>>>
>>> Regards,
>>> Frédéric.
>>>
>>> ----- On 25 Feb 2025, at 16:42, Tim Holloway timh@xxxxxxxxxxxxx wrote:
>>>
>>> > Ack. Another fine mess.
>>> >
>>> > I was trying to clean things up, and the process of tossing around
>>> > OSDs kept getting me reports of slow responses and hanging PG
>>> > operations.
>>> >
>>> > This is Ceph Pacific, by the way.
>>> >
>>> > I found a deprecated server that claimed to have an OSD even though
>>> > it didn't show in either "ceph osd tree" or the dashboard OSD list.
>>> > I suspect that a lot of the grief came from it attempting to use
>>> > resources that weren't always seen as resources.
>>> >
>>> > I shut down the server's OSD (removed the daemon using ceph orch),
>>> > then foolishly deleted the server from the inventory without doing
>>> > a drain first.
>>> >
>>> > Now cephadm hates me (key not found), and there are still an MDS
>>> > and a MON listed by 'ceph orch ls' even after I powered the host
>>> > off.
>>> >
>>> > I cannot do a ceph orch daemon delete because there's no longer an
>>> > IP address available for the daemon delete, and I cannot clear the
>>> > cephadm queue:
>>> >
>>> > [ERR] MGR_MODULE_ERROR: Module 'cephadm' has failed:
>>> > 'ceph07.internal.mousetech.com'
>>> >
>>> > Any suggestions?

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
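P.S. Spelled out as commands, the cleanup suggested earlier in the thread would look roughly like this (the mon id, hostname and config-key names are taken from this thread and are only illustrative; treat it as an untested sketch and check the 'ceph config-key dump' output carefully before removing any keys):

  ceph mon remove ceph07                                              # drop the dead host's monitor from the monmap
  ceph orch ps ceph07.internal.mousetech.com                          # list the daemon names still recorded on the dead host
  ceph orch daemon rm <daemon-name> --force                           # force-remove each leftover daemon
  ceph orch host rm ceph07.internal.mousetech.com --force             # force-remove the host itself
  ceph config-key dump | grep ceph07                                  # look for stale cephadm keys if the above fails
  ceph config-key rm mgr/cephadm/host.ceph07.internal.mousetech.com   # remove the stale host entry
  ceph mgr fail                                                       # fail over the active mgr so cephadm re-reads its state

The mgr failover at the end is there because the cephadm module keeps its host and daemon inventory cached in the active mgr.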