Re: cephadm orchestrator does not restart daemons [was: ceph orch upgrade stuck between 16.2.7 and 16.2.13]

I've seen this before: a hanging ceph-volume process can cause the whole
cephadm serve loop to get stuck. (We have a patch that makes it time out
properly in reef and are backporting it to quincy, but unfortunately there
is nothing for pacific.) That's why I was asking about the REFRESHED column
in the "ceph orch ps" / "ceph orch device ls" output. Typically, when this
happens, the REFRESHED column shows that nothing has been refreshed since
the ceph-volume process started hanging. Either way, if you killed those
ceph-volume processes, any new ones aren't hanging, and the serve loop is
running okay, I'd expect the issues to clear up. The hang could have (and
most likely did) caused both problems: the daemon restarts not happening
and the upgrade not progressing.
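
If it helps to spot leftover ceph-volume processes on the OSD hosts, here is
a minimal sketch, not anything cephadm ships. It assumes a Linux host whose
"ps" supports the "etimes" (elapsed seconds) output field, and the one-hour
threshold and the function names are my own illustrative choices. It only
lists candidates; deciding whether to kill them is left to you.

```python
import subprocess


def find_stale_ceph_volume(ps_output, min_age_seconds=3600):
    """Return (pid, age_seconds, command) for long-running ceph-volume processes.

    ps_output is the text printed by `ps -eo pid,etimes,args`.
    min_age_seconds is an arbitrary threshold (assumption: one hour).
    """
    stale = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)  # pid, elapsed seconds, full command line
        if len(parts) < 3:
            continue
        pid, etimes, args = parts
        if not (pid.isdigit() and etimes.isdigit()):
            continue  # skips the header row and anything malformed
        if "ceph-volume" in args and int(etimes) >= min_age_seconds:
            stale.append((int(pid), int(etimes), args))
    return stale


def list_processes():
    """Run ps on the local host and return its output as text."""
    result = subprocess.run(
        ["ps", "-eo", "pid,etimes,args"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    for pid, age, command in find_stale_ceph_volume(list_processes()):
        print(f"pid={pid} running for {age}s: {command}")
```

Run it on each OSD node. Anything it prints is a ceph-volume process that
has been alive for longer than the threshold and is worth a closer look.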

On Wed, Aug 16, 2023 at 8:50 AM Robert Sander <r.sander@xxxxxxxxxxxxxxxxxxx>
wrote:

> On 8/16/23 12:10, Eugen Block wrote:
> > I don't really have a good idea right now, but there was a thread [1]
> > about ssh sessions that are not removed, maybe that could have such an
> > impact? And if you crank up the debug level to 30, do you see anything
> > else?
>
> It was something similar. There were leftover ceph-volume processes
> running on some of the OSD nodes. After killing them the cephadm
> orchestrator is now able to resume the upgrade.
>
> As we also restarted the MGR processes (with systemctl restart
> CONTAINER) there were no leftover SSH sessions.
>
> But the still running ceph-volume processes must have used a lock that
> blocked new cephadm commands.
>
> Regards
> --
> Robert Sander
> Heinlein Consulting GmbH
> Schwedter Str. 8/9b, 10119 Berlin
>
> https://www.heinlein-support.de
>
> Tel: 030 / 405051-43
> Fax: 030 / 405051-19
>
> Amtsgericht Berlin-Charlottenburg - HRB 220009 B
> Geschäftsführer: Peer Heinlein - Sitz: Berlin
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>