Great, thanks for the update! Just yesterday I wanted to clean up a
couple of test clusters and remove some old container images that
still seemed to be in use although several upgrades had been
completed. The culprits were quite old ceph-volume inventory
processes, dating back to the initial cluster bootstrap. But obviously
they didn't have the kind of impact you describe. Anyway, good to know
that it's not a major issue, so I can upgrade our cluster as well.
Although I'm still waiting for a PR that hasn't made it into the
latest pacific release yet, so maybe I'll wait a bit longer.
Thanks!
Eugen
Quoting Robert Sander <r.sander@xxxxxxxxxxxxxxxxxxx>:
On 8/16/23 12:10, Eugen Block wrote:
I don't really have a good idea right now, but there was a thread [1]
about ssh sessions that are not removed, maybe that could have such an
impact? And if you crank up the debug level to 30, do you see anything
else?
It was something similar. There were leftover ceph-volume processes
running on some of the OSD nodes. After killing them the cephadm
orchestrator is now able to resume the upgrade.
As we also restarted the MGR processes (with systemctl restart
CONTAINER) there were no leftover SSH sessions.
But the still-running ceph-volume processes must have held a lock
that blocked new cephadm commands.
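
For anyone who wants to check their OSD nodes for such leftovers, here
is a minimal sketch of how one might list long-running ceph-volume
processes. It is not what was used in this thread. It assumes a Linux
host whose ps supports the "etimes" column (procps-ng); the function
names and the one-hour threshold are made up for illustration.

```python
import subprocess


def parse_ps_output(output, needle="ceph-volume", min_age_s=3600):
    """Return (pid, age_seconds, command) tuples for matching processes.

    `output` is expected to look like the output of
    `ps -eo pid,etimes,args`. The needle and the age threshold are
    illustrative assumptions, not values taken from cephadm.
    """
    hits = []
    for line in output.splitlines():
        parts = line.split(None, 2)
        # Skip the header line and anything that does not parse cleanly.
        if len(parts) < 3 or not parts[0].isdigit() or not parts[1].isdigit():
            continue
        pid, age, cmd = int(parts[0]), int(parts[1]), parts[2].strip()
        if needle in cmd and age >= min_age_s:
            hits.append((pid, age, cmd))
    return hits


def find_stale_processes(needle="ceph-volume", min_age_s=3600):
    """Run ps on the local host and filter its output."""
    result = subprocess.run(
        ["ps", "-eo", "pid,etimes,args"],
        capture_output=True, text=True, check=True,
    )
    return parse_ps_output(result.stdout, needle, min_age_s)
```

Calling find_stale_processes() on each OSD host only reports
candidates. Whether a given process is really a leftover and safe to
kill still has to be judged by hand.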
Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin
https://www.heinlein-support.de
Tel: 030 / 405051-43
Fax: 030 / 405051-19
Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx