Hi Torkil,

It's possible that you are hitting the balancer issue on 19.2.0 that affects clusters with larger PG numbers:

https://tracker.ceph.com/issues/68657

Try turning the balancer off with:

ceph balancer off

Best,
Laimis J.

> On 17 Dec 2024, at 13:15, Torkil Svensgaard <torkil@xxxxxxxx> wrote:
>
> On 17/12/2024 12:05, Torkil Svensgaard wrote:
>> Hi
>> Running upgrade from 18.2.4 to 19.2.0 and it managed to upgrade the managers but made no further progress.
>
> Now it actually seems to have upgraded 1 MON, then the orchestrator crashed again:
>
> "
> {
>     "mon": {
>         "ceph version 18.2.4 (e7ad5345525c7aa95470c26863873b581076945d) reef (stable)": 4,
>         "ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) squid (stable)": 1
>     },
>     "mgr": {
>         "ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) squid (stable)": 3
>     },
>     "osd": {
>         "ceph version 18.2.4 (e7ad5345525c7aa95470c26863873b581076945d) reef (stable)": 548
>     },
>     "mds": {
>         "ceph version 18.2.4 (e7ad5345525c7aa95470c26863873b581076945d) reef (stable)": 3
>     },
>     "overall": {
>         "ceph version 18.2.4 (e7ad5345525c7aa95470c26863873b581076945d) reef (stable)": 555,
>         "ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) squid (stable)": 4
>     }
> }
> "
>
> Best regards,
>
> Torkil
>
>> If I fail over the mgr it goes:
>> "
>> [root@ceph-flash1 ~]# ceph orch upgrade status
>> Error ENOTSUP: Module 'orchestrator' is not enabled/loaded (required by command 'orch upgrade status'): use `ceph mgr module enable orchestrator` to enable it
>> "
>> From mgr log:
>> "
>> ...
>> 2024-12-17T10:43:11.729+0000 7f70efafe640 0 log_channel(audit) log [DBG] : from='client.2110010386 -' entity='client.admin' cmd=[{"prefix": "orch upgrade status", "target": ["mon-mgr", ""]}]: dispatch
>> 2024-12-17T10:43:11.733+0000 7f70ebaf6640 0 [cephadm INFO cherrypy.error] [17/Dec/2024:10:43:11] ENGINE Bus STARTING
>> 2024-12-17T10:43:11.733+0000 7f70ebaf6640 0 log_channel(cephadm) log [INF] : [17/Dec/2024:10:43:11] ENGINE Bus STARTING
>> 2024-12-17T10:43:11.811+0000 7f70e7aee640 0 [dashboard INFO dashboard.module] Engine started.
>> 2024-12-17T10:43:11.861+0000 7f70ebaf6640 0 [cephadm INFO cherrypy.error] [17/Dec/2024:10:43:11] ENGINE Serving on https://172.21.15.148:7150
>> 2024-12-17T10:43:11.861+0000 7f70ebaf6640 0 log_channel(cephadm) log [INF] : [17/Dec/2024:10:43:11] ENGINE Serving on https://172.21.15.148:7150
>> 2024-12-17T10:43:11.864+0000 7f70a2d7a640 0 [cephadm ERROR cherrypy.error] [17/Dec/2024:10:43:11] ENGINE Error in HTTPServer.serve
>> Traceback (most recent call last):
>>   File "/lib/python3.9/site-packages/cheroot/server.py", line 1823, in serve
>>     self._connections.run(self.expiration_interval)
>>   File "/lib/python3.9/site-packages/cheroot/connections.py", line 203, in run
>>     self._run(expiration_interval)
>>   File "/lib/python3.9/site-packages/cheroot/connections.py", line 246, in _run
>>     new_conn = self._from_server_socket(self.server.socket)
>>   File "/lib/python3.9/site-packages/cheroot/connections.py", line 300, in _from_server_socket
>>     s, ssl_env = self.server.ssl_adapter.wrap(s)
>>   File "/lib/python3.9/site-packages/cheroot/ssl/builtin.py", line 277, in wrap
>>     s = self.context.wrap_socket(
>>   File "/lib64/python3.9/ssl.py", line 501, in wrap_socket
>>     return self.sslsocket_class._create(
>>   File "/lib64/python3.9/ssl.py", line 1074, in _create
>>     self.do_handshake()
>>   File "/lib64/python3.9/ssl.py", line 1343, in do_handshake
>>     self._sslobj.do_handshake()
>> ssl.SSLZeroReturnError: TLS/SSL connection has been closed (EOF) (_ssl.c:1133)
>> 2024-12-17T10:43:11.865+0000 7f70a2d7a640 -1 log_channel(cephadm) log [ERR] : [17/Dec/2024:10:43:11] ENGINE Error in HTTPServer.serve
>> Traceback (most recent call last):
>>   File "/lib/python3.9/site-packages/cheroot/server.py", line 1823, in serve
>>     self._connections.run(self.expiration_interval)
>>   File "/lib/python3.9/site-packages/cheroot/connections.py", line 203, in run
>>     self._run(expiration_interval)
>>   File "/lib/python3.9/site-packages/cheroot/connections.py", line 246, in _run
>>     new_conn = self._from_server_socket(self.server.socket)
>>   File "/lib/python3.9/site-packages/cheroot/connections.py", line 300, in _from_server_socket
>>     s, ssl_env = self.server.ssl_adapter.wrap(s)
>>   File "/lib/python3.9/site-packages/cheroot/ssl/builtin.py", line 277, in wrap
>>     s = self.context.wrap_socket(
>>   File "/lib64/python3.9/ssl.py", line 501, in wrap_socket
>>     return self.sslsocket_class._create(
>>   File "/lib64/python3.9/ssl.py", line 1074, in _create
>>     self.do_handshake()
>>   File "/lib64/python3.9/ssl.py", line 1343, in do_handshake
>>     self._sslobj.do_handshake()
>> ssl.SSLZeroReturnError: TLS/SSL connection has been closed (EOF) (_ssl.c:1133)
>> 2024-12-17T10:43:11.963+0000 7f70ebaf6640 0 [cephadm INFO cherrypy.error] [17/Dec/2024:10:43:11] ENGINE Serving on http://172.21.15.148:8765
>> 2024-12-17T10:43:11.963+0000 7f70ebaf6640 0 log_channel(cephadm) log [INF] : [17/Dec/2024:10:43:11] ENGINE Serving on http://172.21.15.148:8765
>> 2024-12-17T10:43:11.963+0000 7f70ebaf6640 0 [cephadm INFO cherrypy.error] [17/Dec/2024:10:43:11] ENGINE Bus STARTED
>> 2024-12-17T10:43:11.964+0000 7f70ebaf6640 0 log_channel(cephadm) log [INF] : [17/Dec/2024:10:43:11] ENGINE Bus STARTED
>> ...
>> "
>> It will recover after some timeout, maybe 5-10 mins, and then just sit there with no upgrade progress.
>> Nothing in mgr/cephadm/osd_remove_queue.
>> Suggestions?
>> Best regards,
>> Torkil
>
> --
> Torkil Svensgaard
> Sysadmin
> MR-Forskningssektionen, afs. 714
> DRCMR, Danish Research Centre for Magnetic Resonance
> Hvidovre Hospital
> Kettegård Allé 30
> DK-2650 Hvidovre
> Denmark
> Tel: +45 386 22828
> E-mail: torkil@xxxxxxxx
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
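
For anyone following the thread who wants to see how far a stalled upgrade actually got, the version summary quoted above can be reduced to a per-daemon-type count. This is only a rough sketch: it assumes the JSON has the shape shown in Torkil's message (which looks like `ceph versions` output), and the names `upgrade_progress` and `SAMPLE` are hypothetical, introduced here for illustration. The counts in `SAMPLE` are copied from the thread.

```python
import json


def upgrade_progress(versions: dict, target: str) -> dict:
    """Return {daemon_type: (upgraded, total)} for a version summary dict.

    Assumes keys look like "ceph version 19.2.0 (<hash>) squid (stable)",
    i.e. the version number is the third whitespace-separated token.
    """
    progress = {}
    for daemon_type, counts in versions.items():
        if daemon_type == "overall":
            # "overall" is an aggregate, not a daemon type
            continue
        total = sum(counts.values())
        upgraded = 0
        for version_string, count in counts.items():
            tokens = version_string.split()
            if len(tokens) >= 3 and tokens[2] == target:
                upgraded += count
        progress[daemon_type] = (upgraded, total)
    return progress


# Counts copied from the output quoted in the thread.
SAMPLE = json.loads("""
{
  "mon": {
    "ceph version 18.2.4 (e7ad5345525c7aa95470c26863873b581076945d) reef (stable)": 4,
    "ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) squid (stable)": 1
  },
  "mgr": {
    "ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) squid (stable)": 3
  },
  "osd": {
    "ceph version 18.2.4 (e7ad5345525c7aa95470c26863873b581076945d) reef (stable)": 548
  },
  "mds": {
    "ceph version 18.2.4 (e7ad5345525c7aa95470c26863873b581076945d) reef (stable)": 3
  },
  "overall": {
    "ceph version 18.2.4 (e7ad5345525c7aa95470c26863873b581076945d) reef (stable)": 555,
    "ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) squid (stable)": 4
  }
}
""")

if __name__ == "__main__":
    for daemon_type, (done, total) in upgrade_progress(SAMPLE, "19.2.0").items():
        print(f"{daemon_type}: {done}/{total} on 19.2.0")
```

With the numbers from the thread this reports all 3 mgrs and 1 of 5 mons on 19.2.0, and no OSDs or MDSs upgraded yet, which matches the description of the upgrade stalling right after the first MON.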