Hi,
I suggest increasing the debug level for a single OSD and then
inspecting the log. Maybe there's a hint pointing to
osd_map_share_max_epochs as well. I assume you had noout set
while the OSDs were down for a long time?
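In case it helps, here's a rough sketch of what I mean (osd.10 taken
from your log; the exact levels and the log path are just what I'd
typically use, adjust to your setup):

  # bump logging for one OSD at runtime, no restart needed
  ceph tell osd.10 config set debug_osd 10
  ceph tell osd.10 config set debug_ms 1

  # reproduce the failure, then inspect the log on the OSD host,
  # typically /var/log/ceph/ceph-osd.10.log

  # restore the defaults afterwards
  ceph tell osd.10 config set debug_osd 1/5
  ceph tell osd.10 config set debug_ms 0/5

  # and check whether noout (or other flags) are still set
  ceph osd dump | grep flags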
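Regarding your question about seeing progress: one way to check (a
sketch, assuming the admin socket is reachable on the OSD host) is to
compare the newest osdmap epoch an OSD has seen with the cluster's
current epoch:

  # current osdmap epoch of the cluster (first line reads "epoch N")
  ceph osd dump | head -1

  # oldest_map / newest_map as seen by this particular OSD
  ceph daemon osd.10 status

If newest_map keeps climbing towards the cluster epoch across the
restarts, the OSDs are catching up; if it stays put, repeated restarts
probably won't help and something else is wrong.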
Quoting Jorge Garcia <jgarcia@xxxxxxxxxxxx>:
Hello,
I'm going down the long and winding road of upgrading our ceph
clusters from mimic to the latest version. This has involved slowly
going up one release at a time. I'm now going from octopus to pacific,
which also involves upgrading the OS on the host systems from CentOS 7
to Rocky 9.
I first upgraded the monitors and managers, and those upgraded with no
problems. Now I'm upgrading the OSD servers, and I ran into some
issues that caused the first system to be down for a couple of days. I
finally got it back up, and got all the OSDs ready to come back
online, but whenever I try to bring the OSDs back up, they start
running for a bit, and it looks like the cluster is recovering and
catching up, but then the OSDs all go down again. The logs show some
messages like:
received signal: Interrupt from Kernel ( Could be generated by
pthread_kill(), raise(), abort(), alarm() ) UID: 0
osd.10 254568 *** Got signal Interrupt ***
osd.10 254568 *** Immediate shutdown (osd_fast_shutdown=true) ***
osd.10 254568 prepare_to_stop starting shutdown
I found this thread:
https://www.spinics.net/lists/ceph-users/msg75628.html which seems to
describe something similar, and they claim that the cluster needs to
be restarted many times in order for the OSDs to catch up to the
current epoch. I have restarted the OSDs many times, and now it has
reached a point where there doesn't seem to be any progress. My
questions are:
Is this the right solution?
Is there a way of seeing if some progress is happening with the OSDs?
Is there something else I should be trying?
Thanks for any help!
Jorge
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx