On 04/22/2015 07:35 PM, Gregory Farnum wrote:
On Wed, Apr 22, 2015 at 8:17 AM, Kenneth Waegeman
<kenneth.waegeman@xxxxxxxx> wrote:
Hi,
I changed the cluster network parameter in the config files, restarted the
monitors, and then restarted all the OSDs (I shouldn't have done that).
Do you mean that you changed the IP addresses of the monitors in the
config files everywhere, and then tried to restart things? Or
something else?
I only changed the value of the cluster network to a different one than
the public network.
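To be concrete, the change was along these lines in ceph.conf (the subnets
here are just example values, not my actual ranges):

    [global]
        public network  = 10.1.0.0/16   # example value: the existing network
        cluster network = 10.2.0.0/16   # example value: the newly added network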
Now the OSDs keep on crashing, and the cluster is not able to recover. I
eventually rebooted the whole cluster, but the problem remains: for a moment
all 280 OSDs are up, and then they start crashing rapidly until there are
fewer than 100 left (and eventually only 30 or so).
Are the OSDs actually crashing, or are they getting shut down? If
they're crashing, can you please provide the actual backtrace? The
logs you're including below are all fairly low level and generally
don't even mean something has to be wrong.
It seems I did not test the network thoroughly enough: there was one host
that was unable to connect to the cluster network, only to the public
network. I found this out after a few hours, when all OSDs except those of
that host had come up. I fixed the network issue and all was fine (there
were only a few peering problems, and restarting the blocked OSDs was
sufficient).
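For reference, a quick check run from every host would have caught this
earlier; something like the following, with placeholder cluster-network
addresses for the 14 hosts:

    # bash; 10.2.0.1-14 are placeholder cluster-network IPs, adjust to your setup
    for ip in 10.2.0.{1..14}; do
        ping -c 1 -W 2 "$ip" > /dev/null || echo "cannot reach $ip on the cluster network"
    done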
There were no backtraces, and indeed there were shutdown messages in the
logs.
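A rough way to tell the two apart, assuming the default log location: a
crash normally leaves a 'Caught signal' or 'FAILED assert' line in the OSD
log, while a clean shutdown does not.

    # lists the OSD logs that contain crash signatures (none in my case)
    grep -l -e 'Caught signal' -e 'FAILED assert' /var/log/ceph/ceph-osd.*.log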
So it is all fixed now, but is it explainable that at first about 90% of
the OSDs went into shutdown over and over, and only reached a stable
situation after some time, all because of one host's network failure?
Thanks again!
In the log files I see different kinds of messages. Some OSDs have:
<snip>
I tested the network; the hosts can reach one another on both networks.
What configurations did you test?
14 hosts, each with 16 keyvalue OSDs, plus 2 replicated cache partitions
and metadata partitions on 2 SSDs for CephFS.
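Roughly this kind of setup, with placeholder pool names and pg counts (the
SSD placement itself is done via a separate crush rule, not shown here):

    # placeholder names/pg counts; cache and metadata pools live on the SSD OSDs
    ceph osd pool create cephfs_data 1024
    ceph osd pool create cephfs_metadata 128
    ceph osd pool create cephfs_cache 128
    ceph osd tier add cephfs_data cephfs_cache
    ceph osd tier cache-mode cephfs_cache writeback
    ceph osd tier set-overlay cephfs_data cephfs_cache
    ceph fs new cephfs cephfs_metadata cephfs_data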
-Greg