Re: OSDs going down/up at random

Have you checked your firewall?
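One quick way to test the firewall theory is to try plain TCP connections to the ports the OSD daemons are listening on. The sketch below is only an illustration: the helper names `port_open` and `blocked_ports` are made up for this example, and the port list in the usage note is an assumption based on Ceph's defaults (monitor on 6789, OSDs binding somewhere in 6800-7300). The real ports in use are better taken from `ceph osd dump`, since an unused port in that range is closed whether or not a firewall is involved.

```python
import socket


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def blocked_ports(host, ports, timeout=1.0):
    """Return the subset of ports on host that could not be reached."""
    return [p for p in ports if not port_open(host, p, timeout)]
```

Run from a client machine, something like `blocked_ports("10.1.6.2", [6789, 6800, 6801])` would list the ports that cannot be reached. A port that is reachable locally on the OSD host but not from the client would point toward filtering in between.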


From: ceph-users <ceph-users-bounces@xxxxxxxxxxxxxx> on behalf of Mike O'Connor <mike@xxxxxxxxxx>
Sent: Wednesday, 10 January 2018 3:40:30 PM
To: ceph-users@xxxxxxxxxxxxxx
Subject: OSDs going down/up at random
 
Hi All

I have a Ceph host (12.2.2) with 14 OSDs which seem to go down and then
come back up. What should I look at to try to identify the issue?
The system has three LSI SAS9201-8i cards, which are connected to 14
drives at this time (with the option of 24 drives).
I have three of these chassis, but only one is running right now, so I
have Ceph set up for a single node.

I have looked very carefully at the log files and have not found anything
which indicates any issues with the controllers or the drives.

dmesg has these messages.
-------------------
[78752.708932] libceph: osd3 10.1.6.2 socket closed (con state OPEN)
[78752.710319] libceph: osd3 10.1.6.2 socket closed (con state CONNECTING)
[78753.426244] libceph: osd3 down
[78753.426640] libceph: osd3 down
[78776.496962] libceph: osd5 10.1.6.2 socket closed (con state OPEN)
[78776.498626] libceph: osd5 10.1.6.2 socket closed (con state CONNECTING)
[78777.446384] libceph: osd5 down
[78777.446720] libceph: osd5 down
[78806.466973] libceph: osd3 up
[78806.467429] libceph: osd3 up
[78855.565098] libceph: osd10 10.1.6.2 socket closed (con state OPEN)
[78855.567062] libceph: osd10 10.1.6.2 socket closed (con state CONNECTING)
[78856.554209] libceph: osd10 down
[78856.554357] libceph: osd10 down
[78868.265665] libceph: osd1 10.1.6.2 socket closed (con state OPEN)
[78868.266347] libceph: osd1 10.1.6.2 socket closed (con state CONNECTING)
[78868.529575] libceph: osd1 down
[78869.469264] libceph: osd1 down
[78899.538533] libceph: osd10 up
[78899.538808] libceph: osd10 up
[78903.556418] libceph: osd5 up
[78905.309401] libceph: osd5 up
[78909.755499] libceph: osd1 up
[78912.008581] libceph: osd1 up
[78912.040872] libceph: osd4 10.1.6.2 socket error on write
[78924.736964] libceph: osd8 10.1.6.2 socket closed (con state OPEN)
[78924.738402] libceph: osd8 10.1.6.2 socket closed (con state CONNECTING)
[78925.602597] libceph: osd8 down
[78925.602942] libceph: osd8 down
[78988.648108] libceph: osd8 up
[78988.648462] libceph: osd8 up
[79010.808917] libceph: osd4 10.1.6.2 socket closed (con state OPEN)
[79010.810722] libceph: osd4 10.1.6.2 socket closed (con state CONNECTING)
[79011.617598] libceph: osd4 down
[79011.617861] libceph: osd4 down
[79072.772966] libceph: osd14 10.1.6.2 socket closed (con state OPEN)
[79072.773434] libceph: osd14 10.1.6.2 socket closed (con state OPEN)
[79072.774219] libceph: osd14 10.1.6.2 socket closed (con state CONNECTING)
[79073.657383] libceph: osd14 down
[79073.657552] libceph: osd14 down
[79082.565025] libceph: osd13 10.1.6.2 socket closed (con state OPEN)
[79082.565814] libceph: osd13 10.1.6.2 socket closed (con state OPEN)
[79082.566279] libceph: osd13 10.1.6.2 socket closed (con state CONNECTING)
[79082.670861] libceph: osd13 down
[79082.671023] libceph: osd13 down
[79115.435180] libceph: osd14 up
[79115.435989] libceph: osd14 up
[79117.603991] libceph: osd13 up
[79118.557601] libceph: osd13 up
[79154.719547] libceph: osd4 up
[79154.720232] libceph: osd4 up
[79175.900935] libceph: osd12 10.1.6.2 socket closed (con state OPEN)
[79175.902922] libceph: osd12 10.1.6.2 socket closed (con state CONNECTING)
[79176.650847] libceph: osd12 down
[79176.651138] libceph: osd12 down
[79219.762665] libceph: osd12 up
[79219.763090] libceph: osd12 up
[79252.405666] libceph: osd11 10.1.6.2 socket closed (con state OPEN)
[79252.406349] libceph: osd11 10.1.6.2 socket closed (con state CONNECTING)
[79252.462748] libceph: osd11 down
[79252.462855] libceph: osd11 down
[79285.656850] libceph: osd11 up
[79285.657341] libceph: osd11 up
[80558.024975] libceph: osd13 10.1.6.2 socket closed (con state OPEN)
[80558.025751] libceph: osd13 10.1.6.2 socket closed (con state OPEN)
[80558.026341] libceph: osd13 10.1.6.2 socket closed (con state CONNECTING)
[80558.652903] libceph: osd13 10.1.6.2 socket error on write
[80558.734330] libceph: osd13 down
[80558.734501] libceph: osd13 down
[80590.753493] libceph: osd13 up
[80592.884936] libceph: osd13 up
[80592.897062] libceph: osd12 10.1.6.2 socket closed (con state OPEN)
[90351.841800] libceph: osd1 down
[90371.299988] libceph: osd1 down
[90391.238370] libceph: osd1 up
[90391.778979] libceph: osd1 up

Thanks for any help/ideas
Mike
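To see whether the flapping follows a pattern, it may help to reduce the dmesg output to one down/up interval per OSD. The following is a minimal sketch, not an official Ceph tool: the function name `down_intervals` and the regular expression are assumptions based only on the line format quoted above. Each state change appears twice in that log, so the sketch keeps the first "down" and the first "up" that follows it.

```python
import re
from collections import defaultdict

# Matches lines such as "[78753.426244] libceph: osd3 down".
EVENT = re.compile(r"\[\s*(\d+\.\d+)\] libceph: osd(\d+) (down|up)\s*$")


def down_intervals(lines):
    """Return {osd_id: [(down_ts, up_ts), ...]} parsed from dmesg lines."""
    down_since = {}                 # osd_id -> timestamp of first "down"
    intervals = defaultdict(list)
    for line in lines:
        m = EVENT.search(line)
        if not m:
            continue                # socket messages and other lines
        ts, osd, state = float(m.group(1)), int(m.group(2)), m.group(3)
        if state == "down":
            # Keep the earliest timestamp when "down" is logged twice.
            down_since.setdefault(osd, ts)
        elif osd in down_since:
            # First "up" closes the interval; a repeated "up" is ignored.
            intervals[osd].append((down_since.pop(osd), ts))
    return dict(intervals)


sample = [
    "[78753.426244] libceph: osd3 down",
    "[78753.426640] libceph: osd3 down",
    "[78806.466973] libceph: osd3 up",
    "[78806.467429] libceph: osd3 up",
]
for osd, spans in down_intervals(sample).items():
    for start, end in spans:
        print(f"osd{osd} was down for about {end - start:.0f} s")
```

In the log above, osd3 goes down at 78753.426244 and comes back at 78806.466973, roughly 53 seconds later. Feeding the full output of `dmesg` through the function would show whether the other OSDs come back after a similar delay.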
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
