Hi again,

it seems the cause is either a corrupted file system or a disk error.
Correlating with the last daemon crash, we saw I/O error messages from the
kernel for that specific disk; for whatever reason there were no such errors
previously. I ran xfs_repair, which showed no errors, but at least the daemon
did not crash again after a few minutes, as it had the last few times.

So it seems to be nothing 'Ceph related', just a simple disk or fs error.

Best regards,
Kurt

> Kurt Bauer <mailto:kurt.bauer@xxxxxxxxxxxx>
> 25. April 2014 09:13
> Hi,
>
> for a few days now we have had one OSD (osd.23) crashing repeatedly. It
> can be restarted without problems, but after a few minutes (one time it
> lived for a few hours) it crashes again. The logfile of that OSD for one
> of those crashes is attached. Other OSDs on the same host work without
> problems.
> I tried an xfs_check on the specific disk, which showed no errors.
>
> Maybe you can give us a hint where the problem could be and what has to
> be done to solve it. If any further information or logs would help, I'd
> be happy to provide them.
>
> Many thanks,
> best regards,
> Kurt
>
> PS.: Maybe the outline of the cluster could be helpful:
> vmstor1:~# ceph osd tree
> # id    weight  type name                   up/down  reweight
> -1      78      root default
> -13     52        city Vienna
> -11     26          datacenter NIG
> -8      26            room B1
> -3      26              rack B2
> -2      26                host vmstor1
> 0       2                   osd.0           up       1
> 1       2                   osd.1           up       1
> 2       2                   osd.2           up       1
> 3       2                   osd.3           up       1
> 12      3                   osd.12          up       1
> 14      3                   osd.14          up       1
> 15      3                   osd.15          up       1
> 16      3                   osd.16          up       1
> 17      3                   osd.17          up       1
> 18      3                   osd.18          up       1
> -12     26          datacenter IXION
> -10     26            room 3
> -7      26              rack R0
> -4      26                host vmstor21
> 10      2                   osd.10          up       1
> 11      2                   osd.11          up       1
> 8       2                   osd.8           up       1
> 9       2                   osd.9           up       1
> 13      3                   osd.13          up       1
> 25      3                   osd.25          up       1
> 26      3                   osd.26          up       1
> 27      3                   osd.27          up       1
> 28      3                   osd.28          up       1
> 29      3                   osd.29          up       1
> -15     26        city Linz
> -14     26          datacenter OOE
> -9      26            room SYS2
> -6      26              rack LIK1N
> -5      26                host vmstor2
> 4       2                   osd.4           up       1
> 5       2                   osd.5           up       1
> 6       2                   osd.6           up       1
> 7       2                   osd.7           up       1
> 19      3                   osd.19          up       1
> 20      3                   osd.20          up       1
> 21      3                   osd.21          up       1
> 22      3                   osd.22          up       1
> 23      3                   osd.23          down     0
> 24      3                   osd.24          up       1
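
In the tree quoted above, osd.23 is the only OSD listed as down. For anyone who wants to pick out such entries automatically, here is a minimal sketch in Python. It assumes the old-style `ceph osd tree` text layout shown in this mail (id, weight, name, up/down, reweight on one row); newer Ceph releases print different columns, so the pattern would need adjusting. The function name `down_osds` is hypothetical and not part of any Ceph tooling.

```python
import re

# One OSD row in the layout quoted in this mail (an assumption, not a Ceph API):
#   <id> <weight> osd.<n> <up|down> <reweight>
OSD_ROW = re.compile(
    r"^\s*(-?\d+)\s+([\d.]+)\s+(osd\.\d+)\s+(up|down)\s+([\d.]+)\s*$"
)


def down_osds(tree_text):
    """Return the names of all OSDs that the tree text reports as down."""
    down = []
    for line in tree_text.splitlines():
        # Strip mail quoting ("> ") and leading indentation before matching.
        line = line.lstrip("> ")
        match = OSD_ROW.match(line)
        if match and match.group(4) == "down":
            down.append(match.group(3))
    return down
```

Bucket rows (root, city, host and so on) and the header line do not match the pattern and are skipped. The result only tells you which OSD to look at; the kernel log for the underlying disk is still where an I/O error like the one described above would show up.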