Hello,

This morning I decided to reboot a storage node (Debian Jessie, thus a 3.16 kernel and Ceph 0.80.7, HDD OSDs with SSD journals) after applying some changes. It came back up one OSD short; the last log lines before the reboot are:

---
2014-12-05 09:35:27.700330 7f87e789c700 2 -- 10.0.8.21:6823/29520 >> 10.0.8.22:0/5161 pipe(0x7f881b772580 sd=247 :6823 s=2 pgs=21 cs=1 l=1 c=0x7f881f469020).fault (0) Success
2014-12-05 09:35:27.700350 7f87f011d700 10 osd.4 pg_epoch: 293 pg[3.316( v 289'1347 (0'0,289'1347] local-les=289 n=8 ec=5 les/c 289/289 288/288/288) [8,4,16] r=1 lpr=288 pi=276-287/1 luod=0'0 crt=289'1345 lcod 289'1346 active] cancel_copy_ops
---

Quite obviously it didn't complete its shutdown, so unsurprisingly we get:

---
2014-12-05 09:37:40.278128 7f218a7037c0 1 journal _open /var/lib/ceph/osd/ceph-4/journal fd 24: 10000269312 bytes, block size 4096 bytes, directio = 1, aio = 1
2014-12-05 09:37:40.278427 7f218a7037c0 -1 journal read_header error decoding journal header
2014-12-05 09:37:40.278479 7f218a7037c0 -1 filestore(/var/lib/ceph/osd/ceph-4) mount failed to open journal /var/lib/ceph/osd/ceph-4/journal: (22) Invalid argument
2014-12-05 09:37:40.776203 7f218a7037c0 -1 osd.4 0 OSD:init: unable to mount object store
2014-12-05 09:37:40.776223 7f218a7037c0 -1 ** ERROR: osd init failed: (22) Invalid argument
---

Thankfully this isn't production yet, and I was eventually able to recover the OSD by re-creating the journal ("ceph-osd -i 4 --mkjournal"), but it leaves me with a rather bad taste in my mouth.

So the pertinent questions would be:

1. What caused this? My bet is on the evil systemd just pulling the plug before the poor OSD had finished its shutdown job.

2. How do I prevent it from happening again? Is there something the Ceph developers can do with regard to the init scripts, or is this something to bring up with the Debian maintainer? Debian is transitioning from sysvinit to systemd (booo!) with Jessie, but the OSDs still have a sysvinit magic file in their top directory. Could this have an effect on things?

3. Is it really that easy to trash your OSDs? Should a storage node crash, am I to expect most if not all OSDs, or at least their journals, to require manual loving?

The exact recovery steps, and a systemd drop-in I intend to test, are in the P.S. below.

Regards,

Christian
-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/
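
P.S. For the archives, the recovery boiled down to the steps below. This is just a sketch of what worked for me; the OSD id is obviously specific to my node, and be aware that re-creating the journal throws away any writes still sitting in it, which was acceptable here only because this isn't production and the PGs could be repaired from the other replicas:

---
# stop the OSD; it was just crash-looping on the bad journal anyway
/etc/init.d/ceph stop osd.4

# on a cleanly stopped OSD one would drain the journal first,
# but with an undecodable header there was nothing left to flush:
#   ceph-osd -i 4 --flush-journal

# write a fresh journal and bring the OSD back
ceph-osd -i 4 --mkjournal
/etc/init.d/ceph start osd.4
---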
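
P.P.S. If systemd really is killing the OSDs before they finish shutting down, lengthening the stop timeout via a drop-in might help. This is untested on my side, and it assumes the sysvinit script gets wrapped by systemd-sysv-generator as "ceph.service"; check the actual unit name with "systemctl status ceph" first:

---
# /etc/systemd/system/ceph.service.d/stop-timeout.conf
[Service]
# allow up to 10 minutes for the OSDs to flush and exit cleanly
TimeoutStopSec=600
---

Followed by "systemctl daemon-reload" to pick up the change.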