On 19/09/17 10:40, Wido den Hollander wrote:
>
>> On 19 September 2017 at 10:24, Adrian Saul <Adrian.Saul@xxxxxxxxxxxxxxxxx> wrote:
>>
>>
>>> I understand what you mean and it's indeed dangerous, but see:
>>> https://github.com/ceph/ceph/blob/master/systemd/ceph-osd%40.service
>>>
>>> Looking at the systemd docs it's difficult though:
>>> https://www.freedesktop.org/software/systemd/man/systemd.service.html
>>>
>>> If the OSD crashes due to another bug you do want it to restart.
>>>
>>> But for systemd it's not possible to see if the crash was due to a disk
>>> I/O error or a bug in the OSD itself or maybe the OOM-killer or something.
>>
>> Perhaps using something like RestartPreventExitStatus and defining a
>> specific exit code for the OSD to exit on when it is exiting due to an
>> IO error.
>>
>
> That's a very, very good idea! I didn't know that one existed.
>
> That would prevent restarts in case of I/O error indeed.

That would depend on the OSD gracefully handling the I/O failure - IME
they quite often seem to end up abort()ing...

Regards,

Matthew
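
To make the distinction above concrete, here is a minimal sketch (not Ceph code) of what a supervisor sees in the two cases: a process that exits deliberately with a reserved status, and one that calls abort(). The exit code 75 and the helper names are assumptions made up for illustration; Ceph does not define such a code. The sketch assumes a POSIX system, where Python's subprocess module reports death by signal as a negative return code.

```python
import signal
import subprocess
import sys

# Hypothetical exit status reserved for "exiting because of a disk I/O error".
# This value is an assumption for illustration only; Ceph does not define it.
IO_ERROR_EXIT_CODE = 75


def run_child(snippet: str) -> int:
    """Run a short Python snippet in a child process and return its return code."""
    return subprocess.run([sys.executable, "-c", snippet]).returncode


def describe(returncode: int) -> str:
    """Classify a child's termination roughly the way a supervisor would see it."""
    if returncode < 0:
        # On POSIX, subprocess reports "killed by signal N" as -N.
        return f"killed by {signal.Signals(-returncode).name}"
    if returncode == IO_ERROR_EXIT_CODE:
        return "exited with the reserved I/O-error status"
    return f"exited with status {returncode}"


# Case 1: the process handles the I/O failure and exits with the reserved code.
graceful = run_child(f"import sys; sys.exit({IO_ERROR_EXIT_CODE})")

# Case 2: the process ends up in abort(), so there is no exit code at all,
# only a termination signal (SIGABRT).
aborted = run_child("import os; os.abort()")

print(describe(graceful))
print(describe(aborted))
```

A reserved exit code only helps in the first case. In the second, the supervisor sees SIGABRT whether the abort came from an I/O error or from an unrelated bug. RestartPreventExitStatus= does accept signal names as well as numeric statuses, but listing SIGABRT there would also stop restarts after aborts that have nothing to do with the disk.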