In addition to the points that you made: I noticed on RAID0 disks that read I/O errors are not always trapped by Ceph, leading to unexpected behaviour of the impacted OSD daemon.

On both RAID0 and non-RAID disks, an I/O error is trapped in /var/log/messages:

Oct 2 15:20:37 os-ceph05 kernel: sd 0:1:0:7: [sdh] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 2 15:20:37 os-ceph05 kernel: sd 0:1:0:7: [sdh] tag#0 Sense Key : Medium Error [current]
Oct 2 15:20:37 os-ceph05 kernel: sd 0:1:0:7: [sdh] tag#0 Add. Sense: Unrecovered read error
Oct 2 15:20:37 os-ceph05 kernel: sd 0:1:0:7: [sdh] tag#0 CDB: Read(16) 88 00 00 00 00 00 00 00 37 e0 00 00 00 08 00 00
Oct 2 15:20:37 os-ceph05 kernel: blk_update_request: critical medium error, dev sdh, sector 14304

On a non-RAID0 disk, we can see the I/O error in the OSD log:

2017-09-27 00:55:52.100678 7faceba7b700 -1 filestore(/var/lib/ceph/osd/ceph-276) FileStore::read(9.103_head/#9:c086eeb2:::rbd_data.6592c12eb141f2.0000000000058795:head#) pread error: (5) Input/output error
2017-09-27 00:55:52.128147 7faceba7b700 -1 os/filestore/FileStore.cc: In function 'virtual int FileStore::read(const coll_t&, const ghobject_t&, uint64_t, size_t, ceph::bufferlist&, uint32_t, bool)' thread 7faceba7b700 time 2017-09-27 00:55:52.101208
os/filestore/FileStore.cc: 3016: FAILED assert(0 == "eio on pread")

On a RAID0 disk, we only see thread timeouts in the OSD log:

2017-10-02 15:20:26.360683 7f3240154700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f3250c3c700' had timed out after 15
2017-10-02 15:20:26.360729 7f3240053700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f3250c3c700' had timed out after 15
2017-10-02 15:20:26.413488 7f323f144700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f3250c3c700' had timed out after 15
2017-10-02 15:20:26.413574 7f323f043700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f3250c3c700' had timed out after 15
2017-10-02 15:20:26.536500 7f323f548700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f3250c3c700' had timed out after 15

On a non-RAID disk, an I/O error will (up to Jewel) restart the OSD, making the inconsistent PG (and the others) peer on other OSDs with clean data.

On a RAID0 disk, the I/O error can instead lead to an increasing number of slow requests, blocking the whole cluster if the load is high, or to just some slow requests and an I/O error from the client's point of view if the load is low.

The RAID0s I'm talking about are built on an HP Smart Array P420i on an SL4540, so it may only be related to this hardware.
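
Since the OSD on a RAID0 logical drive only reports heartbeat timeouts, the kernel log is the only place the real failure shows up. Below is a minimal sketch of an external watcher for it, in Python; the journalctl invocation and the regex (derived from the kernel lines above) are my assumptions for illustration, not anything shipped with Ceph:

    #!/usr/bin/env python3
    """Hypothetical watcher: flag unrecovered medium errors the OSD won't trap."""
    import re
    import subprocess

    # Matches kernel lines like:
    # "blk_update_request: critical medium error, dev sdh, sector 14304"
    MEDIUM_ERROR = re.compile(
        r"critical medium error, dev (?P<dev>\w+), sector (?P<sector>\d+)")

    def watch():
        """Follow kernel messages, yield (device, sector) per medium error."""
        proc = subprocess.Popen(["journalctl", "-k", "-f", "-n", "0"],
                                stdout=subprocess.PIPE, text=True)
        for line in proc.stdout:
            m = MEDIUM_ERROR.search(line)
            if m:
                yield m.group("dev"), int(m.group("sector"))

    if __name__ == "__main__":
        for dev, sector in watch():
            # Alert only; deciding to mark the OSD out is left to the operator.
            print("medium error on /dev/%s at sector %d - map it to an OSD "
                  "(e.g. 'ceph osd metadata' lists each OSD's devices)"
                  % (dev, sector))

Something like this could page an operator, or feed a script that marks the affected OSD out, before the slow requests pile up.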