Hello cephers,

I know that a similar question was posted 5 years ago, but the answer was inconclusive for me.

I installed a new Nautilus 14.2.1 cluster and started pre-production testing. Following the Red Hat documentation, I simulated a soft disk failure with:

# echo 1 > /sys/block/sdc/device/delete

The cluster was idle at the time, being new. I noticed some disk-related errors in dmesg, but that was about it. For the next 20 to 30 minutes the failure did not appear to be detected: all OSDs were up and in, and health was OK. The OSD logs had no smoking gun either. After 30 minutes I restarted the OSD container, and it failed to start, as expected.

Later on, I performed the same operation during a fio benchmark, and the OSD failed immediately.

My question is: should the disk problem have been detected quickly, even on an idle cluster? I thought Nautilus had the means to sense a failure before intensive I/O hits the disk. Am I wrong to expect that?
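
For anyone reproducing this, here is a minimal sketch of how to confirm from the host that the kernel has dropped the device. It assumes the deleted disk disappears from /sys/block once the delete is issued. The helper name device_present and the SYS_BLOCK override are my own illustration, not part of Ceph or the Red Hat procedure:

```shell
#!/bin/sh
# Hypothetical helper (not part of Ceph): succeed if the named block
# device is still registered with the kernel. SYS_BLOCK defaults to
# /sys/block and can be overridden to point at a test directory.
device_present() {
    [ -d "${SYS_BLOCK:-/sys/block}/$1" ]
}

# Example: check the device that was deleted above.
if device_present sdc; then
    echo "sdc is still visible to the kernel"
else
    echo "sdc is gone from /sys/block"
fi
```

This only shows what the kernel sees. It says nothing about when the OSD itself notices the missing disk, which is the part I am asking about.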