Failed Disk simulation question

Hello cephers,

I know that a similar question was posted 5 years ago, but the answer was inconclusive for me.
I installed a new Nautilus 14.2.1 cluster and started pre-production testing. Following the Red Hat documentation, I simulated a soft disk failure with:

#  echo 1 > /sys/block/sdc/device/delete

The cluster was idle at the time, being brand new. I noticed some disk-related errors in dmesg, but that was about it.
For the next 20 to 30 minutes, the failure did not appear to be detected: all OSDs were up and in, and health was OK. The OSD logs had no smoking gun either.
After 30 minutes, I restarted the OSD container, and it failed to start, as expected.
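
For anyone repeating the idle test, here is a minimal sketch of how the device's visibility to the kernel could be checked after the delete, so it can be compared with what Ceph reports. The function names and the SYS_BLOCK override are hypothetical, introduced only for illustration; they are not part of Ceph or of the Red Hat procedure.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the kernel still exposes the block device.
# SYS_BLOCK is an assumed override so the check can be tried without real hardware.
disk_present() {
    [ -e "${SYS_BLOCK:-/sys/block}/$1" ]
}

# Print whether the device is still visible under sysfs. The result can be
# compared by hand with `ceph osd tree` and `ceph health detail`.
report_disk() {
    if disk_present "$1"; then
        echo "$1: present"
    else
        echo "$1: missing"
    fi
}
```

After the delete, `report_disk sdc` should print `sdc: missing`, even though in my idle test the cluster still showed all OSDs up and in.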

Later on, I performed the same operation during a fio benchmark, and the OSD failed immediately.

My question is: should the disk problem have been detected quickly, even on an idle cluster? I thought Nautilus had the means to sense a failure before intensive I/O hits the disk.
Am I wrong to expect that?




