Hi David,

On Fri, Nov 15, 2013 at 10:00:37AM -0800, David Zafman wrote:
>
> Replication does not occur until the OSD is “out.” This creates a new mapping in the cluster of where the PGs should be and thus data begins to move and/or create sufficient copies. This scheme lets you control how and when you want the replication to occur. If you have plenty of space and you aren’t going to replace the drive immediately, just mark the OSD “down” AND “out.” If you are going to replace the drive immediately, set the “noout” flag. Take the OSD “down” and replace the drive. Assuming it is mounted in the same place as the bad drive, bring the OSD back up. This will replicate exactly the same PGs the bad drive held back to the replacement drive. As was stated before, don’t forget to “ceph osd unset noout”.
>
> Keep in mind that in the case of a machine that has a hardware failure and takes OSD(s) down, there is an automatic timeout which will mark them “out” for unattended operation. Unless you are monitoring the cluster 24/7, you should have enough disk space available to handle failures.
>
> Related info in:
>
> http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/
>
> David Zafman
> Senior Developer
> http://www.inktank.com

Are you saying, if a disk suffers from a bad sector in an object for which it's primary, and for which good data exists on other replica PGs, there's no way for ceph to recover other than by (re-)replicating the whole disk?

I.e., even if the disk is able to remap the bad sector using a spare, so the disk is ok (albeit missing a sector's worth of object data), the only way to recover is to basically blow away all the data on that disk and start again, replicating everything back to the disk (or to other disks)?

Cheers,

Chris.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
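
For reference, the immediate-replacement sequence David describes in the quoted text (set “noout”, take the OSD down, swap the drive, bring the OSD back up, unset “noout”) could be written down roughly as below. This is only a sketch: the function name `plan_drive_swap` and the `stop_osd` / `start_osd` arguments are hypothetical, because the way an OSD daemon is stopped and started depends on the init system and is not given in the thread. It only builds the ordered command list and does not run anything.

```python
from typing import List

Command = List[str]


def plan_drive_swap(stop_osd: Command, start_osd: Command) -> List[Command]:
    """Return, in order, the commands for an immediate in-place drive swap.

    stop_osd and start_osd are hypothetical placeholders: how the OSD daemon
    is stopped and started depends on the init system, so the caller supplies
    them. The physical drive replacement happens between those two steps.
    """
    return [
        ["ceph", "osd", "set", "noout"],    # keep the down OSD from being marked "out"
        list(stop_osd),                     # take the OSD down, then replace the drive
        list(start_osd),                    # bring the OSD back up on the new drive
        ["ceph", "osd", "unset", "noout"],  # restore normal "out" handling
    ]


if __name__ == "__main__":
    # Placeholder commands only; nothing is executed against a cluster.
    for step in plan_drive_swap(["<stop osd.N>"], ["<start osd.N>"]):
        print(" ".join(step))
```

Keeping the stop and start steps as caller-supplied arguments avoids guessing at a service command, and the list makes it harder to forget the final unset of “noout”.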