Hello,

If we go by the subject line, your data is still all there and valid (or
at least mostly valid).

Also, is that an actual RAID0, with multiple drives? If so, why? That
massively increases both your failure probability AND the amount of data
affected when it fails.

Anyway, if that OSD is still working:

1. ceph osd set noout
2. stop the OSD
3. copy the data 100% off (dd, cp -a, rsync -a)
4. replace the disk(s)
5. copy the data back in
6. start the OSD
7. ceph osd unset noout

Christian

On Mon, 15 Aug 2016 02:50:31 +0000 David Turner wrote:

> If you are trying to reduce extra data movement, set and unset the
> nobackfill and norecover flags when you do the same for noout. You will
> want to follow the instructions to fully remove the OSD from the
> cluster: out the OSD, remove it from the CRUSH map, remove its auth
> from the cluster, and finally remove the OSD from the cluster. After
> that, adding the OSD back in should give it the same OSD id that the
> former one had. If you make sure the id is the same and the weight in
> the CRUSH map is the same (you can do this by saving your CRUSH map
> before you remove the OSD and uploading the same CRUSH map after you
> add it back in with the same id), then the only data movement will be
> onto the re-added OSD and nothing else.
>
> David Turner | Cloud Operations Engineer | StorageCraft Technology Corporation <https://storagecraft.com>
> 380 Data Drive Suite 300 | Draper | Utah | 84020
> Office: 801.871.2760 | Mobile: 385.224.2943
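The copy-off-and-back procedure above could look roughly like the
following shell session. This is only a sketch: the OSD id (12), the
mount point and the backup path are made-up examples, and the service
names assume systemd; adjust everything to your own setup. The -X flag
on rsync matters because OSD data relies on extended attributes.

```
# Hypothetical example: osd.12 mounted at /var/lib/ceph/osd/ceph-12

# Keep the cluster from marking the OSD out and rebalancing while we work
ceph osd set noout

# Stop the OSD daemon (systemd syntax; older setups use /etc/init.d/ceph)
systemctl stop ceph-osd@12

# Copy the data 100% off, preserving permissions and extended attributes
rsync -aX /var/lib/ceph/osd/ceph-12/ /backup/osd-12/

# ... physically replace the disk(s), create a new filesystem and mount
# it back at /var/lib/ceph/osd/ceph-12 ...

# Copy the data back in
rsync -aX /backup/osd-12/ /var/lib/ceph/osd/ceph-12/

# Start the OSD again and re-enable normal out/rebalance behaviour
systemctl start ceph-osd@12
ceph osd unset noout
```

Because the OSD comes back with exactly the same data, id and CRUSH
weight, no PGs need to move anywhere else.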
> From: ceph-users [ceph-users-bounces@xxxxxxxxxxxxxx] on behalf of Goncalo Borges [goncalo.borges@xxxxxxxxxxxxx]
> Sent: Sunday, August 14, 2016 5:47 AM
> To: ceph-users@xxxxxxxx
> Subject: Substitute a predicted failure (not yet failed) osd
>
> Hi cephers
>
> I have a really simple question: the documentation always describes the
> procedure for substituting disks that have already failed. I currently
> have a predicted failure in a RAID0 OSD, and I would like to substitute
> the drive before it fails, without having PGs replicated once the OSD
> is removed from the CRUSH map and then replicated again once I add the
> new drive.
>
> Can I perform the following actions safely to achieve my goal?
>
> # ceph osd set noout
> # stop the osd
> # unmount the osd
> # remove it from crush map
> # substitute the drive
> # recreate the osd
> # ceph osd unset noout
>
> Cheers
> Goncalo
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Rakuten Communications
http://www.gol.com/
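For reference, the remove-and-re-add variant David describes (keeping
the same OSD id and CRUSH weight so only the re-added OSD receives
data) could be sketched as follows. Again, osd.12 is a made-up example
id, the crushmap.bin filename is arbitrary, and the exact recreation
step depends on your Ceph release (ceph-disk on older releases,
ceph-volume on newer ones), so treat this as an outline rather than a
recipe.

```
# Suppress out-marking, backfill and recovery while the OSD is swapped
ceph osd set noout
ceph osd set nobackfill
ceph osd set norecover

# Save the current CRUSH map so the old id/weight can be restored later
ceph osd getcrushmap -o crushmap.bin

# Fully remove the OSD from the cluster
systemctl stop ceph-osd@12
ceph osd out 12
ceph osd crush remove osd.12
ceph auth del osd.12
ceph osd rm 12

# ... replace the drive and recreate the OSD; since osd ids are reused,
# with no other gaps in the id space it should come back as osd.12 ...

# Restore the saved CRUSH map so the weight matches the former OSD
ceph osd setcrushmap -i crushmap.bin

# Re-enable recovery, backfill and normal out behaviour
ceph osd unset norecover
ceph osd unset nobackfill
ceph osd unset noout
```

With the id and weight unchanged, the only backfill that happens is the
data flowing onto the re-added OSD itself.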