Re: Rebuild OSDs

You have a total of 2 OSDs, and 2 disks, right?

The safe method is to mark one OSD out and wait for the cluster to heal, then delete it, reformat the disk, add it back to the cluster, and wait for the cluster to heal again.  Repeat for the other OSD.  But that only works when you have enough OSDs left that the cluster can actually heal; with only two OSDs there is nowhere for the out'd OSD's data to re-replicate to.

So you'll have to go the less safe route and hope you don't suffer a failure in the middle.  I went this way myself because the safe route was taking too long:

    First, set up your ceph.conf with the new osd options: osd mkfs *, osd journal *, whatever you want the OSDs to look like when you're done.  You may also want to set osd max backfills to 1 before you start; the default value of 10 is really only a good idea if you have a large cluster and SSD journals.
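
    For reference, a rough sketch of what that [osd] section might look like if the goal is xfs-backed OSDs (the mkfs/mount flags and journal size below are placeholders rather than recommendations from this thread; adjust them for your hardware):

        [osd]
            osd mkfs type = xfs
            osd mkfs options xfs = -f -i size=2048       # placeholder mkfs flags
            osd mount options xfs = rw,noatime,inode64   # placeholder mount options
            osd journal size = 5120                      # MB, placeholder
            osd max backfills = 1                        # throttle backfill as suggested above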

    Remove the disk, format, and put it back in:
    • ceph osd set norecover
    • ceph osd set nobackfill
    • ceph osd out $OSDID
    • sleep 30
    • stop ceph-osd id=$OSDID
    • ceph osd crush remove osd.$OSDID
    • ceph osd lost $OSDID --yes-i-really-mean-it
    • ceph auth del osd.$OSDID
    • ceph osd rm $OSDID
    • ceph-disk-prepare --zap $dev $journal    # ceph-deploy would also work
    • ceph osd unset norecover
    • ceph osd unset nobackfill
    Wait for the cluster to heal, then repeat for the other OSD (a scripted version of the whole sequence is sketched below).
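
    Put together, an untested sketch of that sequence as a script -- OSDID, dev and journal are placeholders for the OSD being rebuilt, and the HEALTH_OK loop at the end is just one way to automate the "wait for the cluster to heal" step:

        #!/bin/bash
        OSDID=0              # placeholder: id of the OSD being rebuilt
        dev=/dev/sdX         # placeholder: data disk
        journal=/dev/sdY1    # placeholder: journal device or partition

        # Pause recovery/backfill while the OSD is torn down and recreated
        ceph osd set norecover
        ceph osd set nobackfill

        # Remove the OSD from the cluster
        ceph osd out $OSDID
        sleep 30
        stop ceph-osd id=$OSDID
        ceph osd crush remove osd.$OSDID
        ceph osd lost $OSDID --yes-i-really-mean-it
        ceph auth del osd.$OSDID
        ceph osd rm $OSDID

        # Recreate it with the new on-disk layout (ceph-deploy would also work)
        ceph-disk-prepare --zap $dev $journal

        # Let recovery/backfill run again
        ceph osd unset norecover
        ceph osd unset nobackfill

        # Wait until the cluster reports HEALTH_OK before doing the next OSD
        until ceph health | grep -q HEALTH_OK; do
            sleep 60
        done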



    It's more complicated if you have multiple devices in the zpool and you're using more than a small percentage of the disk space.


    On Sat, Nov 29, 2014 at 2:29 PM, Lindsay Mathieson <lindsay.mathieson@xxxxxxxxx> wrote:
    I have 2 OSDs on two nodes, on top of zfs, that I'd like to rebuild in a more
    standard (xfs) setup.

    Would the following be a non-destructive, if somewhat tedious, way of doing so?

    Following the instructions from here:

      http://ceph.com/docs/master/rados/operations/add-or-rm-osds/#removing-osds-manual

    1. Remove osd.0
    2. Recreate osd.0
    3. Add osd.0
    4. Wait for health to be restored (a way to check this is sketched after the list),
        i.e. all data gets copied from osd.1 to osd.0

    5. Remove osd.1
    6. Recreate osd.1
    7. Add osd.1
    8. Wait for health to be restored,
        i.e. all data gets copied from osd.0 to osd.1

    9. Profit!
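
    For steps 4 and 8, a quick way to confirm the cluster has healed before moving on (standard ceph CLI; the exact output format varies by release):

        ceph health          # should report HEALTH_OK before touching the next OSD
        ceph -s              # shows recovery progress and degraded PG counts
        ceph osd tree        # confirms the rebuilt OSD is back up and in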


    There's 1TB of data total. I can do this after hours, while the system and
    network are not being used.

    I do have complete backups in case it all goes pear shaped.

    thanks,
    --
    Lindsay

    _______________________________________________
    ceph-users mailing list
    ceph-users@xxxxxxxxxxxxxx
    http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


