Re: Replacing a failed OSD

Reed,



Thanks for the response.



Your process is the one that I ran. However, I have a crush map with ssd and sata drives in different buckets (each host is made up of hosttype buckets, an ssd one and a spinning one per host), because I am using ssd drives for a replicated cache tier in front of an erasure-coded data pool for cephfs.
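
For illustration, the decompiled crush map has per-host buckets along these lines (names, ids and weights here are made up, and “hosttype” is a custom bucket type declared in the types section):

hosttype node1-ssd {
    id -10
    alg straw
    hash 0  # rjenkins1
    item osd.0 weight 0.250
}
hosttype node1-spinning {
    id -11
    alg straw
    hash 0  # rjenkins1
    item osd.1 weight 3.640
}
host node1 {
    id -12
    alg straw
    hash 0  # rjenkins1
    item node1-ssd weight 0.250
    item node1-spinning weight 3.640
}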



I have set “osd crush update on start = false” so that osds don’t get added to the crush map automatically at startup, because ceph wouldn’t know which bucket to put the osd in.
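
In ceph.conf on the osd nodes that is simply:

[osd]
osd crush update on start = false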



I am using puppet (the ceph puppet module) to provision a drive when it sees one in a slot that doesn’t have a ceph signature on it (I guess).



The real confusion is why I have to remove it from the crush map at all. Once I remove it, the replacement does come up as the same osd number, but it’s not in the crush map, so I have to put it back where it belongs. It just seems strange that it must be removed from the crush map first.



Basically, I export the crush map, remove the osd from the crush map, and then redeploy the drive. Once it is up and running as the same osd number, I import the exported crush map to get it back in the cluster.
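
In terms of commands it is roughly the following (osd.12 and the file name are just placeholders):

> ceph osd getcrushmap -o crush.bin    # save the current compiled crush map
> ceph osd crush remove osd.12         # drop the failed osd from the crush map
> # ... pull the disk and let puppet prepare the replacement as osd.12 ...
> ceph osd setcrushmap -i crush.bin    # re-import the saved map to put osd.12 back in its old location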



I guess that is just how it has to be done.



Thanks again



Sent from Mail for Windows 10



From: Reed Dier <reed.dier@xxxxxxxxxxx>
Sent: Wednesday, September 14, 2016 1:39 PM
To: Jim Kilborn <jim@xxxxxxxxxxxx>
Cc: ceph-users@xxxxxxxxxxxxxx
Subject: Re: Replacing a failed OSD



Hi Jim,

This is pretty fresh in my mind so hopefully I can help you out here.

Firstly, ceph will backfill any existing holes in the OSD enumeration. So assuming only one OSD has been removed from the cluster, the replacement will come back with the same OSD number.

My steps for removing an OSD are run from the host node:

> ceph osd down osd.i                 # mark the osd down
> ceph osd out osd.i                  # mark it out so its data gets remapped
> stop ceph-osd id=i                  # stop the osd daemon
> umount /var/lib/ceph/osd/ceph-i     # unmount its data directory
> ceph osd crush remove osd.i         # remove it from the crush map
> ceph auth del osd.i                 # delete its cephx key
> ceph osd rm osd.i                   # remove it from the cluster


From here, the disk is removed from the ceph cluster and the crush map, and is ready for physical removal and replacement.

From there, I deploy the new osd with ceph-deploy from my admin node using:

> ceph-deploy disk list nodei                         # confirm the device name on the node
> ceph-deploy disk zap nodei:sdX                      # wipe the replacement disk
> ceph-deploy --overwrite-conf osd prepare nodei:sdX  # prepare the new osd on the disk


This will prepare the disk and insert it back into the crush map, bringing it back up and in. The OSD number should remain the same, as it will fill the gap left by the previous OSD removal.
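
A quick sanity check afterwards is to look at the osd tree and confirm the id came back and landed where you expect:

> ceph osd tree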

Hopefully this helps,

Reed

> On Sep 14, 2016, at 11:00 AM, Jim Kilborn <jim@xxxxxxxxxxxx> wrote:
>
> I am finishing testing our new cephfs cluster and wanted to document a failed osd procedure.
> I noticed that when I pulled a drive to simulate a failure and ran through the replacement steps, the osd has to be removed from the crush map in order to initialize the new drive as the same osd number.
>
> Is it correct that I have to remove it from the crush map, then add it back after the osd is initialized and mounted? Is there no way to have it reuse the same osd # without removing it from the crush map?
>
> Thanks for taking the time….
>
>
> -          Jim
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
