Re: Ceph Maintenance

I was able to bring the OSDs up by looking at my other OSD node, which has the exact same hardware/disks, and working out which disks map to which OSDs. But I still can't start any of the ceph-disk@dev-sd* services... When I first installed the cluster and got the OSDs up, I had to run the following:

# sgdisk -t 1:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdb
# sgdisk -t 2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdb
# sgdisk -t 3:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdb
# sgdisk -t 4:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdb
# sgdisk -t 5:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdb
# sgdisk -t 1:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdc
# sgdisk -t 2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdc
# sgdisk -t 3:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdc
# sgdisk -t 4:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdc
# sgdisk -t 5:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdc


Do I need to run that again?
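One way I can check whether those type codes are still set before rerunning them (just a sketch, assuming sgdisk is still installed and the device names haven't shifted):

for dev in /dev/sdb /dev/sdc; do
  for part in 1 2 3 4 5; do
    echo "== ${dev} partition ${part} =="
    sgdisk -i "${part}" "${dev}" | grep -i 'partition guid code'
  done
done

If each partition still reports 45b0969e-9b03-4f30-b4c6-b4b80ceff106 as its type GUID, the sgdisk -t commands above presumably don't need to be run again.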


Cheers,

Mike


On Tue, Nov 29, 2016 at 4:13 PM, Sean Redmond <sean.redmond1@xxxxxxxxx> wrote:

Normally they mount based on the GPT label. If that isn't working, you can mount the disk under /mnt and then cat the file called whoami to find out the OSD number.
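For example (just a sketch; substitute whichever data partition you are probing, and any mount point you like):

mkdir -p /mnt/osd-probe
mount /dev/sdd1 /mnt/osd-probe      # one of the OSD data partitions
cat /mnt/osd-probe/whoami           # prints the OSD id, e.g. 3
umount /mnt/osd-probe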


On 29 Nov 2016 23:56, "Mike Jacobacci" <mikej@xxxxxxxxxx> wrote:
OK, I am in some trouble now and would love some help! After updating, none of the OSDs on the node will come back up:

● ceph-disk@dev-sdb1.service                                              loaded failed failed    Ceph disk activation: /dev/sdb1
● ceph-disk@dev-sdb2.service                                              loaded failed failed    Ceph disk activation: /dev/sdb2
● ceph-disk@dev-sdb3.service                                              loaded failed failed    Ceph disk activation: /dev/sdb3
● ceph-disk@dev-sdb4.service                                              loaded failed failed    Ceph disk activation: /dev/sdb4
● ceph-disk@dev-sdb5.service                                              loaded failed failed    Ceph disk activation: /dev/sdb5
● ceph-disk@dev-sdc1.service                                              loaded failed failed    Ceph disk activation: /dev/sdc1
● ceph-disk@dev-sdc2.service                                              loaded failed failed    Ceph disk activation: /dev/sdc2
● ceph-disk@dev-sdc3.service                                              loaded failed failed    Ceph disk activation: /dev/sdc3
● ceph-disk@dev-sdc4.service                                              loaded failed failed    Ceph disk activation: /dev/sdc4
● ceph-disk@dev-sdc5.service                                              loaded failed failed    Ceph disk activation: /dev/sdc5
● ceph-disk@dev-sdd1.service                                              loaded failed failed    Ceph disk activation: /dev/sdd1
● ceph-disk@dev-sde1.service                                              loaded failed failed    Ceph disk activation: /dev/sde1
● ceph-disk@dev-sdf1.service                                              loaded failed failed    Ceph disk activation: /dev/sdf1
● ceph-disk@dev-sdg1.service                                              loaded failed failed    Ceph disk activation: /dev/sdg1
● ceph-disk@dev-sdh1.service                                              loaded failed failed    Ceph disk activation: /dev/sdh1
● ceph-disk@dev-sdi1.service                                              loaded failed failed    Ceph disk activation: /dev/sdi1
● ceph-disk@dev-sdj1.service                                              loaded failed failed    Ceph disk activation: /dev/sdj1
● ceph-disk@dev-sdk1.service                                              loaded failed failed    Ceph disk activation: /dev/sdk1
● ceph-disk@dev-sdl1.service                                              loaded failed failed    Ceph disk activation: /dev/sdl1
● ceph-disk@dev-sdm1.service                                              loaded failed failed    Ceph disk activation: /dev/sdm1
● ceph-osd@0.service                                                      loaded failed failed    Ceph object storage daemon
● ceph-osd@1.service                                                      loaded failed failed    Ceph object storage daemon
● ceph-osd@2.service                                                      loaded failed failed    Ceph object storage daemon
● ceph-osd@3.service                                                      loaded failed failed    Ceph object storage daemon
● ceph-osd@4.service                                                      loaded failed failed    Ceph object storage daemon
● ceph-osd@5.service                                                      loaded failed failed    Ceph object storage daemon
● ceph-osd@6.service                                                      loaded failed failed    Ceph object storage daemon
● ceph-osd@7.service                                                      loaded failed failed    Ceph object storage daemon
● ceph-osd@8.service                                                      loaded failed failed    Ceph object storage daemon
● ceph-osd@9.service                                                      loaded failed failed    Ceph object storage daemon

I did some searching and saw that the issue is that the disks aren't mounting... My question is: how can I mount them correctly again (note sdb and sdc are SSDs used for cache)? I am not sure which disk maps to ceph-osd@0 and so on. Also, can I add them to /etc/fstab as a workaround?
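Since the OSDs were deployed with ceph-disk (the ceph-disk@dev-* units above), I suspect something like the following would show the mapping and re-trigger the mounts without needing /etc/fstab; I'm treating this as a sketch rather than a verified recipe, and /dev/sdd1 is just an example device:

ceph-disk list                   # should show each data partition with its osd id and journal, e.g. "ceph data, active, osd.0, journal /dev/sdb1"
ceph-disk activate /dev/sdd1     # mounts the data partition and starts the matching ceph-osd@N
ceph-disk activate-all           # or re-activate every prepared partition in one go

Normally udev calls ceph-disk activate for partitions carrying the Ceph GPT type GUIDs, which is why these mounts don't usually appear in /etc/fstab.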

Cheers,
Mike

On Tue, Nov 29, 2016 at 10:41 AM, Mike Jacobacci <mikej@xxxxxxxxxx> wrote:
Hello,

I would like to install OS updates on the Ceph cluster and activate a second 10 Gb port on the OSD nodes, so I wanted to verify the correct steps for performing maintenance on the cluster. We are only using RBD to back our XenServer VMs at this point, and our cluster consists of 3 OSD nodes, 3 MON nodes, and 1 admin node. Would these be the correct steps:

1. Shut down the VMs?
2. Run "ceph osd set noout" on the admin node.
3. Install updates on each monitor node and reboot them one at a time.
4. Install updates on the OSD nodes and activate the second 10 Gb port, rebooting one OSD node at a time.
5. Once all nodes are back up, run "ceph osd unset noout".
6. Bring the VMs back online.

Does this sound correct?
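For the flag handling itself, I assume it would look roughly like this, run from the admin node (where the client.admin keyring lives); sketch only:

ceph osd set noout       # keep CRUSH from marking rebooting OSDs out and triggering rebalancing
ceph -s                  # after each node reboots, wait until all OSDs are up/in and PGs are active+clean
ceph osd unset noout     # clear the flag once every node is back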


Cheers,
Mike



_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


