Frank,
Thank you for your suggestion. It sounds very promising. I will definitely try it.
Best,
On Tue, Oct 22, 2019, 2:44 AM Frank Schilder <frans@xxxxxx> wrote:
> I suspect that the mon and mgr containers have no access to /dev or /var/lib while the osd containers do.
> The cluster was originally configured by ceph-ansible (Nautilus 14.2.2).
They don't, because they don't need to.
> The question is: if I want to replace all disks on a single node (I have 6 nodes with pools at
> replication 3), is it safe to restart the mgr with /dev and /var/lib/ceph mounted as volumes (not configured right now)?
Restarting mons is safe in the sense that data will not get lost. However, access might get lost temporarily.
The question is, how many mons do you have? If you have only 1 or 2, it will mean downtime. If you can bear the downtime, it doesn't matter. If you have at least 3, you can restart one after the other.
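You can check that with the standard CLI from wherever you have an admin keyring, for example:

  ceph mon stat
  ceph quorum_status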
However, I would not do that. Having to restart a mon container every time some minor container config changes for reasons that have nothing to do with the mon sounds like asking for trouble.
I also use containers and would recommend a different approach. I created an additional type of container (ceph-adm) that I use for all admin tasks. It's the same image, and the entry point simply executes sleep infinity. In this container I make all relevant hardware visible. You might also want to expose /var/run/ceph to be able to use admin sockets without hassle. This way, admin operations are separated from the actual storage daemons, and I can modify and restart the admin container as I like.
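As a rough sketch only (assuming podman and the ceph/daemon image that ceph-ansible deploys; the name ceph-adm and the exact mounts are placeholders to adapt to your setup), such a container could be started like this:

  podman run -d --name ceph-adm \
      --privileged \
      -v /dev:/dev \
      -v /etc/ceph:/etc/ceph \
      -v /var/lib/ceph:/var/lib/ceph \
      -v /var/run/ceph:/var/run/ceph \
      --entrypoint sleep \
      docker.io/ceph/daemon infinity

Admin work then goes through that one container, e.g. podman exec -it ceph-adm ceph-volume lvm zap /dev/sdh.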
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: ceph-users <ceph-users-bounces@xxxxxxxxxxxxxx> on behalf of Alex Litvak <alexander.v.litvak@xxxxxxxxx>
Sent: 22 October 2019 08:04
To: ceph-users@xxxxxxxxxxxxxx
Subject: Replace ceph osd in a container
Hello cephers,
I am having trouble with a new hardware system showing strange OSD behavior, and I want to replace a disk with a brand-new one to test the theory.
I run all daemons in containers, and on one of the nodes I have a mon, a mgr, and 6 OSDs. Following https://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/#replacing-an-osd,
I stopped the container with osd.23, waited until it was down and out, ran the safe-to-destroy loop, and then destroyed the OSD, all using the monitor from the container on this node. All good.
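Roughly, the steps looked like this (illustrative only; the safe-to-destroy loop is the one from the docs, and the osd.23 container name is a placeholder for whatever your deployment calls it):

  # stop the osd container and wait until osd.23 is reported down and out
  podman stop <osd.23-container>
  podman exec -it ceph-mon-storage2n2-la ceph osd tree
  # once it is safe, destroy the osd while keeping its id for reuse
  while ! podman exec ceph-mon-storage2n2-la ceph osd safe-to-destroy osd.23 ; do sleep 60 ; done
  podman exec -it ceph-mon-storage2n2-la ceph osd destroy 23 --yes-i-really-mean-it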
Then I swapped the SSDs and started running the additional steps (from step 3) using the same mon container. I have no ceph packages installed on the bare-metal box. It looks like the mon container doesn't see the disk:
podman exec -it ceph-mon-storage2n2-la ceph-volume lvm zap /dev/sdh
stderr: lsblk: /dev/sdh: not a block device
stderr: error: /dev/sdh: No such file or directory
stderr: Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected.
usage: ceph-volume lvm zap [-h] [--destroy] [--osd-id OSD_ID]
[--osd-fsid OSD_FSID]
[DEVICES [DEVICES ...]]
ceph-volume lvm zap: error: Unable to proceed with non-existing device: /dev/sdh
Error: exit status 2
root@storage2n2-la:~# ls -l /dev/sd
sda sdc sdd sde sdf sdg sdg1 sdg2 sdg5 sdh
root@storage2n2-la:~# podman exec -it ceph-mon-storage2n2-la ceph-volume lvm zap sdh
stderr: lsblk: sdh: not a block device
stderr: error: sdh: No such file or directory
stderr: Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected.
usage: ceph-volume lvm zap [-h] [--destroy] [--osd-id OSD_ID]
[--osd-fsid OSD_FSID]
[DEVICES [DEVICES ...]]
ceph-volume lvm zap: error: Unable to proceed with non-existing device: sdh
Error: exit status 2
When I execute lsblk in the mon container, it does see device sdh:
root@storage2n2-la:~# podman exec -it ceph-mon-storage2n2-la lsblk
lsblk: dm-1: failed to get device path
lsblk: dm-2: failed to get device path
lsblk: dm-4: failed to get device path
lsblk: dm-6: failed to get device path
lsblk: dm-4: failed to get device path
lsblk: dm-2: failed to get device path
lsblk: dm-1: failed to get device path
lsblk: dm-0: failed to get device path
lsblk: dm-0: failed to get device path
lsblk: dm-7: failed to get device path
lsblk: dm-5: failed to get device path
lsblk: dm-7: failed to get device path
lsblk: dm-6: failed to get device path
lsblk: dm-5: failed to get device path
lsblk: dm-3: failed to get device path
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdf 8:80 0 1.8T 0 disk
sdd 8:48 0 1.8T 0 disk
sdg 8:96 0 223.5G 0 disk
|-sdg5 8:101 0 223G 0 part
|-sdg1 8:97 487M 0 part
`-sdg2 8:98 1K 0 part
sde 8:64 0 1.8T 0 disk
sdc 8:32 0 3.5T 0 disk
sda 8:0 0 3.5T 0 disk
sdh 8:112 0 3.5T 0 disk
So I used a fellow OSD container (osd.5) on the same node and ran all of the operations (zap and prepare) successfully.
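In other words, something along these lines worked (the osd.5 container name is a placeholder; these are steps 3 and 4 of the documented procedure, reusing the destroyed id):

  podman exec -it <osd.5-container> ceph-volume lvm zap /dev/sdh
  podman exec -it <osd.5-container> ceph-volume lvm prepare --osd-id 23 --data /dev/sdh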
I suspect that the mon and mgr containers have no access to /dev or /var/lib while the osd containers do. The cluster was originally configured by ceph-ansible (Nautilus 14.2.2).
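A quick way to confirm that would be to compare the bind mounts of the containers, for example (assuming podman's Go-template inspect output; adjust the container name as needed):

  podman inspect ceph-mon-storage2n2-la \
      --format '{{ range .Mounts }}{{ .Source }} -> {{ .Destination }}{{ "\n" }}{{ end }}'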
The question is: if I want to replace all disks on a single node (I have 6 nodes with pools at replication 3), is it safe to restart the mgr with /dev and /var/lib/ceph mounted as volumes (they are not configured right now)?
In that case I cannot use other OSD containers on the same box, because my controller reverts from RAID to non-RAID mode with all disks lost, not just a single one. So I need to replace all 6 OSDs and bring them back up in containers, and the only things that will remain operational on the node are the mon and mgr containers.
I would prefer not to install the full Ceph packages or a client on the bare-metal node if possible.
Thank you for your help,
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com