OSDs failed to start after host reboot | Cephadm

Hello all!

Linked Stack Overflow post: https://stackoverflow.com/questions/75101087/cephadm-ceph-osd-fails-to-start-after-reboot-of-host

A couple of weeks ago I deployed a new Ceph cluster using Cephadm. It is a three-node cluster (node1, node2, and node3) with 6 OSDs each: 6x 18 TB Seagate hard drives per host, with a 2 TB NVMe drive set as the DB device. Everything had been running smoothly until today, when I went to perform maintenance on one of the nodes. I first moved all of the services off the host and put it into maintenance mode. I then made some changes to one of the NICs and ran updates. After the updates were done, I rebooted the machine. This is when the issue occurred.
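For reference, the preparation steps were roughly the following (typed from memory, so the exact syntax may be slightly off):

# stop the Ceph daemons on node1 and put the host into maintenance mode
ceph orch host maintenance enter node1
# ...then the NIC changes, package updates, and reboot...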

When the node (node1) finished rebooting, it was still showing as offline in the Ceph dashboard, so from one of the other hosts I ran `ceph orch host rescan node1` and it came back online in the dashboard. I’ve seen this before when I’ve had to reboot hosts, so no big deal so far.

However, after a couple of minutes the OSDs on that host still hadn’t come online. I then checked the status of the services with `systemctl | grep ceph` and saw that all of the OSDs had failed.
# systemctl status ceph-0a7ec2ae-816d-11ed-9791-97c1d8fb9dc6@osd.0.service
× ceph-0a7ec2ae-816d-11ed-9791-97c1d8fb9dc6@osd.0.service - Ceph osd.0 for 0a7ec2ae-816d-11ed-9791-97c1d8fb9dc6
     Loaded: loaded (/etc/systemd/system/ceph-0a7ec2ae-816d-11ed-9791-97c1d8fb9dc6@.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Thu 2023-01-12 18:14:27 UTC; 1h 42min ago
   Main PID: 385982 (code=exited, status=1/FAILURE)
        CPU: 292ms

Jan 12 19:48:30 node1 systemd[1]: /etc/systemd/system/ceph-0a7ec2ae-816d-11ed-9791-97c1d8fb9dc6@.service:24: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer Kill

The units had hit their restart limit, so I had to run `systemctl reset-failed`, and then I tried restarting the OSDs by running `systemctl restart ceph.target`. I watched the services try to start, but they kept failing.
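For completeness, the exact commands I ran there were roughly (from memory):

# clear the start-limit state and try the OSDs again
systemctl reset-failed
systemctl restart ceph.target

# then watch one of the OSD units
journalctl -fu ceph-0a7ec2ae-816d-11ed-9791-97c1d8fb9dc6@osd.0.service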

This was the output of /var/log/ceph/<fsid>/ceph-osd.0.log:
2023-01-12T18:12:06.501+0000 7fb5d3b1e3c0  0 set uid:gid to 167:167 (ceph:ceph)
2023-01-12T18:12:06.501+0000 7fb5d3b1e3c0  0 ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable), process ceph-osd, pid 7
2023-01-12T18:12:06.501+0000 7fb5d3b1e3c0  0 pidfile_write: ignore empty --pid-file
2023-01-12T18:12:06.505+0000 7fb5d3b1e3c0  1 bdev(0x5591e1f87400 /var/lib/ceph/osd/ceph-0/block) open path /var/lib/ceph/osd/ceph-0/block
2023-01-12T18:12:06.505+0000 7fb5d3b1e3c0  1 bdev(0x5591e1f87400 /var/lib/ceph/osd/ceph-0/block) open size 20000584761344 (0x1230bfc00000, 18 TiB) block_size 4096 (4 KiB) rotational discard not supported
2023-01-12T18:12:06.505+0000 7fb5d3b1e3c0  1 bluestore(/var/lib/ceph/osd/ceph-0) _set_cache_sizes cache_size 1073741824 meta 0.45 kv 0.45 data 0.06
2023-01-12T18:12:06.505+0000 7fb5d3b1e3c0  1 bdev(0x5591e1f86c00 /var/lib/ceph/osd/ceph-0/block.db) open path /var/lib/ceph/osd/ceph-0/block.db
2023-01-12T18:12:06.505+0000 7fb5d3b1e3c0  1 bdev(0x5591e1f86c00 /var/lib/ceph/osd/ceph-0/block.db) open size 333396836352 (0x4da0000000, 310 GiB) block_size 4096 (4 KiB) non-rotational discard supported
2023-01-12T18:12:06.505+0000 7fb5d3b1e3c0  1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/ceph-0/block.db size 310 GiB
2023-01-12T18:12:06.513+0000 7fb5d3b1e3c0  1 bdev(0x5591e1f86800 /var/lib/ceph/osd/ceph-0/block) open path /var/lib/ceph/osd/ceph-0/block
2023-01-12T18:12:06.513+0000 7fb5d3b1e3c0  1 bdev(0x5591e1f86800 /var/lib/ceph/osd/ceph-0/block) open size 20000584761344 (0x1230bfc00000, 18 TiB) block_size 4096 (4 KiB) rotational discard not supported
2023-01-12T18:12:06.513+0000 7fb5d3b1e3c0  1 bluefs add_block_device bdev 2 path /var/lib/ceph/osd/ceph-0/block size 18 TiB
2023-01-12T18:12:06.513+0000 7fb5d3b1e3c0  1 bdev(0x5591e1f86c00 /var/lib/ceph/osd/ceph-0/block.db) close
2023-01-12T18:12:06.817+0000 7fb5d3b1e3c0  1 bdev(0x5591e1f86800 /var/lib/ceph/osd/ceph-0/block) close
2023-01-12T18:12:07.085+0000 7fb5d3b1e3c0  1 bdev(0x5591e1f87400 /var/lib/ceph/osd/ceph-0/block) close
2023-01-12T18:12:07.305+0000 7fb5d3b1e3c0  0 starting osd.0 osd_data /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
2023-01-12T18:12:07.321+0000 7fb5d3b1e3c0  0 load: jerasure load: lrc 
2023-01-12T18:12:07.321+0000 7fb5d3b1e3c0  1 bdev(0x5591e2d8e000 /var/lib/ceph/osd/ceph-0/block) open path /var/lib/ceph/osd/ceph-0/block
2023-01-12T18:12:07.321+0000 7fb5d3b1e3c0 -1 bdev(0x5591e2d8e000 /var/lib/ceph/osd/ceph-0/block) open open got: (13) Permission denied
2023-01-12T18:12:07.321+0000 7fb5d3b1e3c0  1 bdev(0x5591e2d8e000 /var/lib/ceph/osd/ceph-0/block) open path /var/lib/ceph/osd/ceph-0/block
2023-01-12T18:12:07.321+0000 7fb5d3b1e3c0 -1 bdev(0x5591e2d8e000 /var/lib/ceph/osd/ceph-0/block) open open got: (13) Permission denied
2023-01-12T18:12:07.321+0000 7fb5d3b1e3c0  1 mClockScheduler: set_max_osd_capacity #op shards: 5 max osd capacity(iops) per shard: 863.20
2023-01-12T18:12:07.321+0000 7fb5d3b1e3c0  1 mClockScheduler: set_osd_mclock_cost_per_io osd_mclock_cost_per_io: 0.0250000
2023-01-12T18:12:07.321+0000 7fb5d3b1e3c0  1 mClockScheduler: set_osd_mclock_cost_per_byte osd_mclock_cost_per_byte: 0.0000052
2023-01-12T18:12:07.321+0000 7fb5d3b1e3c0  1 mClockScheduler: set_mclock_profile mclock profile: high_client_ops
2023-01-12T18:12:07.321+0000 7fb5d3b1e3c0  0 osd.0:0.OSDShard using op scheduler mClockScheduler
2023-01-12T18:12:07.321+0000 7fb5d3b1e3c0  1 bdev(0x5591e2d8e000 /var/lib/ceph/osd/ceph-0/block) open path /var/lib/ceph/osd/ceph-0/block
2023-01-12T18:12:07.321+0000 7fb5d3b1e3c0 -1 bdev(0x5591e2d8e000 /var/lib/ceph/osd/ceph-0/block) open open got: (13) Permission denied
2023-01-12T18:12:07.321+0000 7fb5d3b1e3c0  1 mClockScheduler: set_max_osd_capacity #op shards: 5 max osd capacity(iops) per shard: 863.20
2023-01-12T18:12:07.321+0000 7fb5d3b1e3c0  1 mClockScheduler: set_osd_mclock_cost_per_io osd_mclock_cost_per_io: 0.0250000
2023-01-12T18:12:07.321+0000 7fb5d3b1e3c0  1 mClockScheduler: set_osd_mclock_cost_per_byte osd_mclock_cost_per_byte: 0.0000052
2023-01-12T18:12:07.321+0000 7fb5d3b1e3c0  1 mClockScheduler: set_mclock_profile mclock profile: high_client_ops
2023-01-12T18:12:07.321+0000 7fb5d3b1e3c0  0 osd.0:1.OSDShard using op scheduler mClockScheduler
2023-01-12T18:12:07.321+0000 7fb5d3b1e3c0  1 bdev(0x5591e2d8e000 /var/lib/ceph/osd/ceph-0/block) open path /var/lib/ceph/osd/ceph-0/block
2023-01-12T18:12:07.321+0000 7fb5d3b1e3c0 -1 bdev(0x5591e2d8e000 /var/lib/ceph/osd/ceph-0/block) open open got: (13) Permission denied
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0  1 mClockScheduler: set_max_osd_capacity #op shards: 5 max osd capacity(iops) per shard: 863.20
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0  1 mClockScheduler: set_osd_mclock_cost_per_io osd_mclock_cost_per_io: 0.0250000
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0  1 mClockScheduler: set_osd_mclock_cost_per_byte osd_mclock_cost_per_byte: 0.0000052
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0  1 mClockScheduler: set_mclock_profile mclock profile: high_client_ops
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0  0 osd.0:2.OSDShard using op scheduler mClockScheduler
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0  1 bdev(0x5591e2d8e000 /var/lib/ceph/osd/ceph-0/block) open path /var/lib/ceph/osd/ceph-0/block
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0 -1 bdev(0x5591e2d8e000 /var/lib/ceph/osd/ceph-0/block) open open got: (13) Permission denied
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0  1 mClockScheduler: set_max_osd_capacity #op shards: 5 max osd capacity(iops) per shard: 863.20
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0  1 mClockScheduler: set_osd_mclock_cost_per_io osd_mclock_cost_per_io: 0.0250000
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0  1 mClockScheduler: set_osd_mclock_cost_per_byte osd_mclock_cost_per_byte: 0.0000052
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0  1 mClockScheduler: set_mclock_profile mclock profile: high_client_ops
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0  0 osd.0:3.OSDShard using op scheduler mClockScheduler
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0  1 bdev(0x5591e2d8e000 /var/lib/ceph/osd/ceph-0/block) open path /var/lib/ceph/osd/ceph-0/block
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0 -1 bdev(0x5591e2d8e000 /var/lib/ceph/osd/ceph-0/block) open open got: (13) Permission denied
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0  1 mClockScheduler: set_max_osd_capacity #op shards: 5 max osd capacity(iops) per shard: 863.20
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0  1 mClockScheduler: set_osd_mclock_cost_per_io osd_mclock_cost_per_io: 0.0250000
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0  1 mClockScheduler: set_osd_mclock_cost_per_byte osd_mclock_cost_per_byte: 0.0000052
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0  1 mClockScheduler: set_mclock_profile mclock profile: high_client_ops
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0  0 osd.0:4.OSDShard using op scheduler mClockScheduler
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0 -1 bluestore(/var/lib/ceph/osd/ceph-0/block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-0/block: (13) Permission denied
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0 -1 bluestore(/var/lib/ceph/osd/ceph-0/block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-0/block: (13) Permission denied
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0  1 bdev(0x5591e2d8e000 /var/lib/ceph/osd/ceph-0/block) open path /var/lib/ceph/osd/ceph-0/block
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0 -1 bdev(0x5591e2d8e000 /var/lib/ceph/osd/ceph-0/block) open open got: (13) Permission denied
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0 -1 osd.0 0 OSD:init: unable to mount object store
2023-01-12T18:12:07.325+0000 7fb5d3b1e3c0 -1 ** ERROR: osd init failed: (13) Permission denied

Judging by the final error, it looks like some sort of permissions issue with mounting the volume into the container. I did notice that on the other two hosts, node2 and node3, which I have not yet rebooted since deploying Ceph with cephadm, there were more Docker overlays mounted when I ran the `mount` command. My theory is that the LVM volumes backing the OSDs are not being mounted/activated at boot. Otherwise it might be that the user Ceph passes to the containers is not allowed to mount the volumes for some reason.
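In case it helps narrow this down, here is roughly what I was planning to check next on node1 (the fsid is the real one from the unit name above; the device-mapper details are placeholders since I don’t have the output in front of me):

# where the OSD's block symlink points, and who owns the underlying device now
ls -l  /var/lib/ceph/0a7ec2ae-816d-11ed-9791-97c1d8fb9dc6/osd.0/block
ls -lL /var/lib/ceph/0a7ec2ae-816d-11ed-9791-97c1d8fb9dc6/osd.0/block

# whether ceph-volume still sees the LVs on this host
cephadm ceph-volume lvm list

If the underlying /dev/dm-* node turns out to be owned by root instead of 167:167 (ceph:ceph inside the container) after the reboot, I assume something like `chown 167:167 /dev/dm-<N>` (or re-running the OSD activation) would get it going again, but I’m not sure what the proper way to do that is under cephadm.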

I’ve looked through most of the docs and forums I could find and haven’t found a solution. I’d say I’m fairly experienced with Linux (5+ years), but I’m new to Ceph (~6 months) and I haven’t emailed this list before. Sorry in advance if I’ve mistakenly broken any rules, and thanks for the help!



- Ben M
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



