I saw this issue when I first created our Luminous cluster. I use a custom systemd service to chown the DB and WAL partitions before the ceph-osd services start. The script in /usr/local/sbin just does the chowning.
# This is a workaround to chown the rocksdb and wal partitions
# for ceph-osd on nvme, because ceph-disk currently does not
# chown them to ceph:ceph so OSDs can't come up at OS startup
[Unit]
Description=Chown rocksdb and wal partitions on NVMe workaround
ConditionFileIsExecutable=/usr/local/sbin/ceph-nvme.sh
After=local-fs.target
[Service]
Type=oneshot
ExecStart=/usr/local/sbin/ceph-nvme.sh
TimeoutSec=0
[Install]
WantedBy=multi-user.target
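The ceph-nvme.sh helper itself wasn't posted; a minimal sketch of what it might contain, assuming the DB/WAL partitions are /dev/nvme0n1p1 and /dev/nvme0n1p2 (substitute your own device paths):

```shell
#!/bin/sh
# Chown the BlueStore DB/WAL partitions so ceph-osd (running as ceph:ceph)
# can open them at boot. The device paths below are illustrative.
PARTS="/dev/nvme0n1p1 /dev/nvme0n1p2"

for dev in $PARTS; do
    # Skip partitions that are absent on this host.
    [ -b "$dev" ] || continue
    chown ceph:ceph "$dev"
done
```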
From: ceph-users <ceph-users-bounces@xxxxxxxxxxxxxx> on behalf of Sergey Malinin <hell@xxxxxxxxxxx>
Sent: Thursday, 4 January 2018 10:56:13 AM
To: Steven Vacaroaia
Cc: ceph-users
Subject: Re: ceph luminous - SSD partitions disssapeared

To make device ownership persist over reboots, you can set up udev rules.
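A udev rule along these lines would reassert ownership on every boot and hotplug; the kernel names here are illustrative, so match on whatever reliably identifies your DB/WAL partitions (e.g. ENV{ID_PART_ENTRY_UUID} for a PARTUUID):

```
# /etc/udev/rules.d/99-ceph-wal-db.rules (illustrative)
# Give ceph:ceph ownership of the DB/WAL partitions as they appear.
KERNEL=="nvme0n1p1", OWNER="ceph", GROUP="ceph", MODE="0660"
KERNEL=="nvme0n1p2", OWNER="ceph", GROUP="ceph", MODE="0660"
```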
The article you referenced seems to have nothing to do with BlueStore. When you zapped /dev/sda, you zapped the BlueStore metadata stored on the DB partition, so the newly created partitions, if they were created separately from the block storage, are no longer relevant; that is why the OSD daemon throws an error.
From: Steven Vacaroaia <stef97@xxxxxxxxx>
Sent: Wednesday, January 3, 2018 7:20:12 PM
To: Sergey Malinin
Cc: ceph-users
Subject: Re: ceph luminous - SSD partitions disssapeared

They were not.
After I changed them manually, I was still unable to start the service.
Furthermore, a reboot screwed up the permissions again.
ls -al /dev/sda*
brw-rw---- 1 root disk 8, 0 Jan 3 11:10 /dev/sda
brw-rw---- 1 root disk 8, 1 Jan 3 11:10 /dev/sda1
brw-rw---- 1 root disk 8, 2 Jan 3 11:10 /dev/sda2
[root@osd01 ~]# chown ceph:ceph /dev/sda1
[root@osd01 ~]# chown ceph:ceph /dev/sda2
[root@osd01 ~]# ls -al /dev/sda*
brw-rw---- 1 root disk 8, 0 Jan 3 11:10 /dev/sda
brw-rw---- 1 ceph ceph 8, 1 Jan 3 11:10 /dev/sda1
brw-rw---- 1 ceph ceph 8, 2 Jan 3 11:10 /dev/sda2
[root@osd01 ~]# systemctl start ceph-osd@3
[root@osd01 ~]# systemctl status ceph-osd@3
● ceph-osd@3.service - Ceph object storage daemon osd.3
Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled-runtime; vendor preset: disabled)
Active: activating (auto-restart) (Result: exit-code) since Wed 2018-01-03 11:18:09 EST; 5s ago
Process: 3823 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph (code=exited, status=1/FAILURE)
Process: 3818 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
Main PID: 3823 (code=exited, status=1/FAILURE)
Jan 03 11:18:09 osd01.tor.medavail.net systemd[1]: Unit ceph-osd@3.service entered failed state.
Jan 03 11:18:09 osd01.tor.medavail.net systemd[1]: ceph-osd@3.service failed.
ceph-osd[3823]: 2018-01-03 11:18:08.515687 7fa55aec8d00 -1 bluestore(/var/lib/ceph/osd/ceph-3/block.db) _read_bdev_label unable to decode label at offset 102: buffer::malformed_input: void bluesto
ceph-osd[3823]: 2018-01-03 11:18:08.515710 7fa55aec8d00 -1 bluestore(/var/lib/ceph/osd/ceph-3) _open_db check block device(/var/lib/ceph/osd/ceph-3/block.db) label returned: (22) Invalid argument
This is very odd, as the server was working fine.
What is the proper procedure for replacing a failed SSD drive used by BlueStore?
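Before rebuilding the OSD, the damaged label can be inspected directly; a sketch, assuming ceph-bluestore-tool (shipped with Luminous) is on the path and the OSD's data directory matches the log above:

```shell
# Dump the BlueStore label from the DB partition. A healthy label decodes
# to JSON; a zapped one reproduces the "unable to decode label" error.
DEV=/var/lib/ceph/osd/ceph-3/block.db
if command -v ceph-bluestore-tool >/dev/null 2>&1; then
    ceph-bluestore-tool show-label --dev "$DEV"
else
    echo "ceph-bluestore-tool not available on this host"
fi
```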
On 3 January 2018 at 10:23, Sergey Malinin
<hell@xxxxxxxxxxx> wrote:
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com