They were not.
After I changed the ownership manually, I was still unable to start the service.
Furthermore, a reboot reset the permissions again.
ls -al /dev/sda*
brw-rw---- 1 root disk 8, 0 Jan 3 11:10 /dev/sda
brw-rw---- 1 root disk 8, 1 Jan 3 11:10 /dev/sda1
brw-rw---- 1 root disk 8, 2 Jan 3 11:10 /dev/sda2
[root@osd01 ~]# chown ceph:ceph /dev/sda1
[root@osd01 ~]# chown ceph:ceph /dev/sda2
[root@osd01 ~]# ls -al /dev/sda*
brw-rw---- 1 root disk 8, 0 Jan 3 11:10 /dev/sda
brw-rw---- 1 ceph ceph 8, 1 Jan 3 11:10 /dev/sda1
brw-rw---- 1 ceph ceph 8, 2 Jan 3 11:10 /dev/sda2
[root@osd01 ~]# systemctl start ceph-osd@3
[root@osd01 ~]# systemctl status ceph-osd@3
● ceph-osd@3.service - Ceph object storage daemon osd.3
Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled-runtime; vendor preset: disabled)
Active: activating (auto-restart) (Result: exit-code) since Wed 2018-01-03 11:18:09 EST; 5s ago
Process: 3823 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph (code=exited, status=1/FAILURE)
Process: 3818 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
Main PID: 3823 (code=exited, status=1/FAILURE)
Jan 03 11:18:09 osd01.tor.medavail.net systemd[1]: Unit ceph-osd@3.service entered failed state.
Jan 03 11:18:09 osd01.tor.medavail.net systemd[1]: ceph-osd@3.service failed.
ceph-osd[3823]: 2018-01-03 11:18:08.515687 7fa55aec8d00 -1 bluestore(/var/lib/ceph/osd/ceph-3/block.db) _read_bdev_label unable to decode label at offset 102: buffer::malformed_input: void bluesto
ceph-osd[3823]: 2018-01-03 11:18:08.515710 7fa55aec8d00 -1 bluestore(/var/lib/ceph/osd/ceph-3) _open_db check block device(/var/lib/ceph/osd/ceph-3/block.db) label returned: (22) Invalid argument
This is very odd, as the server was working fine.
What is the proper procedure for replacing a failed SSD drive used by BlueStore?
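On the ownership being reset at reboot, here is a possible approach, offered as a sketch rather than a confirmed fix. A chown on a /dev node does not persist because udev recreates the nodes at boot, so a udev rule can reapply the ownership. The script below only prints candidate rules; it matches the DB and WAL partitions by the partition UUIDs taken from the block.db and block.wal links in the original message, assuming the recreated partitions still carry those UUIDs. The helper name make_rule and the rules file name mentioned afterwards are made up for illustration.

```shell
#!/bin/sh
# Print a udev rule that hands one partition to ceph:ceph.
# $1 is the GPT partition UUID (udev exposes it as ID_PART_ENTRY_UUID).
make_rule() {
    printf 'ACTION=="add|change", SUBSYSTEM=="block", ENV{ID_PART_ENTRY_UUID}=="%s", OWNER="ceph", GROUP="ceph", MODE="0660"\n' "$1"
}

# UUIDs copied from the symlink targets in /var/lib/ceph/osd/ceph-3/
make_rule 5f610ecb-cb78-44d3-b503-016840d33ff6   # block.db
make_rule 04d38ce7-c9e7-4648-a3f5-7b459e508109   # block.wal
```

If the output looks right, it could be saved as, say, /etc/udev/rules.d/99-ceph-db-wal.rules (a hypothetical name) and applied with "udevadm control --reload-rules" followed by "udevadm trigger". This only addresses ownership; it does not address the "unable to decode label" error shown above.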
On 3 January 2018 at 10:23, Sergey Malinin <hell@xxxxxxxxxxx> wrote:
Are actual devices (not only udev links) owned by user “ceph”?
From: ceph-users <ceph-users-bounces@lists.ceph.com > on behalf of Steven Vacaroaia <stef97@xxxxxxxxx>
Sent: Wednesday, January 3, 2018 6:19:45 PM
To: ceph-users
Subject: ceph luminous - SSD partitions disssapeared
Hi,
After a reboot, all the partitions created on the SSD drive disappeared.
They were used by the BlueStore DB and WAL, so the OSDs are down.
The following error messages are in /var/log/messages:
Jan 3 09:54:12 osd01 ceph-osd: 2018-01-03 09:54:12.992218 7f4b52b9ed00 -1 bluestore(/var/lib/ceph/osd/ceph-6) _open_db /var/lib/ceph/osd/ceph-6/block.db link target doesn't exist
Jan 3 09:54:12 osd01 ceph-osd: 2018-01-03 09:54:12.993231 7f7ad37b1d00 -1 bluestore(/var/lib/ceph/osd/ceph-5) _open_db /var/lib/ceph/osd/ceph-5/block.db link target doesn't exist
Then I decided to take this opportunity to "assume" a dead SSD and thus recreate the partitions.
I zapped /dev/sda and then used this guide to recreate the partition for ceph-3:
http://ceph.com/geen-categorie/ceph-recover-osds-after-ssd-journal-failure/
Unfortunately, it is now "complaining" about permissions, but they seem fine:
Jan 3 09:54:12 osd01 ceph-osd: 2018-01-03 09:54:12.992120 7f74003d1d00 -1 bdev(0x562336677800 /var/lib/ceph/osd/ceph-3/block.db) open open got: (13) Permission denied
Jan 3 09:54:12 osd01 ceph-osd: 2018-01-03 09:54:12.992131 7f74003d1d00 -1 bluestore(/var/lib/ceph/osd/ceph-3) _open_db add block device(/var/lib/ceph/osd/ceph-3/block.db) returned: (13) Permission denied
ls -al /var/lib/ceph/osd/ceph-3/
total 60
drwxr-xr-x 2 ceph ceph 310 Jan 2 16:39 .
drwxr-x---. 7 ceph ceph 131 Jan 2 16:39 ..
-rw-r--r-- 1 root root 183 Jan 2 16:39 activate.monmap
-rw-r--r-- 1 ceph ceph 3 Jan 2 16:39 active
lrwxrwxrwx 1 ceph ceph 58 Jan 2 16:32 block -> /dev/disk/by-partuuid/13560618-5942-4c7e-922a-1fafddb4a4d2
lrwxrwxrwx 1 ceph ceph 58 Jan 2 16:32 block.db -> /dev/disk/by-partuuid/5f610ecb-cb78-44d3-b503-016840d33ff6
-rw-r--r-- 1 ceph ceph 37 Jan 2 16:32 block.db_uuid
-rw-r--r-- 1 ceph ceph 37 Jan 2 16:32 block_uuid
lrwxrwxrwx 1 ceph ceph 58 Jan 2 16:32 block.wal -> /dev/disk/by-partuuid/04d38ce7-c9e7-4648-a3f5-7b459e508109
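Note that this listing shows the owner of the symlinks themselves, not of the block devices they point to. A minimal sketch for checking the resolved devices follows; it assumes GNU coreutils (readlink -f, stat -c), and the helper name check_owner is made up for illustration.

```shell
#!/bin/sh
# Resolve a path through any symlinks and print "owner:group resolved-path".
check_owner() {
    target=$(readlink -f "$1") || return 1
    printf '%s %s\n' "$(stat -c '%U:%G' "$target")" "$target"
}

# Check the three BlueStore links of osd.3; links that do not resolve are skipped.
for link in /var/lib/ceph/osd/ceph-3/block \
            /var/lib/ceph/osd/ceph-3/block.db \
            /var/lib/ceph/osd/ceph-3/block.wal; do
    if [ -e "$link" ]; then
        check_owner "$link"
    fi
done
```

Any line that does not start with ceph:ceph would point at a device node the OSD, which runs with --setuser ceph --setgroup ceph, may be unable to open.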
Has anyone had to deal with a similar issue?
How do I fix the permissions?
What is the proper procedure for dealing with a "dead" SSD?