Bluestore created with 12.2.10/luminous.
The OSD startup generates logs like:
2019-07-24 12:39:46.483 7f4b27649d80 0 set uid:gid to 167:167 (ceph:ceph)
2019-07-24 12:39:46.483 7f4b27649d80 0 ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable), process ceph-osd, pid 48553
2019-07-24 12:39:46.483 7f4b27649d80 0 pidfile_write: ignore empty --pid-file
2019-07-24 12:39:46.483 7f4b27649d80 0 set uid:gid to 167:167 (ceph:ceph)
2019-07-24 12:39:46.483 7f4b27649d80 0 ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable), process ceph-osd, pid 48553
2019-07-24 12:39:46.483 7f4b27649d80 0 pidfile_write: ignore empty --pid-file
2019-07-24 12:39:46.505 7f4b27649d80 -1 bluestore(/var/lib/ceph/osd/ceph-0/block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-0/block: (2) No such file or directory
2019-07-24 12:39:46.505 7f4b27649d80 -1 ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-0: (2) No such file or directory
2019-07-24 12:39:46.505 7f4b27649d80 -1 bluestore(/var/lib/ceph/osd/ceph-0/block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-0/block: (2) No such file or directory
2019-07-24 12:39:46.505 7f4b27649d80 -1 bluestore(/var/lib/ceph/osd/ceph-0/block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-0/block: (2) No such file or directory
2019-07-24 12:39:46.505 7f4b27649d80 -1 ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-0: (2) No such file or directory
2019-07-24 12:39:46.505 7f4b27649d80 -1 ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-0: (2) No such file or directory
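Those errors look consistent with /var/lib/ceph/osd/ceph-0 being an empty directory on the root filesystem rather than the small encrypted data partition that is normally mounted there. A couple of checks I'm planning to use (the mapper name below is a placeholder for whatever luksOpen creates):

mount | grep /var/lib/ceph/osd/ceph-0    # nothing mounted -> it's just an empty dir
ls -la /var/lib/ceph/osd/ceph-0
# once the big encrypted partition is mapped again, the bluestore label should
# still be readable:
ceph-bluestore-tool show-label --dev /dev/mapper/<block-partition-uuid>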
-----
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 1.7T 0 disk
├─sda1 8:1 0 100M 0 part
├─sda2 8:2 0 1.7T 0 part
└─sda5 8:5 0 10M 0 part
sdb 8:16 0 1.7T 0 disk
├─sdb1 8:17 0 100M 0 part
├─sdb2 8:18 0 1.7T 0 part
└─sdb5 8:21 0 10M 0 part
sdc 8:32 0 1.7T 0 disk
├─sdc1 8:33 0 100M 0 part
...
I'm thinking the OSDs would start (I can recreate the .service definitions with systemctl) if the above were mounted the way they are on another of my hosts; a sketch of what I mean follows the listing below:
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 1.7T 0 disk
├─sda1 8:1 0 100M 0 part
│ └─97712be4-1234-4acc-8102-2265769053a5 253:17 0 98M 0 crypt /var/lib/ceph/osd/ceph-16
├─sda2 8:2 0 1.7T 0 part
│ └─049b7160-1234-4edd-a5dc-fe00faca8d89 253:16 0 1.7T 0 crypt
└─sda5 8:5 0 10M 0 part /var/lib/ceph/osd-lockbox/97712be4-9674-4acc-1234-2265769053a5
sdb 8:16 0 1.7T 0 disk
├─sdb1 8:17 0 100M 0 part
│ └─f03f0298-1234-42e9-8b28-f3016e44d1e2 253:26 0 98M 0 crypt /var/lib/ceph/osd/ceph-17
├─sdb2 8:18 0 1.7T 0 part
│ └─51177019-1234-4963-82d1-5006233f5ab2 253:30 0 1.7T 0 crypt
└─sdb5 8:21 0 10M 0 part /var/lib/ceph/osd-lockbox/f03f0298-1234-42e9-8b28-f3016e44d1e2
sdc 8:32 0 1.7T 0 disk
├─sdc1 8:33 0 100M 0 part
│ └─0184df0c-1234-404d-92de-cb71b1047abf 253:8 0 98M 0 crypt /var/lib/ceph/osd/ceph-18
├─sdc2 8:34 0 1.7T 0 part
│ └─fdad7618-1234-4021-a63e-40d973712e7b 253:13 0 1.7T 0 crypt
...
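In case it helps, here is roughly what I believe ceph-disk used to do per encrypted OSD at activation time, written out for one drive. The config-key path (dm-crypt/osd/<fsid>/luks), the client.osd-lockbox.<fsid> name, and the file names inside the lockbox are my recollection of the ceph-disk conventions, so please treat this as a sketch to verify rather than a procedure:

DATA_PART=/dev/sda1      # 100M partition -> mounts at /var/lib/ceph/osd/ceph-$ID
BLOCK_PART=/dev/sda2     # 1.7T partition -> bluestore block device
LOCKBOX_PART=/dev/sda5   # 10M lockbox partition
ID=16                    # placeholder OSD id

DATA_UUID=$(blkid -o value -s PARTUUID $DATA_PART)
BLOCK_UUID=$(blkid -o value -s PARTUUID $BLOCK_PART)

# 1) mount the lockbox; it holds a keyring and the OSD fsid
mkdir -p /var/lib/ceph/osd-lockbox/$DATA_UUID
mount $LOCKBOX_PART /var/lib/ceph/osd-lockbox/$DATA_UUID
OSD_FSID=$(cat /var/lib/ceph/osd-lockbox/$DATA_UUID/osd-uuid)

# 2) fetch the LUKS key from the mon config-key store using the lockbox keyring
ceph --name client.osd-lockbox.$OSD_FSID \
     --keyring /var/lib/ceph/osd-lockbox/$DATA_UUID/keyring \
     config-key get dm-crypt/osd/$OSD_FSID/luks > /tmp/$OSD_FSID.key

# 3) open both LUKS partitions; the mapper names need to match the partition
#    UUIDs because the 'block' symlink in the data dir points at /dev/mapper/<uuid>
cryptsetup --key-file /tmp/$OSD_FSID.key luksOpen $DATA_PART  $DATA_UUID
cryptsetup --key-file /tmp/$OSD_FSID.key luksOpen $BLOCK_PART $BLOCK_UUID

# 4) mount the data partition where ceph-osd expects it and start the daemon
mkdir -p /var/lib/ceph/osd/ceph-$ID
mount /dev/mapper/$DATA_UUID /var/lib/ceph/osd/ceph-$ID
systemctl start ceph-osd@$ID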
Thank you for your time on this,
peter
From: Xavier Trilla <xavier.trilla@xxxxxxxxxxx>
Date: Wednesday, July 24, 2019 at 1:25 PM
To: Peter Eisch <peter.eisch@xxxxxxxxxxxxxxx>
Cc: "ceph-users@xxxxxxxxxxxxxx" <ceph-users@xxxxxxxxxxxxxx>
Subject: Re: Upgrading and lost OSDs
Hi Peter,
I'm not sure, but maybe after the changes the OSDs are no longer being recognized by the ceph scripts.
Ceph used to use udev to detect the OSDs and then moved to LVM. Which kind of OSDs are you running, Bluestore or filestore? Which version did you use to create them?
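For reference, a quick way to tell how the OSDs were deployed might be something like the following (ceph-volume ships with Nautilus, while ceph-disk itself was removed there; the last check only works where the data dir is still mounted):

ceph-volume lvm list                   # LVM-based OSDs (ceph-volume) show up here
blkid /dev/sda1 /dev/sda2 /dev/sda5    # ceph-disk OSDs are plain GPT partitions
sgdisk -i 1 /dev/sda                   # partition type GUID marks ceph data partitions
cat /var/lib/ceph/osd/ceph-*/type      # a mounted OSD records bluestore/filestore here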
Cheers!
On 24 Jul 2019, at 20:04, Peter Eisch <peter.eisch@xxxxxxxxxxxxxxx> wrote:
Hi,
I’m working through updating from 12.2.12/luminous to 14.2.2/nautilus on CentOS 7.6. The managers are updated alright:
# ceph -s
  cluster:
    id:     2fdb5976-1234-4b29-ad9c-1ca74a9466ec
    health: HEALTH_WARN
            Degraded data redundancy: 24177/9555955 objects degraded (0.253%), 7 pgs degraded, 1285 pgs undersized
            3 monitors have not enabled msgr2
...
I updated ceph on an OSD host with 'yum update' and then rebooted to pick up the current kernel. Along the way, the contents of all the directories in /var/lib/ceph/osd/ceph-*/ were deleted. Thus I have 16 OSDs down from this host. I can manage the undersized PGs, but I'd like to get these drives working again without deleting and recreating each OSD.
So far I've pulled the respective cephx key into the 'keyring' file and written 'bluestore' into the 'type' file for each, but I'm unsure how to get the lockboxes mounted so I can get the OSDs running. The osd-lockbox directory is otherwise untouched from when the OSDs were deployed.
Is there a way to run ceph-deploy or some other tool to rebuild the mounts for the drives?
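For what it's worth, ceph-volume simple scan/activate is the Nautilus-era replacement for the old ceph-disk activation path and is meant to take over exactly this kind of OSD; I'm not sure how well it copes with dmcrypt lockbox OSDs whose data dirs are no longer mounted, so I'd treat something like the following as an experiment rather than a fix:

ceph-volume simple scan /dev/sda1      # writes /etc/ceph/osd/<id>-<fsid>.json
ceph-volume simple activate --all      # mounts and starts OSDs from those json files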
peter
Peter Eisch
Senior Site Reliability Engineer
T: 1.612.659.3228
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com