Re: OSD id 241 != my id 248: conversion from "ceph-disk" to "ceph-volume simple" destroys OSDs

Hi Chris,

I found the problem. "ceph-volume simple activate" modifies the OSD's metadata in an invalid way.

On a ceph-disk OSD I had in my cupboard, not yet converted:

[root@ceph-adm:ceph-20 ~]# mount /dev/sdq1 mnt
[root@ceph-adm:ceph-20 ~]# ls -l mnt
[...]
lrwxrwxrwx. 1 ceph ceph  58 Mar 15  2019 block -> /dev/disk/by-partuuid/a1e5ef7d-9bab-4911-abe5-9075b91d88a4
[..]
[root@ceph-adm:ceph-20 ~]# umount mnt

[root@ceph-adm:ceph-20 ~]# cat /etc/ceph/osd/59-9b88d6ec-87a4-4640-b80e-81d3d56fac15.json
{
    "active": "ok",
    "block": {
        "path": "/dev/disk/by-partuuid/a1e5ef7d-9bab-4911-abe5-9075b91d88a4",
        "uuid": "a1e5ef7d-9bab-4911-abe5-9075b91d88a4"
    },
    "block_uuid": "a1e5ef7d-9bab-4911-abe5-9075b91d88a4",
    "bluefs": 1,
    "ceph_fsid": "e4ece518-f2cb-4708-b00f-b6bf511e91d9",
    "cluster_name": "ceph",
    "data": {
        "path": "/dev/sdq1",
        "uuid": "9b88d6ec-87a4-4640-b80e-81d3d56fac15"
    },
    "fsid": "9b88d6ec-87a4-4640-b80e-81d3d56fac15",
    "keyring": "AQBP4opcBeCYOxAA4sOpTthNE6T28WUf4Bgm3w==",
    "kv_backend": "rocksdb",
    "magic": "ceph osd volume v026",
    "mkfs_done": "yes",
    "none": "",
    "ready": "ready",
    "require_osd_release": "",
    "type": "bluestore",
    "whoami": 59
}

Now, "ceph-volume simple activate" modifies the symlink "block" to point to an unstable path:

[root@ceph-adm:ceph-20 ~]# ceph-volume simple activate --file "/etc/ceph/osd/59-9b88d6ec-87a4-4640-b80e-81d3d56fac15.json" --no-systemd
Running command: /usr/bin/mount -v /dev/sdq1 /var/lib/ceph/osd/ceph-59
 stdout: mount: /dev/sdq1 mounted on /var/lib/ceph/osd/ceph-59.
Running command: /usr/bin/ln -snf /dev/sdq2 /var/lib/ceph/osd/ceph-59/block
Running command: /usr/bin/chown -R ceph:ceph /dev/sdq2
--> Skipping enabling of `simple` systemd unit
--> Skipping masking of ceph-disk systemd units
--> Skipping enabling and starting OSD simple systemd unit because --no-systemd was used
--> Successfully activated OSD 59 with FSID 9b88d6ec-87a4-4640-b80e-81d3d56fac15

It is the command "/usr/bin/ln -snf /dev/sdq2 /var/lib/ceph/osd/ceph-59/block" that destroys the integrity of the OSD. If you reboot the machine and the devices get different names, the next execution of "ceph-volume simple scan" will produce a corrupted metadata file. The same will happen if you move a converted OSD to another host and try to scan and start it there.
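To see which OSDs are already affected, the "block" links can be checked for targets outside /dev/disk/by-*. The following is only a sketch under assumptions: the function name is made up for illustration, and it assumes the OSD directories are named ceph-<id> under /var/lib/ceph/osd as in the transcript above. It only reads links and changes nothing.

```python
import os

# Targets under /dev/disk/by-* (by-partuuid, by-id, ...) survive device renaming.
STABLE_PREFIX = "/dev/disk/by-"


def find_unstable_block_links(osd_root="/var/lib/ceph/osd"):
    """Return (link, target) pairs for 'block' symlinks with an unstable target.

    Hypothetical helper: a target such as /dev/sdq2 counts as unstable,
    anything under /dev/disk/by-* counts as stable.
    """
    if not os.path.isdir(osd_root):
        return []
    unstable = []
    for name in sorted(os.listdir(osd_root)):
        if not name.startswith("ceph-"):
            continue
        link = os.path.join(osd_root, name, "block")
        if not os.path.islink(link):
            continue  # no block link here (or not a symlink)
        target = os.readlink(link)
        if not target.startswith(STABLE_PREFIX):
            unstable.append((link, target))
    return unstable
```

Calling find_unstable_block_links() with the OSDs mounted and printing the returned pairs lists every link that would break if the kernel assigns different device names.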

Changing the symbolic link to an unstable device path is a critical bug, and I don't even understand why it happens in the first place. There is no reason for it, and the only valid link target is "/dev/disk/by-partuuid/a1e5ef7d-9bab-4911-abe5-9075b91d88a4" anyway.

I can work around that by resetting the link to its correct value after activation. However, this should really be fixed.
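One way such a reset could look is sketched below. This is not the exact procedure used here. It assumes the JSON file under /etc/ceph/osd/ was written before activation and still records the by-partuuid path under "block" -> "path", as in the listing above; the function name is hypothetical. The sketch refuses to act if the JSON itself already contains an unstable path, and it does not adjust the ownership of the new link.

```python
import json
import os

PARTUUID_DIR = "/dev/disk/by-partuuid/"


def restore_block_link(json_path, osd_dir):
    """Point <osd_dir>/block back at the by-partuuid path stored in json_path.

    Returns True if the link was rewritten, False if it was already correct.
    Hypothetical helper; must run as a user allowed to modify osd_dir.
    """
    with open(json_path) as fh:
        meta = json.load(fh)
    wanted = meta["block"]["path"]
    if not wanted.startswith(PARTUUID_DIR):
        # The JSON is already damaged; do not propagate an unstable path.
        raise ValueError("unstable block path in JSON: %s" % wanted)

    link = os.path.join(osd_dir, "block")
    if os.path.islink(link) and os.readlink(link) == wanted:
        return False

    # Create the new link under a temporary name, then rename it over the
    # old one, so that a 'block' entry exists at every moment.
    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(wanted, tmp)
    os.replace(tmp, link)
    return True
```

For the OSD above this would be called as restore_block_link("/etc/ceph/osd/59-9b88d6ec-87a4-4640-b80e-81d3d56fac15.json", "/var/lib/ceph/osd/ceph-59") directly after "ceph-volume simple activate", while the OSD daemon is stopped.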

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Chris Dunlop <chris@xxxxxxxxxxxx>
Sent: 03 March 2021 05:06:09
To: Frank Schilder
Cc: ceph-users@xxxxxxx
Subject: Re:  OSD id 241 != my id 248: conversion from "ceph-disk" to "ceph-volume simple" destroys OSDs

Hi Frank,

On Tue, Mar 02, 2021 at 02:58:05PM +0000, Frank Schilder wrote:
> Hi all,
>
> this is a follow-up on "reboot breaks OSDs converted from ceph-disk to ceph-volume simple".
>
> I converted a number of ceph-disk OSDs to ceph-volume using "simple scan" and "simple activate". Somewhere along the way, the OSDs' metadata gets mangled, and the prominent symptom is that the symlink "block" is changed from a part-uuid target to an unstable device name target like:
>
> before conversion:
>
> block -> /dev/disk/by-partuuid/9123be91-7620-495a-a9b7-cc85b1de24b7
>
> after conversion:
>
> block -> /dev/sdj2
>
> This is a huge problem as the "after conversion" device names are unstable. I now have a cluster whose servers I cannot reboot due to this problem. OSDs whose devices get randomly re-assigned will refuse to start with:
>
> 2021-03-02 15:56:21.709 7fb7c2549b80 -1 OSD id 241 != my id 248
>
> Please help me with getting out of this mess.


These paths might be coming from /etc/ceph/osd/*.json files.

Have you tried editing the files to replace the /dev/sdXX path with the by-partuuid path?
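Such an edit could be scripted roughly as follows. This is a sketch under assumptions, not a tested recipe: it assumes the "uuid" field under "block" holds the partition UUID, as it does in the file shown earlier in the thread, and that this field is still correct in the file being edited. That should be verified first, for example by comparing it against the entries in /dev/disk/by-partuuid/. The function name is hypothetical.

```python
import json
import os

PARTUUID_DIR = "/dev/disk/by-partuuid/"


def stabilize_block_path(json_path):
    """Rewrite block.path in a 'ceph-volume simple' JSON file to by-partuuid form.

    Returns the new path, or None if nothing was changed. Hypothetical
    helper; it trusts block.uuid, so check that value before running it.
    """
    with open(json_path) as fh:
        meta = json.load(fh)
    block = meta.get("block", {})
    path = block.get("path", "")
    uuid = block.get("uuid", "")
    if path.startswith(PARTUUID_DIR) or not uuid:
        return None  # already stable, or no UUID to build the path from

    block["path"] = PARTUUID_DIR + uuid
    # Write to a temporary file and rename, so a crash cannot leave a
    # half-written JSON file behind.
    tmp = json_path + ".tmp"
    with open(tmp, "w") as fh:
        json.dump(meta, fh, indent=4, sort_keys=True)
        fh.write("\n")
    os.replace(tmp, json_path)
    return block["path"]
```

Keeping a copy of the original /etc/ceph/osd/*.json files before running anything like this is advisable, since the old content is overwritten.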

Cheers,

Chris



