Re: ceph-volume claiming wrong device

Hey Eugen,

Valid points. I first tried to provision the OSDs via ceph-ansible (which I
later ruled out); it does run the batch command with all 4 disk devices, but it
often failed with the same issue I mentioned earlier, something like:
```
bluefs _replay 0x0: stop: uuid e2f72ec9-2747-82d7-c7f8-41b7b6d41e1b !=
super.uuid 0110ddb3-d4bf-4c1e-be11-654598c71db0
```
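
For reference, the batch call that ceph-ansible ends up issuing is roughly the
following (just a sketch; the exact flags depend on my playbook settings, so
treat the option set as an assumption):
```
# roughly the batch provisioning ceph-ansible drives (flags assumed from my setup)
ceph-volume --cluster ceph lvm batch --bluestore --dmcrypt /dev/sdb /dev/sdc /dev/sdd /dev/sde
```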
That's why I abandoned the batch approach and tried to provision the OSDs
manually, one by one.
As I mentioned, I used ceph-ansible rather than cephadm for legacy reasons, but
I suspect the problem I'm seeing lies in ceph-volume itself, so switching to
cephadm wouldn't change anything.

I investigated the one-by-one OSD creation flow further, and the fact that
`ceph-volume lvm list` shows two devices belonging to the same OSD seems to be
explained by the following sequence:

1. ceph-volume lvm create --bluestore --dmcrypt --data /dev/sdd
2. it tries to create osd.2
3. it fails with the uuid != super.uuid error
4. ceph-volume lvm list reports /dev/sdd as belonging to osd.2 (even though the
   creation failed)
5. ceph-volume lvm create --bluestore --dmcrypt --data /dev/sde
6. it tries to create osd.2 (*again*)
7. it succeeds
8. ceph-volume lvm list now reports both /dev/sdd and /dev/sde as belonging to
   osd.2

osd.2 is reported to be up and running.
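
In case it helps, this is roughly how I'm checking which device osd.2 actually
uses and how I plan to clean up the leftover LV from the failed /dev/sdd
attempt. It's just a sketch based on Eugen's earlier zap suggestion; I'm
assuming the `lvm list` output is driven by the ceph.* LVM tags, so the stale
LV still carries osd id 2:
```
# verify which device the running osd.2 actually uses
ceph osd metadata 2 | grep -E '"devices"|bluestore_bdev'

# assumption: ceph-volume lvm list reports LVs by their ceph.* LVM tags, so the
# LV left behind by the failed /dev/sdd attempt still shows up tagged with osd id 2
lvs -o lv_name,vg_name,lv_tags | grep 'ceph.osd_id=2'

# once confirmed that osd.2 runs on /dev/sde, wipe the stale LV on /dev/sdd
ceph-volume lvm zap --destroy /dev/sdd
```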

Any idea why this is happening?
Thank you!
Oleksiy

On Thu, Oct 27, 2022 at 12:11 AM Eugen Block <eblock@xxxxxx> wrote:

> Hi,
>
> first of all, if you really need to issue ceph-volume manually,
> there's a batch command:
>
> cephadm ceph-volume lvm batch /dev/sdb /dev/sdc /dev/sdd /dev/sde
>
> Second, are you using cephadm? Maybe your manual intervention
> conflicts with the automatic osd setup (all available devices). You
> could look into /var/log/ceph/cephadm.log on each node and see if
> cephadm has already tried to set up the OSDs for you. What does 'ceph orch
> ls' show?
> Did you end up having online OSDs or did it fail? In that case I would
> purge all OSDs from the crushmap, then wipe all devices (ceph-volume
> lvm zap --destroy /dev/sdX) and either let cephadm create the OSDs for
> you or you disable that (unmanaged=true) and run the manual steps
> again (although it's not really necessary).
>
> Regards,
> Eugen
>
> Zitat von Oleksiy Stashok <oleksiys@xxxxxxxxxx>:
>
> > Hey guys,
> >
> > I ran into a weird issue, hope you can explain what I'm observing. I'm
> > testing *Ceph 16.2.10* on *Ubuntu 20.04* in *Google Cloud VMs*. I created 3
> > instances and attached 4 persistent SSD disks to each instance. I can see
> > these disks attached as `/dev/sdb, /dev/sdc, /dev/sdd, /dev/sde` devices.
> >
> > As a next step I used ceph-ansible to bootstrap the ceph cluster on the 3
> > instances; however, I intentionally skipped the OSD setup. So I ended up
> > with a Ceph cluster w/o any OSDs.
> >
> > I ssh'ed into each VM and ran:
> >
> > ```
> >       sudo -s
> >       for dev in sdb sdc sdd sde; do
> >         /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore
> > --dmcrypt --data "/dev/$dev"
> >       done
> > ```
> >
> > The operation above randomly fails on random instances/devices with
> > something like:
> > ```
> > bluefs _replay 0x0: stop: uuid e2f72ec9-2747-82d7-c7f8-41b7b6d41e1b !=
> > super.uuid 0110ddb3-d4bf-4c1e-be11-654598c71db0
> > ```
> >
> > The interesting thing is that when I do
> > ```
> > /usr/sbin/ceph-volume lvm ls
> > ```
> >
> > I can see that the device for which the OSD creation failed actually
> > belongs to a different OSD that was previously created for a different
> > device. For example, the failure I mentioned above happened on the
> > `/dev/sde` device, so when I list the LVs I see this:
> > ```
> > ====== osd.2 =======
> >
> >   [block]       /dev/ceph-103a4373-dbe0-43d6-a9e0-34db4e1b257c/osd-block-9af542ba-fd65-4355-ad17-7293856acaeb
> >
> >       block device              /dev/ceph-103a4373-dbe0-43d6-a9e0-34db4e1b257c/osd-block-9af542ba-fd65-4355-ad17-7293856acaeb
> >       block uuid                FfFnLt-h33F-F73V-tY45-VuZM-scj7-C3dg1K
> >       cephx lockbox secret      AQAlelljqNPoMhAA59JwN3wGt0d6Si+nsnxsRQ==
> >       cluster fsid              348fff8e-e850-4774-9694-05d5414b1c53
> >       cluster name              ceph
> >       crush device class
> >       encrypted                 1
> >       osd fsid                  9af542ba-fd65-4355-ad17-7293856acaeb
> >       osd id                    2
> >       osdspec affinity
> >       type                      block
> >       vdo                       0
> >       devices                   /dev/sdd
> >
> >   [block]       /dev/ceph-df14969f-2dfb-45f1-a579-a8e23ec12e33/osd-block-4686f6fc-8dc1-48fd-a2d9-70a281c8ee64
> >
> >       block device              /dev/ceph-df14969f-2dfb-45f1-a579-a8e23ec12e33/osd-block-4686f6fc-8dc1-48fd-a2d9-70a281c8ee64
> >       block uuid                GEajK3-Tsyf-XZS9-E5ik-M1BB-VIpb-q7D1ET
> >       cephx lockbox secret      AQAwelljFw2nJBAApuMs2WE0TT+7c1TGa4xQzg==
> >       cluster fsid              348fff8e-e850-4774-9694-05d5414b1c53
> >       cluster name              ceph
> >       crush device class
> >       encrypted                 1
> >       osd fsid                  4686f6fc-8dc1-48fd-a2d9-70a281c8ee64
> >       osd id                    2
> >       osdspec affinity
> >       type                      block
> >       vdo                       0
> >       devices                   /dev/sde
> > ```
> >
> > How did it happen that `/dev/sde` was claimed by osd.2?
> >
> > Thank you!
> > Oleksiy
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


