Re: ceph-volume claiming wrong device

It looks like I hit some flavour of https://tracker.ceph.com/issues/51034,
since once I set `bluefs_buffered_io=false` the issue (which I could
reproduce pretty consistently) disappeared.
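
For anyone hitting the same thing, this is roughly how the workaround can be
applied at runtime (a sketch; adjust the scope to your setup, and the option
can of course also be set in ceph.conf under [osd]):

```
# disable BlueFS buffered reads on all OSDs (workaround for tracker #51034)
ceph config set osd bluefs_buffered_io false

# confirm an individual OSD picked it up
ceph config get osd.2 bluefs_buffered_io
```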

Oleksiy

On Tue, Nov 1, 2022 at 3:02 AM Eugen Block <eblock@xxxxxx> wrote:

> As I said, I would recommend really wiping the OSDs clean
> (ceph-volume lvm zap --destroy /dev/sdX) and maybe rebooting (on VMs that
> was sometimes necessary during my tests if I had too many failed
> attempts). Then also make sure you don't have any leftovers in the
> filesystem (under /var/lib/ceph) so that you start from a clean
> state.
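>
> Roughly, as a sketch (the device names are just examples from your setup):
>
> ```
> # wipe every data disk touched by a failed attempt
> for dev in sdb sdc sdd sde; do
>   ceph-volume lvm zap --destroy "/dev/$dev"
> done
>
> # then check for leftover OSD directories before retrying
> ls /var/lib/ceph/osd/
> ```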
>
> Quoting Oleksiy Stashok <oleksiys@xxxxxxxxxx>:
>
> > Hey Eugen,
> >
> > Valid points. I first tried to provision OSDs via ceph-ansible (later
> > excluded), which does run the batch command with all 4 disk devices, but it
> > often failed with the same issue I mentioned earlier, something like:
> > ```
> > bluefs _replay 0x0: stop: uuid e2f72ec9-2747-82d7-c7f8-41b7b6d41e1b !=
> > super.uuid 0110ddb3-d4bf-4c1e-be11-654598c71db0
> > ```
> > that's why I abandoned that idea and tried to provision OSDs manually one
> > by one.
> > As I mentioned, I used ceph-ansible, not cephadm, for legacy reasons, but I
> > suspect the problem I'm seeing is related to ceph-volume, so cephadm
> > probably wouldn't change it.
> >
> > I did more investigation into the 1-by-1 OSD creation flow, and it seems
> > that the fact that `ceph-volume lvm list` shows me 2 devices belonging to
> > the same OSD can be explained by the following sequence:
> >
> > 1. ceph-volume lvm create --bluestore --dmcrypt --data /dev/sdd
> > 2. trying to create osd.2
> > 3. fails with uuid != super.uuid issue
> > 4. ceph-volume lvm list returns /dev/sdd as belonging to osd.2 (even though
> > it failed)
> > 5. ceph-volume lvm create --bluestore --dmcrypt --data /dev/sde
> > 6. trying to create osd.2 (*again*)
> > 7. succeeds
> > 8. ceph-volume lvm list returns both /dev/sdd and /dev/sde belonging to
> > osd.2
> >
> > osd.2 is reported to be up and running.
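> >
> > (For reference, a quick way to see which devices each OSD id claims after
> > every create attempt; the jq filter is just an illustrative sketch and
> > assumes the usual JSON layout of the list output:)
> >
> > ```
> > ceph-volume lvm list --format json | jq -r \
> >   'to_entries[] | "osd.\(.key): \([.value[].devices[]] | join(", "))"'
> > ```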
> >
> > Any idea why this is happening?
> > Thank you!
> > Oleksiy
> >
> > On Thu, Oct 27, 2022 at 12:11 AM Eugen Block <eblock@xxxxxx> wrote:
> >
> >> Hi,
> >>
> >> first of all, if you really need to run ceph-volume manually,
> >> there's a batch command:
> >>
> >> cephadm ceph-volume lvm batch /dev/sdb /dev/sdc /dev/sdd /dev/sde
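> >>
> >> With encryption that should be roughly the following (a sketch; depending
> >> on the cephadm version the flags may need to go after a `--` separator):
> >>
> >> cephadm ceph-volume lvm batch --dmcrypt /dev/sdb /dev/sdc /dev/sdd /dev/sde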
> >>
> >> Second, are you using cephadm? Maybe your manual intervention
> >> conflicts with the automatic osd setup (all available devices). You
> >> could look into /var/log/ceph/cephadm.log on each node and see if
> >> cephadm already tried to set up the OSDs for you. What does 'ceph orch
> >> ls' show?
> >> Did you end up with online OSDs, or did it fail? In the latter case I would
> >> purge all OSDs from the crushmap, then wipe all devices (ceph-volume
> >> lvm zap --destroy /dev/sdX) and either let cephadm create the OSDs for
> >> you or disable that (unmanaged=true) and run the manual steps
> >> again (although that's not really necessary).
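> >>
> >> A rough outline of those cleanup steps (the OSD id and device names are
> >> placeholders):
> >>
> >> ```
> >> # remove a failed OSD from the cluster and crushmap
> >> ceph osd purge 2 --yes-i-really-mean-it
> >>
> >> # wipe the backing device
> >> ceph-volume lvm zap --destroy /dev/sdX
> >>
> >> # optionally keep cephadm from auto-creating OSDs while you test
> >> ceph orch apply osd --all-available-devices --unmanaged=true
> >> ```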
> >>
> >> Regards,
> >> Eugen
> >>
> >> Quoting Oleksiy Stashok <oleksiys@xxxxxxxxxx>:
> >>
> >> > Hey guys,
> >> >
> >> > I ran into a weird issue; I hope you can explain what I'm observing. I'm
> >> > testing *Ceph 16.2.10* on *Ubuntu 20.04* in *Google Cloud VMs*. I created
> >> > 3 instances and attached 4 persistent SSD disks to each instance. I can
> >> > see these disks attached as the `/dev/sdb, /dev/sdc, /dev/sdd, /dev/sde`
> >> > devices.
> >> >
> >> > As a next step I used ceph-ansible to bootstrap the Ceph cluster on the 3
> >> > instances; however, I intentionally skipped the OSD setup, so I ended up
> >> > with a Ceph cluster without any OSDs.
> >> >
> >> > I ssh'ed into each VM and ran:
> >> >
> >> > ```
> >> >       sudo -s
> >> >       for dev in sdb sdc sdd sde; do
> >> >         /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore \
> >> >           --dmcrypt --data "/dev/$dev"
> >> >       done
> >> > ```
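> >> >
> >> > (A more defensive variant of that loop, sketched below, would stop at the
> >> > first failure so a half-created OSD can be inspected before the next
> >> > device is touched:)
> >> >
> >> > ```
> >> >       for dev in sdb sdc sdd sde; do
> >> >         if ! /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore \
> >> >             --dmcrypt --data "/dev/$dev"; then
> >> >           echo "OSD creation failed for /dev/$dev, stopping" >&2
> >> >           break
> >> >         fi
> >> >       done
> >> > ```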
> >> >
> >> > The operation above fails randomly, on different instances and devices,
> >> > with something like:
> >> > ```
> >> > bluefs _replay 0x0: stop: uuid e2f72ec9-2747-82d7-c7f8-41b7b6d41e1b !=
> >> > super.uuid 0110ddb3-d4bf-4c1e-be11-654598c71db0
> >> > ```
> >> >
> >> > The interesting thing is that when I do
> >> > ```
> >> > /usr/sbin/ceph-volume lvm list
> >> > ```
> >> >
> >> > I can see that the device for which the OSD creation failed actually
> >> > belongs to a different OSD that was previously created for a different
> >> > device. For example, the failure I mentioned above happened on the
> >> > `/dev/sde` device, so when I list the LVs I see this:
> >> > ```
> >> > ====== osd.2 =======
> >> >
> >> >   [block]       /dev/ceph-103a4373-dbe0-43d6-a9e0-34db4e1b257c/osd-block-9af542ba-fd65-4355-ad17-7293856acaeb
> >> >
> >> >       block device              /dev/ceph-103a4373-dbe0-43d6-a9e0-34db4e1b257c/osd-block-9af542ba-fd65-4355-ad17-7293856acaeb
> >> >       block uuid                FfFnLt-h33F-F73V-tY45-VuZM-scj7-C3dg1K
> >> >       cephx lockbox secret      AQAlelljqNPoMhAA59JwN3wGt0d6Si+nsnxsRQ==
> >> >       cluster fsid              348fff8e-e850-4774-9694-05d5414b1c53
> >> >       cluster name              ceph
> >> >       crush device class
> >> >       encrypted                 1
> >> >       osd fsid                  9af542ba-fd65-4355-ad17-7293856acaeb
> >> >       osd id                    2
> >> >       osdspec affinity
> >> >       type                      block
> >> >       vdo                       0
> >> >       devices                   /dev/sdd
> >> >
> >> >   [block]       /dev/ceph-df14969f-2dfb-45f1-a579-a8e23ec12e33/osd-block-4686f6fc-8dc1-48fd-a2d9-70a281c8ee64
> >> >
> >> >       block device              /dev/ceph-df14969f-2dfb-45f1-a579-a8e23ec12e33/osd-block-4686f6fc-8dc1-48fd-a2d9-70a281c8ee64
> >> >       block uuid                GEajK3-Tsyf-XZS9-E5ik-M1BB-VIpb-q7D1ET
> >> >       cephx lockbox secret      AQAwelljFw2nJBAApuMs2WE0TT+7c1TGa4xQzg==
> >> >       cluster fsid              348fff8e-e850-4774-9694-05d5414b1c53
> >> >       cluster name              ceph
> >> >       crush device class
> >> >       encrypted                 1
> >> >       osd fsid                  4686f6fc-8dc1-48fd-a2d9-70a281c8ee64
> >> >       osd id                    2
> >> >       osdspec affinity
> >> >       type                      block
> >> >       vdo                       0
> >> >       devices                   /dev/sde
> >> > ```
> >> >
> >> > How did it happen that `/dev/sde` was claimed by osd.2?
> >> >
> >> > Thank you!
> >> > Oleksiy
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


