ceph-volume claiming wrong device

Hey guys,

I ran into a weird issue and hope you can explain what I'm observing. I'm
testing *Ceph 16.2.10* on *Ubuntu 20.04* in *Google Cloud VMs*. I created 3
instances and attached 4 persistent SSD disks to each instance. I can see
these disks attached as the `/dev/sdb`, `/dev/sdc`, `/dev/sdd`, and
`/dev/sde` devices.
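
For reference, this is roughly how I verified the attachment (the column
choice is just what I happened to use):

```
# List the attached data disks together with their serial numbers
lsblk -o NAME,SIZE,TYPE,SERIAL /dev/sd[b-e]
```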

As a next step I used ceph-ansible to bootstrap the Ceph cluster on the 3
instances, but I intentionally skipped the OSD setup, so I ended up with a
Ceph cluster without any OSDs.
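
The inventory looked roughly like the sketch below: the [osds] group is
simply left empty so the playbook never provisions any OSDs (the hostnames
are placeholders, not my real ones):

```
# ceph-ansible inventory sketch; hostnames are placeholders
[mons]
ceph-node-1
ceph-node-2
ceph-node-3

[mgrs]
ceph-node-1

# left empty on purpose so OSD setup is skipped
[osds]
```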

I ssh'ed into each VM and ran:

```
sudo -s
for dev in sdb sdc sdd sde; do
  /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore \
    --dmcrypt --data "/dev/$dev"
done
```
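
One thing I haven't ruled out is the kernel reordering the `sdX` names
between attach events. On GCE the guest environment also creates stable
`/dev/disk/by-id/google-<device-name>` symlinks, so a variant of the loop
pinned to those would look like this (the `data-N` device names are an
assumption; they depend on what was passed at attach time):

```
sudo -s
# Use the stable by-id symlinks instead of sdX names, which may be reordered
for dev in data-1 data-2 data-3 data-4; do  # device names are assumed
  /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore \
    --dmcrypt --data "/dev/disk/by-id/google-${dev}"
done
```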

The operation above fails intermittently, on random instances and devices,
with errors like:
```
bluefs _replay 0x0: stop: uuid e2f72ec9-2747-82d7-c7f8-41b7b6d41e1b !=
super.uuid 0110ddb3-d4bf-4c1e-be11-654598c71db0
```
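
In case it helps with debugging, I believe the on-disk BlueStore label can
be dumped with `ceph-bluestore-tool show-label` to see which OSD fsid is
actually written on the device (the LV path below is a placeholder, not my
exact one):

```
# Dump the BlueStore superblock label of a given block device/LV
ceph-bluestore-tool show-label \
  --dev /dev/ceph-<vg-uuid>/osd-block-<osd-fsid>  # placeholder path
```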

The interesting thing is that when I run
```
/usr/sbin/ceph-volume lvm ls
```

I can see that the device for which OSD creation failed is actually listed
under a different OSD, one that was previously created for a different
device. For example, the failure mentioned above happened on the `/dev/sde`
device, and when I list the LVs I see this:
```
====== osd.2 =======

  [block]
/dev/ceph-103a4373-dbe0-43d6-a9e0-34db4e1b257c/osd-block-9af542ba-fd65-4355-ad17-7293856acaeb

      block device
 /dev/ceph-103a4373-dbe0-43d6-a9e0-34db4e1b257c/osd-block-9af542ba-fd65-4355-ad17-7293856acaeb
      block uuid                FfFnLt-h33F-F73V-tY45-VuZM-scj7-C3dg1K
      cephx lockbox secret      AQAlelljqNPoMhAA59JwN3wGt0d6Si+nsnxsRQ==
      cluster fsid              348fff8e-e850-4774-9694-05d5414b1c53
      cluster name              ceph
      crush device class
      encrypted                 1
      osd fsid                  9af542ba-fd65-4355-ad17-7293856acaeb
      osd id                    2
      osdspec affinity
      type                      block
      vdo                       0
      devices                   /dev/sdd

  [block]
/dev/ceph-df14969f-2dfb-45f1-a579-a8e23ec12e33/osd-block-4686f6fc-8dc1-48fd-a2d9-70a281c8ee64

      block device
 /dev/ceph-df14969f-2dfb-45f1-a579-a8e23ec12e33/osd-block-4686f6fc-8dc1-48fd-a2d9-70a281c8ee64
      block uuid                GEajK3-Tsyf-XZS9-E5ik-M1BB-VIpb-q7D1ET
      cephx lockbox secret      AQAwelljFw2nJBAApuMs2WE0TT+7c1TGa4xQzg==
      cluster fsid              348fff8e-e850-4774-9694-05d5414b1c53
      cluster name              ceph
      crush device class
      encrypted                 1
      osd fsid                  4686f6fc-8dc1-48fd-a2d9-70a281c8ee64
      osd id                    2
      osdspec affinity
      type                      block
      vdo                       0
      devices                   /dev/sde
```
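
To double-check which physical disk each VG really sits on, I can
cross-reference LVM's view with the disks' serial numbers, something like:

```
# Map each ceph VG back to the physical volume it lives on
sudo pvs -o pv_name,vg_name
# Identify the physical disks unambiguously by serial number
lsblk -o NAME,SERIAL,SIZE /dev/sd[b-e]
```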

How did it happen that `/dev/sde` was claimed by osd.2?

Thank you!
Oleksiy