Re: ceph-volume lvm create leaves half-built OSDs lying around

On Wed, Sep 11, 2019 at 6:18 AM Matthew Vernon <mv3@xxxxxxxxxxxx> wrote:
>
> Hi,
>
> We keep finding part-made OSDs (they appear not attached to any host,
> and down and out; but still counting towards the number of OSDs); we
> never saw this with ceph-disk. On investigation, this is because
> ceph-volume lvm create makes the OSD (ID and auth at least) too early in
> the process and is then unable to roll-back cleanly (because the
> bootstrap-osd credential isn't allowed to remove OSDs).
>
> As an example (very truncated):
>
> Running command: /usr/bin/ceph --cluster ceph --name
> client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
> -i - osd new 20cea174-4c1b-4330-ad33-505a03156c33
> Running command: vgcreate --force --yes
> ceph-9d66ec60-c71b-49e0-8c1a-e74e98eafb0e /dev/sdbh
>  stderr: Device /dev/sdbh not found (or ignored by filtering).
>   Unable to add physical volume '/dev/sdbh' to volume group
> 'ceph-9d66ec60-c71b-49e0-8c1a-e74e98eafb0e'.
> --> Was unable to complete a new OSD, will rollback changes
> --> OSD will be fully purged from the cluster, because the ID was generated
> Running command: ceph osd purge osd.828 --yes-i-really-mean-it
>  stderr: 2019-09-10 15:07:53.396528 7fbca2caf700 -1 auth: unable to find
> a keyring on
> /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,:
> (2) No such file or directory
>  stderr: 2019-09-10 15:07:53.397318 7fbca2caf700 -1 monclient:
> authenticate NOTE: no keyring found; disabled cephx authentication
> 2019-09-10 15:07:53.397334 7fbca2caf700  0 librados: client.admin
> authentication error (95) Operation not supported
>
Ah, this is tricky to solve for every case. ceph-volume is making a
best effort here.

> This is annoying to have to clear up, and it seems to me could be
> avoided by either:
>
> i) ceph-volume should (attempt to) set up the LVM volumes &c before
> making the new OSD id

That would have helped in your particular case, where the failure
occurs while setting up the LVM volumes. When the failure is on the
Ceph side, though, the problem is similar: the OSD ID already exists
and the rollback still has to purge it.
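
To illustrate the ordering point, here is a minimal sketch of a
do/undo step list in which the local LVM work runs before anything
becomes visible in the cluster. This is not how ceph-volume is
implemented; run_with_rollback, demo, and the step names
(prepare_lvm, allocate_osd_id, activate) are hypothetical, and the
steps only simulate failures rather than calling LVM or Ceph.

```python
from typing import Callable, List, Tuple

# A step is (name, do, undo). All names here are hypothetical.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_with_rollback(steps: List[Step]) -> List[str]:
    """Run steps in order; on failure, undo completed steps in reverse.

    Returns a log of what was done, what failed, and what was undone.
    """
    log: List[str] = []
    done: List[Step] = []
    for step in steps:
        name, do, _undo = step
        try:
            do()
        except Exception as exc:
            log.append(f"failed {name}: {exc}")
            for done_name, _do, undo in reversed(done):
                try:
                    undo()
                    log.append(f"undid {done_name}")
                except Exception as undo_exc:
                    # e.g. a credential that is not allowed to purge
                    log.append(f"could not undo {done_name}: {undo_exc}")
            return log
        done.append(step)
        log.append(f"did {name}")
    return log


def demo(fail_at: str) -> List[str]:
    """Simulate a create run in which the step named fail_at raises."""

    def make(name: str) -> Step:
        def do() -> None:
            if name == fail_at:
                raise RuntimeError("simulated failure")

        return (name, do, lambda: None)

    # Local LVM work first, cluster-visible OSD ID second.
    return run_with_rollback(
        [make("prepare_lvm"), make("allocate_osd_id"), make("activate")]
    )
```

With this ordering, demo("prepare_lvm") fails before any ID is
allocated, so there is nothing to purge. demo("activate") still has to
undo allocate_osd_id, which is the case where the credential used for
the rollback matters.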

> or
> ii) allow the bootstrap-osd credential to purge OSDs

I wasn't aware that the bootstrap-osd credentials could be allowed to
purge/destroy OSDs. Are you sure this is possible? If it is, I think
that would be reasonable to try.

>
> i) seems like clearly the better answer...?
>
> Regards,
>
> Matthew
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com