Re: ceph-volume and automatic OSD provisioning

On Thu, Jun 21, 2018 at 9:34 AM, Erwan Velu <evelu@xxxxxxxxxx> wrote:
> The idea of automatic configuration is tied to being opinionated about which kinds of devices should be associated.
>
> You spoke about using a ratio between SSDs and HDDs to get a good setup.
> What should the behavior of the tool be if the ratio:
> - cannot be reached (not enough HDDs for one SSD)?

If a ratio cannot be met, this is an error condition that would be reported.

> - is exceeded (if we have one more HDD than expected, should it be included or left alone)?

If we can't accommodate (like in the previous item) we would error
with a message describing the reason.
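As a sketch, that validation could look like the following (the function, the default ratio, and the messages are all hypothetical illustrations, not ceph-volume code):

```python
# Hypothetical sketch: validate that detected devices can satisfy an
# HDD-to-SSD ratio, erroring out with a descriptive message otherwise.

def check_ratio(hdds, ssds, hdds_per_ssd=3):
    """Raise RuntimeError when the requested ratio cannot be met."""
    if ssds and len(hdds) < hdds_per_ssd:
        raise RuntimeError(
            "ratio cannot be met: %d HDDs found, need at least %d per SSD"
            % (len(hdds), hdds_per_ssd)
        )
    if ssds and len(hdds) > len(ssds) * hdds_per_ssd:
        raise RuntimeError(
            "ratio exceeded: %d HDDs cannot be accommodated by %d SSD(s)"
            % (len(hdds), len(ssds))
        )
```

Either failure mode surfaces as an error before any device is touched.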

>
> If we have several SSDs free, which one should be used ?

All of them

> If we have multiple HDD types (10K/15K/7.2K RPM), how can we be sure they are used in the same 'auto' setup?

There is no distinction for HDDs other than size (and being rotational)

> If we have 1 SSD and 1 NVMe, which one is preferred ?

They are treated the same

> What if there are some devices that should not be used by ceph-volume? Does that imply using the manual mode?

3 options:

- manually create the VGs/LVs (supported today)
- pass the desired devices as arguments
- use a higher-level tool like ceph-ansible

>
> When the user provides a list of devices, do we agree that they have to be checked against the "rejecting" rules to avoid using a wrong device?

We have some rules/checks that we would go through for devices.
However, most of these aren't even in place in ceph-volume today. For
example, we don't check if a raw device is read-only, or if it is
removable media. We assume the caller knows best.
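For illustration, checks like those could read standard Linux sysfs block attributes; the helpers below are a hypothetical sketch, not ceph-volume code:

```python
# Hypothetical sketch of device acceptance checks via Linux sysfs
# attributes (/sys/block/<dev>/removable, /sys/block/<dev>/ro).
import os

def read_sys_attr(dev, attr):
    """Read a sysfs block attribute such as 'removable' or 'ro'."""
    path = "/sys/block/%s/%s" % (os.path.basename(dev), attr)
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None  # attribute missing: leave the decision to the caller

def reject_reasons(dev):
    """Return a list of reasons to reject a raw device, empty if usable."""
    reasons = []
    if read_sys_attr(dev, "removable") == "1":
        reasons.append("device is removable media")
    if read_sys_attr(dev, "ro") == "1":
        reasons.append("device is read-only")
    return reasons
```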

>
> ----- Mail original -----
> From: "Alfredo Deza" <adeza@xxxxxxxxxx>
> To: "ceph-devel" <ceph-devel@xxxxxxxxxxxxxxx>
> Sent: Tuesday, June 19, 2018, 21:35:02
> Subject: ceph-volume and automatic OSD provisioning
>
> One of the top questions for ceph-volume has been "why doesn't this create
> partitions like ceph-disk does?". Although we initially focused on LVM,
> the same question applies (with LVs instead of partitions). Now that
> ceph-volume is stabilizing, we can expand on a more user-friendly approach.
>
> We are planning on creating an interface to size devices automatically based on
> some simple criteria. There are three distinct use cases we are going to
> support, ranging from easy OSD provisioning with defaults to more esoteric
> use cases with third-party systems (like rook, ceph-ansible, seasalt,
> etc.).
>
> This is being implemented as a separate sub-command to avoid piling up
> complexity on the existing `lvm` one, and to reflect the automation behind it.
>
> Here are some examples on how the API is being designed, for fully automatic
> configuration, semi-automatic (allows input), and manual via a config
> management system:
>
> Automatic (no configuration or options required):
> -------------------------------------------------
>
> Single device type:
>
>     $ ceph-volume auto
>      Use --yes to run
>      Detected devices:
>        [rotational] /dev/sda    1TB
>        [rotational] /dev/sdb    1TB
>        [rotational] /dev/sdc    1TB
>
>      Expected Bluestore OSDs:
>
>       data: /dev/sda (100%)
>       data: /dev/sdb (100%)
>       data: /dev/sdc (100%)
>
> This scenario detects a single type of unused device (rotational), so a
> bluestore OSD will be created on each device without block.db or block.wal.
>
>
> Mixed devices:
>
>     $ ceph-volume auto
>      Use --yes to run
>      Detected devices:
>        [rotational] /dev/sda    1TB
>        [rotational] /dev/sdb    1TB
>        [rotational] /dev/sdc    1TB
>        [solid     ] /dev/sdd    500GB
>
>      Expected Bluestore OSDs:
>
>       data: /dev/sda (100%), block.db: /dev/sdd (33%)
>       data: /dev/sdb (100%), block.db: /dev/sdd (33%)
>       data: /dev/sdc (100%), block.db: /dev/sdd (33%)
>
> This scenario will detect the unused devices in the system, understand that
> there is a mix of solid-state and rotational devices, place block on the
> rotational ones, and split the SSD across as many rotational devices as
> found (3 in this case).
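The split described above could be modeled roughly as follows (a sketch assuming equal shares of the SSD per rotational device; the function and the sizing policy are hypothetical, not the actual implementation):

```python
# Hypothetical sketch: give each rotational device a data LV plus an
# equal slice of the solid-state device for block.db.

def plan_mixed(hdds, ssd_size_gb):
    """Return a simplified provisioning plan for mixed devices."""
    share = 100 // len(hdds)          # percentage of the SSD per OSD
    db_gb = ssd_size_gb / len(hdds)   # resulting block.db size in GB
    return [
        {"data": hdd, "block.db_pct": share, "block.db_gb": round(db_gb, 1)}
        for hdd in hdds
    ]
```

With three 1TB HDDs and one 500GB SSD, each OSD would get a 33% slice of the SSD (about 166.7GB) for block.db, matching the example output above.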
>
>
> Semi configurable outcome, with input:
> --------------------------------------
> A user might not want to consume the devices that were automatically detected
> in the system as free, so the interface will allow to pass these devices
> directly as input.
>
>     $ ceph-volume auto /dev/sda /dev/sdb /dev/sdc
>      Device information:
>        [rotational] /dev/sda    1TB
>        [rotational] /dev/sdb    1TB
>        [rotational] /dev/sdc    1TB
>
>      Expected Bluestore OSDs:
>
>       data: /dev/sda (100%)
>       data: /dev/sdb (100%)
>       data: /dev/sdc (100%)
>
>     Please hit Enter to continue, or Ctrl-C to cancel
>
> Similarly, for mixed devices:
>
>     $ ceph-volume auto /dev/sda /dev/sdb /dev/sdc /dev/sdd
>      Use --yes to run
>      Device information:
>        [rotational] /dev/sda    1TB
>        [rotational] /dev/sdb    1TB
>        [rotational] /dev/sdc    1TB
>        [solid     ] /dev/sdd    500GB
>
>      Expected Bluestore OSDs:
>
>       data: /dev/sda (100%), block.db: /dev/sdd (33%)
>       data: /dev/sdb (100%), block.db: /dev/sdd (33%)
>       data: /dev/sdc (100%), block.db: /dev/sdd (33%)
>
>     Please hit Enter to continue, or Ctrl-C to cancel
>
>
> Fully Manual (config management systems):
> -----------------------------------------
> A JSON file or a blob as a positional argument would allow fine-tuning other
> specifics, like using 2 OSDs per NVMe device, or determining an exact size
> for a block.db or even a block.wal LV.
>
>     $ ceph-volume auto /etc/ceph/custom_osd_provisioning.json
>
> Or:
>
>     $ ceph-volume auto "{ ... }"
>
>
> Here the API is still undefined, but the idea is to expand to more complex
> setups that can be better managed by configuration management systems.
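Since the schema is explicitly undefined, the following is a purely illustrative sketch of what such a JSON blob might contain; every key here is hypothetical:

```json
{
  "osds_per_device": 2,
  "devices": {
    "/dev/nvme0n1": {"data": true},
    "/dev/sdd": {"block.db_size": "30G", "block.wal_size": "2G"}
  }
}
```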
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


