Re: Shared WAL/DB device partition for multiple OSDs?

On Thu, Aug 23, 2018 at 9:56 AM, Hervé Ballans
<herve.ballans@xxxxxxxxxxxxx> wrote:
> On 23/08/2018 at 15:20, Alfredo Deza wrote:
>
> Thanks Alfredo for your reply. I'm using the latest version of Luminous
> (12.2.7) and ceph-deploy (2.0.1).
> I have no problem creating my OSDs; that works perfectly.
> My issue is only with the device names of the NVMe partitions, which
> change after a reboot when there is more than one NVMe device on the
> OSD node.
>
> ceph-volume is pretty resilient to partition changes because it stores
> the PARTUUID of the partition in LVM and queries it
> at each boot. Note that for bluestore there is no mounting
> whatsoever. Have you created partitions with a PARTUUID on the NVMe
> devices for block.db?
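> For example (just a sketch, assuming blkid and lsblk are available on the
> node), each of those GPT partitions should report a non-empty PARTUUID:
>
> # blkid -s PARTUUID -o value /dev/nvme0n1p1
> # lsblk -o NAME,PARTUUID /dev/nvme0n1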
>
>
> Here is how I created my BlueStore OSDs (on the first OSD node):
>
> 1) On the OSD node node-osd0, I first created the partitions (for block.db)
> on the NVMe device (PM1725a 800GB), like this:
>
> # parted /dev/nvme0n1 mklabel gpt
>
> # echo "1 0 10
> 2 10 20
> 3 20 30
> 4 30 40
> 5 40 50
> 6 50 60
> 7 60 70
> 8 70 80
> 9 80 90
> 10 90 100" | while read num beg end; do parted /dev/nvme0n1 mkpart $num
> $beg% $end%; done
>
> Extract from cat /proc/partitions:
>
>  259        2  781412184 nvme1n1
>  259        3  781412184 nvme0n1
>  259        5   78140416 nvme0n1p1
>  259        6   78141440 nvme0n1p2
>  259        7   78140416 nvme0n1p3
>  259        8   78141440 nvme0n1p4
>  259        9   78141440 nvme0n1p5
>  259       10   78141440 nvme0n1p6
>  259       11   78140416 nvme0n1p7
>  259       12   78141440 nvme0n1p8
>  259       13   78141440 nvme0n1p9
>  259       15   78140416 nvme0n1p10
>
> 2) Then, from the admin node, I created the first 10 OSDs like this:
>
> echo "/dev/sda /dev/nvme0n1p1
> /dev/sdb /dev/nvme0n1p2
> /dev/sdc /dev/nvme0n1p3
> /dev/sdd /dev/nvme0n1p4
> /dev/sde /dev/nvme0n1p5
> /dev/sdf /dev/nvme0n1p6
> /dev/sdg /dev/nvme0n1p7
> /dev/sdh /dev/nvme0n1p8
> /dev/sdi /dev/nvme0n1p9
> /dev/sdj /dev/nvme0n1p10" | while read hdd db; do ceph-deploy osd create
> --debug --bluestore --data $hdd --block-db $db node-osd0; done
>
> Do you mean that, at this stage, I should pass the PARTUUID paths directly
> as the value of --block-db (i.e. replace /dev/nvme0n1p1 with its PARTUUID),
> is that it?
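> Something like this, for instance, using the stable /dev/disk/by-partuuid
> symlinks instead of the kernel names (a sketch with a placeholder UUID):
>
> # ceph-deploy osd create --debug --bluestore --data /dev/sda \
>       --block-db /dev/disk/by-partuuid/<PARTUUID-of-nvme0n1p1> node-osd0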

No, this all looks correct. How do ceph-volume.log and
ceph-volume-systemd.log look at boot time for the OSDs that
aren't coming up?

Anything useful in there?
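For example, something like this (paths assuming the default /var/log/ceph
location) should show what ceph-volume attempted at boot:

    grep -i nvme /var/log/ceph/ceph-volume.log
    tail -n 100 /var/log/ceph/ceph-volume-systemd.log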
>
> So far I have created 60 OSDs that way. The ceph cluster is HEALTH_OK and
> all OSDs are up and in. But I'm not yet in production and there is only test
> data on it, so I can destroy everything and rebuild my OSDs.
> Is that what you advise me to do, taking care to specify the PARTUUID
> for block.db instead of the device names?
>
>
> For instance, if I have two NVMe devices, the first time the first device
> shows up as /dev/nvme0n1 and the second as
> /dev/nvme1n1. After a node restart, these names can be swapped, i.e. the
> first device is named /dev/nvme1n1 and the second one /dev/nvme0n1! The
> result is that the OSDs no longer find their metadata and do not start up...
>
> This sounds very odd. Could you clarify where block and block.db are?
> Also useful here would be to take a look at
> /var/log/ceph/ceph-volume-systemd.log and ceph-volume.log to
> see how ceph-volume is trying to get this OSD up and running.
>
> Also useful would be to check `ceph-volume lvm list` to verify that,
> regardless of the name change, it recognizes the correct partition
> mapped to the OSD.
>
> Oops!
>
> # ceph-volume lvm list
> -->  KeyError: 'devices'

Can you re-run this like:


    CEPH_VOLUME_DEBUG=1 ceph-volume lvm list

And paste the output? I think this has since been fixed, but I want to
double-check.
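
In the meantime, another way to inspect the metadata ceph-volume stores
(a sketch, assuming the OSD LVs are visible to LVM on that node) is to look
at the LVM tags directly:

    lvs -o lv_name,lv_tags | grep ceph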

>
> Thank you again,
> Hervé
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



