On 23/08/2018 at 12:51, Alfredo Deza wrote:
On Thu, Aug 23, 2018 at 5:42 AM, Hervé Ballans
<herve.ballans@xxxxxxxxxxxxx> wrote:
Hello all,
I would like to continue a thread that dates back to last May (sorry if this is not good practice...).
Thanks David for your useful tips on that thread.
On my side, I created my OSDs with ceph-deploy (instead of ceph-volume) [1], but the context is exactly the same as the one described in that thread (HDD drives for the OSDs and wal/db partitions on an NVMe device).
The problem I encounter is that the script that fixes the block.db partitions by their UUID works very well live, but does not survive a reboot of the OSD node. If I restart the server, the block.db symbolic links come back pointing at the device name /dev/nvme...
The problem gets worse when there are 2 NVMe devices on the same node, because in that case the paths to the block.db partitions can end up swapped, and obviously the OSDs don't start!
You didn't mention what versions of ceph-deploy and Ceph you are
using. Since you brought up partitions and OSDs that are not coming
up, it seems that this is related to using ceph-disk and ceph-deploy 1.5.X.
I would suggest trying out the newer version of ceph-deploy (2.0.X)
and using ceph-volume, the one caveat being that if you need a separate
block.db on the NVMe device you would need to create the LV yourself.
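(For reference, a minimal sketch of that manual LV step, assuming a single NVMe device at /dev/nvme0n1; the volume group name vg_nvme, the LV name db_osd0 and the 60G size are purely illustrative:)

# Turn the NVMe device into an LVM physical volume and volume group
pvcreate /dev/nvme0n1
vgcreate vg_nvme /dev/nvme0n1

# One logical volume per OSD block.db (size chosen as an example)
lvcreate -L 60G -n db_osd0 vg_nvme

# Reference the LV as vg/lv when creating the OSD with ceph-volume
ceph-volume lvm create --bluestore --data /dev/sdb --block.db vg_nvme/db_osd0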
Thanks Alfredo for your reply. I'm using the very latest versions of
Luminous (12.2.7) and ceph-deploy (2.0.1).
I have no problem creating my OSDs, that works perfectly.
My issue only concerns the device names of the NVMe partitions, which
change after a reboot when there is more than one NVMe device on the OSD node.
For instance, with two NVMe devices, the first device may initially be
named /dev/nvme0n1 and the second /dev/nvme1n1. After a node restart,
these names can be swapped, i.e. the first device becomes /dev/nvme1n1
and the second /dev/nvme0n1!
The result is that the OSDs no longer find their metadata and do not start up...
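(The kernel enumeration order of NVMe controllers is not guaranteed across reboots, but the partition UUIDs are stable. A quick sketch to see which stable by-partuuid path currently maps to which kernel name; the partition names below are just examples:)

# The left-hand side (the partuuid) never changes; the right-hand side
# (the kernel device name) may differ after a reboot
ls -l /dev/disk/by-partuuid/

# blkid also reports the PARTUUID of a given partition
blkid /dev/nvme0n1p2 /dev/nvme1n1p2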
Some of the manual steps are covered in the bluestore config
reference: http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/#block-and-block-db
As I'm not yet in production, I can probably recreate all my OSDs, forcing
the path to the block.db partitions by UUID, but I would like to know whether
there is a way to "freeze" the configuration of the block.db paths by their
UUID ("a posteriori")?
Or maybe (but this is more of a system administration issue) there is a
way on a Linux system to force an NVMe disk to appear under a fixed device
name? (Note that my NVMe partitions do not carry a filesystem.)
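(For the second question, one possible approach is a udev rule that matches the drive's serial number and adds a stable alias alongside the kernel name. A minimal sketch, assuming the serial was read beforehand with 'udevadm info /dev/nvme0n1'; the serial S3EVNX0K123456 and the alias name nvme-db0 are placeholders, and pointing the OSDs at /dev/disk/by-partuuid/... as in David's script below remains the simpler fix:)

# /etc/udev/rules.d/99-nvme-aliases.rules
# Match the namespace by the parent controller's serial number and add a
# stable symlink /dev/nvme-db0 in addition to the kernel name
KERNEL=="nvme*n1", ATTRS{serial}=="S3EVNX0K123456", SYMLINK+="nvme-db0"

# Apply the rule without rebooting
udevadm control --reload
udevadm trigger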
Thanks for your help,
Hervé
[1] from the admin node:
ceph-deploy osd create --debug --bluestore --data $hdd --block-db $db $osdnode
On 11/05/2018 at 18:46, David Turner wrote:
# Create the OSDs (each HDD data device paired with its NVMe db partition)
echo "/dev/sdb /dev/nvme0n1p2
/dev/sdc /dev/nvme0n1p3" | while read hdd db; do
  ceph-volume lvm create --bluestore --data "$hdd" --block.db "$db"
done

# Fix the OSDs to look for the block.db partition by UUID instead of its
# device name.
for db in /var/lib/ceph/osd/*/block.db; do
  # Only touch symlinks that still point at a kernel device name (nvmeXnYpZ)
  dev=$(readlink "$db" | grep -Eo 'nvme[[:digit:]]+n[[:digit:]]+p[[:digit:]]+' || echo false)
  if [[ "$dev" != false ]]; then
    # Find the partuuid entry that resolves to this device and relink to it
    uuid=$(ls -l /dev/disk/by-partuuid/ | awk '/'"${dev}"'$/ {print $9}')
    ln -sf "/dev/disk/by-partuuid/$uuid" "$db"
  fi
done
systemctl restart ceph-osd.target
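(A quick verification step after running the script, not part of David's original message: the links should now resolve through the stable by-partuuid path.)

# Each block.db symlink should now point at /dev/disk/by-partuuid/...
ls -l /var/lib/ceph/osd/*/block.db

# ...and each one must still resolve to an existing partition
for db in /var/lib/ceph/osd/*/block.db; do
  readlink -f "$db"
done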
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com