Hi,
I don't really have any advice, but I'm curious what the LV tags look
like (lvs -o lv_tags). Do they point to the correct LVs for the
block.db? Does 'ceph osd metadata <OSD>' show anything weird? Is
there anything useful in the ceph-volume.log
(/var/log/ceph/{FSID}/ceph-volume.log)?
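For example, something along these lines should show it (osd.1 taken from
the log snippet in your mail, adjust as needed; run ceph-volume inside
'cephadm shell' if the OSDs are containerized):

# LV tags that ceph-volume uses to find the db device
lvs -o lv_name,vg_name,lv_tags | grep -i ceph
# what the OSD itself reports as its bluefs db device
ceph osd metadata 1 | grep -i bluefs_db
# ceph-volume's own view of the OSDs and their db LVs
ceph-volume lvm list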
Regards,
Eugen
Quoting Reto Gysi <rlgysi@xxxxxxxxx>:
Hi ceph community

I noticed the following problem after upgrading my Ceph instance on Debian
12.4 from 17.2.7 to 18.2.1:

I had placed the bluestore block.db for the HDD OSDs on raid1/mirrored
logical volumes across 2 NVMe devices, so that if a single block.db NVMe
device fails, not all HDD OSDs fail.
That worked fine under 17.2.7, with no problems during host/OSD restarts.
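(For context, the mirrored db LVs were set up roughly like this; this is a
sketch rather than a copy of my shell history, with the PV names taken from
the lvs output further down. The _imeta/_iorig sub-LVs visible there come
from dm-integrity being enabled on the mirror.)

# one mirrored block.db LV per HDD OSD, spread over the two db devices
lvcreate --type raid1 -m 1 --raidintegrity y -L 44G -n ceph-db-osd1 optane /dev/sdg /dev/sdj
# OSD created with its data on the HDD and block.db on the mirror
ceph-volume lvm create --bluestore --data /dev/sdX --block.db optane/ceph-db-osd1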
During the upgrade to 18.2.1, the OSDs with the block.db on a mirrored LV
wouldn't start anymore, because the block.db symlink was updated to point
to the wrong device-mapper device, and OSD startup failed with an error
saying that the block.db device is busy.
OSD1:
2024-01-05T19:56:43.592+0000 7fdde9f43640 -1 bluestore(/var/lib/ceph/osd/ceph-1) _minimal_open_bluefs add block device(/var/lib/ceph/osd/ceph-1/block.db) returned: (16) Device or resource busy
2024-01-05T19:56:43.592+0000 7fdde9f43640 -1 bluestore(/var/lib/ceph/osd/ceph-1) _open_db failed to prepare db environment:
2024-01-05T19:56:43.592+0000 7fdde9f43640 1 bdev(0x55a2d5014000 /var/lib/ceph/osd/ceph-1/block) close
2024-01-05T19:56:43.892+0000 7fdde9f43640 -1 osd.1 0 OSD:init: unable to mount object store
The symlink was updated to point to:

lrwxrwxrwx 1 ceph ceph 111 Jan 5 20:57 block -> /dev/mapper/ceph--dec5bd7c--d84f--40d9--ba14--6bd8aadf2957-osd--block--cdd02721--6876--4db8--bdb2--12ac6c70127c
lrwxrwxrwx 1 ceph ceph 48 Jan 5 20:57 block.db -> /dev/mapper/optane-ceph--db--osd1_rimage_1_iorig

The correct symlink would have been:

lrwxrwxrwx 1 ceph ceph 111 Jan 5 20:57 block -> /dev/mapper/ceph--dec5bd7c--d84f--40d9--ba14--6bd8aadf2957-osd--block--cdd02721--6876--4db8--bdb2--12ac6c70127c
lrwxrwxrwx 1 ceph ceph 48 Jan 5 20:57 block.db -> /dev/mapper/optane-ceph--db--osd1
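One way to see which /dev/mapper node is the visible top-level LV, as
opposed to the internal raid1 sub-LVs, is for example:

# where the symlink currently points
readlink -f /var/lib/ceph/osd/ceph-1/block.db
# all dm nodes belonging to the db LV (top-level plus hidden _rimage/_rmeta/_i* sub-LVs)
dmsetup ls | grep ceph--db--osd1
# lv_role/lv_dm_path show which of them is the public, top-level LV
lvs -a -o lv_name,lv_role,lv_dm_path optane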
To continue with the upgrade, I converted all the block.db logical volumes
back to linear volumes one by one and fixed the symlinks manually.
Converting the LVs back to linear was necessary because, even after I fixed
the symlink manually, the symlink would be created wrong again on the next
OSD restart as long as the block.db pointed to a raid1 LV.
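Per OSD that was roughly the following (osd.1 shown; a sketch rather than a
literal transcript, and the systemd unit name depends on the deployment):

systemctl stop ceph-{FSID}@osd.1.service         # or ceph-osd@1
lvconvert --raidintegrity n optane/ceph-db-osd1  # drop dm-integrity first
lvconvert -m 0 optane/ceph-db-osd1               # raid1 -> linear, LV name and dm node stay the same
ln -sfn /dev/mapper/optane-ceph--db--osd1 /var/lib/ceph/osd/ceph-1/block.db
chown -h ceph:ceph /var/lib/ceph/osd/ceph-1/block.db
systemctl start ceph-{FSID}@osd.1.service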
Here's an example of how the symlink looked before an OSD was touched by
the 18.2.1 upgrade:
OSD2:
lrwxrwxrwx 1 ceph ceph 93 Jan 4 03:38 block -> /dev/ceph-17a894d6-3a64-4e5e-9fa0-8dd3b5f4bf33/osd-block-3cd7a5af-9002-47a7-b4c2-540381d53be7
lrwxrwxrwx 1 ceph ceph 24 Jan 4 03:38 block.db -> /dev/optane/ceph-db-osd2
Here's what the output of lvs -a -o +devices looked like for the OSD1
block.db device when it was a raid1 LV:
LV                            VG     Attr         LSize Pool Origin                        Data%  Meta% Move Log Cpy%Sync Convert Devices
ceph-db-osd1                  optane rwi-a-r---  44.00g                                                          100.00           ceph-db-osd1_rimage_0(0),ceph-db-osd1_rimage_1(0)
[ceph-db-osd1_rimage_0]       optane gwi-aor---  44.00g      [ceph-db-osd1_rimage_0_iorig] 100.00                                 ceph-db-osd1_rimage_0_iorig(0)
[ceph-db-osd1_rimage_0_imeta] optane ewi-ao---- 428.00m                                                                           /dev/sdg(55482)
[ceph-db-osd1_rimage_0_imeta] optane ewi-ao---- 428.00m                                                                           /dev/sdg(84566)
[ceph-db-osd1_rimage_0_iorig] optane -wi-ao----  44.00g                                                                           /dev/sdg(9216)
[ceph-db-osd1_rimage_0_iorig] optane -wi-ao----  44.00g                                                                           /dev/sdg(82518)
[ceph-db-osd1_rimage_1]       optane gwi-aor---  44.00g      [ceph-db-osd1_rimage_1_iorig] 100.00                                 ceph-db-osd1_rimage_1_iorig(0)
[ceph-db-osd1_rimage_1_imeta] optane ewi-ao---- 428.00m                                                                           /dev/sdj(55392)
[ceph-db-osd1_rimage_1_imeta] optane ewi-ao---- 428.00m                                                                           /dev/sdj(75457)
[ceph-db-osd1_rimage_1_iorig] optane -wi-ao----  44.00g                                                                           /dev/sdj(9218)
[ceph-db-osd1_rimage_1_iorig] optane -wi-ao----  44.00g                                                                           /dev/sdj(73409)
[ceph-db-osd1_rmeta_0]        optane ewi-aor---   4.00m                                                                           /dev/sdg(55388)
[ceph-db-osd1_rmeta_1]        optane ewi-aor---   4.00m                                                                           /dev/sdj(9217)
It would be good if the symlinks were recreated to point to the correct
device even when block.db is placed on a raid1 LV.
I'm not sure whether this problem has been reported yet.
Cheers
Reto
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx