This problem is solved. My links are indeed swapped, so bluefs has been trying to decode the wal device where it expected the db superblock. No wonder the mount aborts with "Malformed input":

host0:/var/lib/ceph/osd/ceph-0 # ls -la block*
lrwxrwxrwx 1 ceph ceph 23 Jan 15 15:13 block -> /dev/mapper/ceph-0block
lrwxrwxrwx 1 ceph ceph 24 Jan 15 15:13 block.db -> /dev/mapper/ceph--0db
lrwxrwxrwx 1 ceph ceph 25 Jan 15 15:13 block.wal -> /dev/mapper/ceph--0wal

[root@ceph_osd0 /]# ceph-bluestore-tool show-label --dev /var/lib/ceph/osd/ceph-0/block.db
{
    "/var/lib/ceph/osd/ceph-0/block.db": {
        "osd_uuid": "7755e0c2-b4bf-4cbe-bc9a-26042d5bdc52",
        "size": 49998200832,
        "btime": "2019-04-11T08:46:36.694465-0700",
        "description": "bluefs wal"
    }
}

[root@ceph_osd0 /]# ceph-bluestore-tool show-label --dev /var/lib/ceph/osd/ceph-0/block.wal
{
    "/var/lib/ceph/osd/ceph-0/block.wal": {
        "osd_uuid": "7755e0c2-b4bf-4cbe-bc9a-26042d5bdc52",
        "size": 49998200832,
        "btime": "2019-04-11T08:46:36.694465-0700",
        "description": "bluefs db"
    }
}

Good grief! How did I miss the bad LUKS labels! I've been looking at this for two days now! LOL!

host0: ~ # lsblk
nvme0n1              259:0    0  465.8G  0 disk
└─nvme0n1p1          259:1    0  465.8G  0 part
  ├─vg-ceph--0       254:3    0      1G  0 lvm
  │ └─ceph-0         254:28   0   1008M  0 crypt  /var/lib/ceph/osd/ceph-0
  ├─vg-ceph--0wal    254:4    0      1G  0 lvm
--->│ └─ceph-0db     254:29   0   1008M  0 crypt
  ├─vg-ceph--0db     254:5    0     50G  0 lvm
--->│ └─ceph-0wal    254:39   0     50G  0 crypt
  ├─vg-ceph--1       254:6    0      1G  0 lvm

I flipped the soft links manually and the osd fires up, mounts the bluestore, and starts pinging all his peeps. This was the result of bad automation that populates our /etc/crypttab. Hopefully this exercise can help the next person with some troubleshooting tips.
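For the next person, the whole check boils down to comparing each symlink's target against the label BlueStore stamped on the device behind it. A rough sketch against my paths and mapper names, adjust for your own layout:

# each label's "description" should match the link name:
#   block -> "main", block.db -> "bluefs db", block.wal -> "bluefs wal"
for l in block block.db block.wal; do
    echo "${l} -> $(readlink /var/lib/ceph/osd/ceph-0/${l})"
    ceph-bluestore-tool show-label --dev /var/lib/ceph/osd/ceph-0/${l} | grep description
done

# with the osd stopped, re-point crossed links at the devices that
# actually carry the matching labels (these targets are my setup's):
ln -sfn /dev/mapper/ceph--0wal /var/lib/ceph/osd/ceph-0/block.db
ln -sfn /dev/mapper/ceph--0db /var/lib/ceph/osd/ceph-0/block.wal

The durable fix is of course the /etc/crypttab entries themselves, so the mapper names line up with the right LVs on the next boot.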
Thanks,
/Chris

On Fri, Feb 11, 2022 at 11:09 AM Mazzystr <mazzystr@xxxxxxxxx> wrote:

> I set debug {bdev, bluefs, bluestore, osd} = 20/20 and restarted osd.0
>
> Logs are here
>
>    -15> 2022-02-11T11:07:09.944-0800 7f93546c0080 10 bluestore(/var/lib/ceph/osd/ceph-0/block.wal) _read_bdev_label got bdev(osd_uuid 7755e0c2-b4bf-4cbe-bc9a-26042d5bdc52, size 0xba4200000, btime 2019-04-11T08:46:36.694465-0700, desc bluefs db, 0 meta)
>    -14> 2022-02-11T11:07:09.944-0800 7f93546c0080  1 set rocksdb option max_total_wal_size = 1073741824
>    -13> 2022-02-11T11:07:09.944-0800 7f93546c0080  1 set rocksdb option compaction_readahead_size = 2097152
>    -12> 2022-02-11T11:07:09.944-0800 7f93546c0080  1 set rocksdb option max_write_buffer_number = 4
>    -11> 2022-02-11T11:07:09.944-0800 7f93546c0080  1 set rocksdb option max_background_compactions = 2
>    -10> 2022-02-11T11:07:09.944-0800 7f93546c0080  1 set rocksdb option compression = kNoCompression
>     -9> 2022-02-11T11:07:09.944-0800 7f93546c0080  1 set rocksdb option writable_file_max_buffer_size = 0
>     -8> 2022-02-11T11:07:09.944-0800 7f93546c0080  1 set rocksdb option min_write_buffer_number_to_merge = 1
>     -7> 2022-02-11T11:07:09.944-0800 7f93546c0080  1 set rocksdb option recycle_log_file_num = 4
>     -6> 2022-02-11T11:07:09.944-0800 7f93546c0080  1 set rocksdb option write_buffer_size = 268435456
>     -5> 2022-02-11T11:07:09.944-0800 7f93546c0080  1 bluefs mount
>     -4> 2022-02-11T11:07:09.944-0800 7f93546c0080 10 bluefs _open_super
>     -3> 2022-02-11T11:07:09.944-0800 7f93546c0080  5 bdev(0x55d345e82800 /var/lib/ceph/osd/ceph-0/block.db) read 0x1000~1000 (direct)
>     -2> 2022-02-11T11:07:09.944-0800 7f93546c0080 20 bdev(0x55d345e82800 /var/lib/ceph/osd/ceph-0/block.db) _aio_log_start 0x1000~1000
>     -1> 2022-02-11T11:07:09.944-0800 7f93546c0080 20 bdev(0x55d345e82800 /var/lib/ceph/osd/ceph-0/block.db) _aio_log_finish 1 0x1000~1000
>      0> 2022-02-11T11:07:09.948-0800 7f93546c0080 -1 *** Caught signal (Aborted) **
>  in thread 7f93546c0080 thread_name:ceph-osd
>
>  ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)
>  1: /lib64/libpthread.so.0(+0x12c20) [0x7f9352662c20]
>  2: gsignal()
>  3: abort()
>  4: /lib64/libstdc++.so.6(+0x9009b) [0x7f9351c7909b]
>  5: /lib64/libstdc++.so.6(+0x9653c) [0x7f9351c7f53c]
>  6: /lib64/libstdc++.so.6(+0x96597) [0x7f9351c7f597]
>  7: /lib64/libstdc++.so.6(+0x967f8) [0x7f9351c7f7f8]
>  8: ceph-osd(+0x56301f) [0x55d339f6301f]
>  9: (BlueFS::_open_super()+0x18c) [0x55d33a65f08c]
>  10: (BlueFS::mount()+0xeb) [0x55d33a68085b]
>  11: (BlueStore::_open_bluefs(bool, bool)+0x94) [0x55d33a55e464]
>  12: (BlueStore::_prepare_db_environment(bool, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*)+0x6d9) [0x55d33a55f5b9]
>  13: (BlueStore::_open_db(bool, bool, bool)+0x155) [0x55d33a5608b5]
>  14: (BlueStore::_open_db_and_around(bool, bool)+0x273) [0x55d33a5cba33]
>  15: (BlueStore::_mount()+0x204) [0x55d33a5ce974]
>  16: (OSD::init()+0x380) [0x55d33a0a2400]
>  17: main()
>  18: __libc_start_main()
>  19: _start()
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>
> On Fri, Feb 11, 2022 at 10:07 AM Mazzystr <mazzystr@xxxxxxxxx> wrote:
>
>> I'm suspicious of cross contamination of devices here. I was on CentOS for eons until Red Hat shenanigans pinned me to CentOS 7 and nautilus. I had very well defined udev rules that ensured dm devices were statically set and owned correctly and survived reboots.
>>
>> I seem to be struggling with this in the openSUSE world. Ownership on my devices flips back to root despite my long-standing udev rules being migrated over.
>>
>> I know my paths are correct though. The osd root dirs are also lv's with filesystem labels. The block, db, and wal links are correct. db and wal are lv's named appropriately (yea yea, per Sage that ship has sailed. I LV'd too). The osd drives get partitions with osd number labels. The bluestore tool also confirms.
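>> For reference, the shape of rule I mean is something like this. A sketch only; the file name is arbitrary and the DM_NAME pattern assumes my "ceph-*" mapper naming, so it is not a drop-in:
>>
>> # /etc/udev/rules.d/90-ceph-dm.rules
>> # hand every device-mapper node named ceph-* to the ceph user so the osd can open it
>> ACTION=="add|change", KERNEL=="dm-*", ENV{DM_NAME}=="ceph-*", OWNER="ceph", GROUP="ceph", MODE="0660"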
>> On Fri, Feb 11, 2022 at 9:14 AM Mazzystr <mazzystr@xxxxxxxxx> wrote:
>>
>>> I forgot to mention I freeze the cluster with 'ceph osd set no{down,out,backfill}'. Then I zyp up all hosts and reboot them. Only when everything is back up do I unset.
>>>
>>> My client IO patterns allow me to do this since it's a WORM data store with long spans of time between writes and reads. I have plenty of time to work with the community and get my store back online.
>>>
>>> This thread is really for documentation for the next person that comes along with the same problem.
>>>
>>> On Fri, Feb 11, 2022 at 9:08 AM Mazzystr <mazzystr@xxxxxxxxx> wrote:
>>>
>>>> My clusters are self-rolled. My start command is as follows:
>>>>
>>>> podman run -it --privileged --pid=host --cpuset-cpus 0,1 --memory 2g \
>>>>     --name ceph_osd0 --hostname ceph_osd0 \
>>>>     -v /dev:/dev \
>>>>     -v /etc/localtime:/etc/localtime:ro \
>>>>     -v /etc/ceph:/etc/ceph/ \
>>>>     -v /var/lib/ceph/osd/ceph-0:/var/lib/ceph/osd/ceph-0 \
>>>>     -v /var/log/ceph:/var/log/ceph \
>>>>     -v /run/udev/:/run/udev/ \
>>>>     ceph/ceph:v16.2.7-20220201 \
>>>>     ceph-osd --id 0 -c /etc/ceph/ceph.conf --cluster ceph -f
>>>>
>>>> I jumped from the octopus img to the 16.2.7 img and have been running well for a while with no issues. The cluster was clean, no backfills in progress, etc. Then this latest zyp up and reboot, and now I have osds that don't start.
>>>>
>>>> podman image ls
>>>> quay.io/ceph/ceph   v16.2.7            231fd40524c4   9 days ago   1.39 GB
>>>> quay.io/ceph/ceph   v16.2.7-20220201   231fd40524c4   9 days ago   1.39 GB
>>>>
>>>> bluefs fails to mount, I guess? The headers are still readable via the bluestore tool:
>>>>
>>>> ceph-bluestore-tool show-label --dev /dev/mapper/ceph-0block
>>>> {
>>>>     "/dev/mapper/ceph-0block": {
>>>>         "osd_uuid": "1234abcd-1234-abcd-1234-1234 abcd1234",
>>>>         "size": 6001171365888,
>>>>         "btime": "2019-04-11T08:46:36.013428-0700",
>>>>         "description": "main",
>>>>         "bfm_blocks": "1465129728",
>>>>         "bfm_blocks_per_key": "128",
>>>>         "bfm_bytes_per_block": "4096",
>>>>         "bfm_size": "6001171365888",
>>>>         "bluefs": "1",
>>>>         "ceph_fsid": "1234abcd-1234-abcd-1234-1234 abcd1234",
>>>>         "kv_backend": "rocksdb",
>>>>         "magic": "ceph osd volume v026",
>>>>         "mkfs_done": "yes",
>>>>         "ready": "ready",
>>>>         "require_osd_release": "16",
>>>>         "whoami": "0"
>>>>     }
>>>> }
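>>>> It's also worth pinning down what each crypt mapping actually sits on, since a crossed /etc/crypttab will happily present the wrong payload under the right name. A sketch; the mapper names are from my setup:
>>>>
>>>> # cryptsetup reports the backing device for each mapping
>>>> cryptsetup status ceph-0db | grep device
>>>> cryptsetup status ceph-0wal | grep device
>>>> # or list the mapping together with the devices it is stacked on
>>>> lsblk -s /dev/mapper/ceph-0db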
>>>> On Fri, Feb 11, 2022 at 1:06 AM Eugen Block <eblock@xxxxxx> wrote:
>>>>
>>>>> Can you share some more information about how exactly you upgraded? It looks like a cephadm-managed cluster. Did you install OS updates on all nodes without waiting for the first one to recover? Maybe I'm misreading, so please clarify what your update process looked like.
>>>>>
>>>>> Zitat von Mazzystr <mazzystr@xxxxxxxxx>:
>>>>>
>>>>> > I applied the latest os updates and rebooted my hosts. Now all my osds fail to start.
>>>>> >
>>>>> > # cat /etc/os-release
>>>>> > NAME="openSUSE Tumbleweed"
>>>>> > # VERSION="20220207"
>>>>> > ID="opensuse-tumbleweed"
>>>>> > ID_LIKE="opensuse suse"
>>>>> > VERSION_ID="20220207"
>>>>> >
>>>>> > # uname -a
>>>>> > Linux cube 5.16.5-1-default #1 SMP PREEMPT Thu Feb 3 05:26:48 UTC 2022 (1af4009) x86_64 x86_64 x86_64 GNU/Linux
>>>>> >
>>>>> > container image: v16.2.7 / v16.2.7-20220201
>>>>> >
>>>>> > osd debug log shows the following
>>>>> >    -11> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1 bluefs add_block_device bdev 0 path /var/lib/ceph/osd/ceph-0/block.wal size 50 GiB
>>>>> >    -10> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1 set rocksdb option max_total_wal_size = 1073741824
>>>>> >     -9> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1 set rocksdb option compaction_readahead_size = 2097152
>>>>> >     -8> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1 set rocksdb option max_write_buffer_number = 4
>>>>> >     -7> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1 set rocksdb option max_background_compactions = 2
>>>>> >     -6> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1 set rocksdb option compression = kNoCompression
>>>>> >     -5> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1 set rocksdb option writable_file_max_buffer_size = 0
>>>>> >     -4> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1 set rocksdb option min_write_buffer_number_to_merge = 1
>>>>> >     -3> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1 set rocksdb option recycle_log_file_num = 4
>>>>> >     -2> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1 set rocksdb option write_buffer_size = 268435456
>>>>> >     -1> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1 bluefs mount
>>>>> >      0> 2022-02-10T19:14:48.387-0800 7ff1be4c3080 -1 *** Caught signal (Aborted) **
>>>>> >  in thread 7ff1be4c3080 thread_name:ceph-osd
>>>>> >
>>>>> >  ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)
>>>>> >  1: /lib64/libpthread.so.0(+0x12c20) [0x7ff1bc465c20]
>>>>> >  2: gsignal()
>>>>> >  3: abort()
>>>>> >  4: /lib64/libstdc++.so.6(+0x9009b) [0x7ff1bba7c09b]
>>>>> >  5: /lib64/libstdc++.so.6(+0x9653c) [0x7ff1bba8253c]
>>>>> >  6: /lib64/libstdc++.so.6(+0x96597) [0x7ff1bba82597]
>>>>> >  7: /lib64/libstdc++.so.6(+0x967f8) [0x7ff1bba827f8]
>>>>> >  8: ceph-osd(+0x56301f) [0x559ff6d6301f]
>>>>> >  9: (BlueFS::_open_super()+0x18c) [0x559ff745f08c]
>>>>> >  10: (BlueFS::mount()+0xeb) [0x559ff748085b]
>>>>> >  11: (BlueStore::_open_bluefs(bool, bool)+0x94) [0x559ff735e464]
>>>>> >  12: (BlueStore::_prepare_db_environment(bool, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*)+0x6d9) [0x559ff735f5b9]
>>>>> >  13: (BlueStore::_open_db(bool, bool, bool)+0x155) [0x559ff73608b5]
>>>>> >  14: (BlueStore::_open_db_and_around(bool, bool)+0x273) [0x559ff73cba33]
>>>>> >  15: (BlueStore::_mount()+0x204) [0x559ff73ce974]
>>>>> >  16: (OSD::init()+0x380) [0x559ff6ea2400]
>>>>> >  17: main()
>>>>> >  18: __libc_start_main()
>>>>> >  19: _start()
>>>>> >  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>>> >
>>>>> > The process log shows the following
>>>>> > 2022-02-10T19:33:31.852-0800 7f22869e8080 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
>>>>> > 2022-02-10T19:33:31.852-0800 7f22869e8080 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
>>>>> > 2022-02-10T19:33:31.852-0800 7f22869e8080 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
>>>>> > terminate called after throwing an instance of 'ceph::buffer::v15_2_0::malformed_input'
>>>>> >   what():  void bluefs_super_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) no longer understand old encoding version 2 < 143: Malformed input
>>>>> > *** Caught signal (Aborted) **
>>>>> >  in thread 7f22869e8080 thread_name:ceph-osd
>>>>> >  ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)
>>>>> >  1: /lib64/libpthread.so.0(+0x12c20) [0x7f228498ac20]
>>>>> >  2: gsignal()
>>>>> >  3: abort()
>>>>> >  4: /lib64/libstdc++.so.6(+0x9009b) [0x7f2283fa109b]
>>>>> >  5: /lib64/libstdc++.so.6(+0x9653c) [0x7f2283fa753c]
>>>>> >  6: /lib64/libstdc++.so.6(+0x96597) [0x7f2283fa7597]
>>>>> >  7: /lib64/libstdc++.so.6(+0x967f8) [0x7f2283fa77f8]
>>>>> >  8: ceph-osd(+0x56301f) [0x55e6faf6301f]
>>>>> >  9: (BlueFS::_open_super()+0x18c) [0x55e6fb65f08c]
>>>>> >  10: (BlueFS::mount()+0xeb) [0x55e6fb68085b]
>>>>> >  11: (BlueStore::_open_bluefs(bool, bool)+0x94) [0x55e6fb55e464]
>>>>> >  12: (BlueStore::_prepare_db_environment(bool, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*)+0x6d9) [0x55e6fb55f5b9]
>>>>> >  13: (BlueStore::_open_db(bool, bool, bool)+0x155) [0x55e6fb5608b5]
>>>>> >  14: (BlueStore::_open_db_and_around(bool, bool)+0x273) [0x55e6fb5cba33]
>>>>> >  15: (BlueStore::_mount()+0x204) [0x55e6fb5ce974]
>>>>> >  16: (OSD::init()+0x380) [0x55e6fb0a2400]
>>>>> >  17: main()
>>>>> >  18: __libc_start_main()
>>>>> >  19: _start()
>>>>> > 2022-02-10T19:33:34.620-0800 7f22869e8080 -1 *** Caught signal (Aborted) **
>>>>> >  in thread 7f22869e8080 thread_name:ceph-osd
>>>>> >
>>>>> > Doesn't anyone have any ideas what could be going on here?
>>>>> >
>>>>> > Thanks,
>>>>> > /Chris

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx