Can you share some more information about how exactly you upgraded? It
looks like a cephadm-managed cluster. Did you install OS updates on all
nodes without waiting for the first one to recover? Maybe I'm misreading,
so please clarify what your update process looked like.
Quoting Mazzystr <mazzystr@xxxxxxxxx>:
I applied the latest OS updates and rebooted my hosts. Now all my OSDs
fail to start.
# cat /etc/os-release
NAME="openSUSE Tumbleweed"
# VERSION="20220207"
ID="opensuse-tumbleweed"
ID_LIKE="opensuse suse"
VERSION_ID="20220207"
# uname -a
Linux cube 5.16.5-1-default #1 SMP PREEMPT Thu Feb 3 05:26:48 UTC 2022
(1af4009) x86_64 x86_64 x86_64 GNU/Linux
container image: v16.2.7 / v16.2.7-20220201
The OSD debug log shows the following:
-11> 2022-02-10T19:14:48.383-0800 7ff1be4c3080 1 bluefs add_block_device
bdev 0 path /var/lib/ceph/osd/ceph-0/block.wal size 50 GiB
-10> 2022-02-10T19:14:48.383-0800 7ff1be4c3080 1 set rocksdb option
max_total_wal_size = 1073741824
-9> 2022-02-10T19:14:48.383-0800 7ff1be4c3080 1 set rocksdb option
compaction_readahead_size = 2097152
-8> 2022-02-10T19:14:48.383-0800 7ff1be4c3080 1 set rocksdb option
max_write_buffer_number = 4
-7> 2022-02-10T19:14:48.383-0800 7ff1be4c3080 1 set rocksdb option
max_background_compactions = 2
-6> 2022-02-10T19:14:48.383-0800 7ff1be4c3080 1 set rocksdb option
compression = kNoCompression
-5> 2022-02-10T19:14:48.383-0800 7ff1be4c3080 1 set rocksdb option
writable_file_max_buffer_size = 0
-4> 2022-02-10T19:14:48.383-0800 7ff1be4c3080 1 set rocksdb option
min_write_buffer_number_to_merge = 1
-3> 2022-02-10T19:14:48.383-0800 7ff1be4c3080 1 set rocksdb option
recycle_log_file_num = 4
-2> 2022-02-10T19:14:48.383-0800 7ff1be4c3080 1 set rocksdb option
write_buffer_size = 268435456
-1> 2022-02-10T19:14:48.383-0800 7ff1be4c3080 1 bluefs mount
0> 2022-02-10T19:14:48.387-0800 7ff1be4c3080 -1 *** Caught signal
(Aborted) **
in thread 7ff1be4c3080 thread_name:ceph-osd
ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific
(stable)
1: /lib64/libpthread.so.0(+0x12c20) [0x7ff1bc465c20]
2: gsignal()
3: abort()
4: /lib64/libstdc++.so.6(+0x9009b) [0x7ff1bba7c09b]
5: /lib64/libstdc++.so.6(+0x9653c) [0x7ff1bba8253c]
6: /lib64/libstdc++.so.6(+0x96597) [0x7ff1bba82597]
7: /lib64/libstdc++.so.6(+0x967f8) [0x7ff1bba827f8]
8: ceph-osd(+0x56301f) [0x559ff6d6301f]
9: (BlueFS::_open_super()+0x18c) [0x559ff745f08c]
10: (BlueFS::mount()+0xeb) [0x559ff748085b]
11: (BlueStore::_open_bluefs(bool, bool)+0x94) [0x559ff735e464]
12: (BlueStore::_prepare_db_environment(bool, bool,
std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >*, std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> >*)+0x6d9) [0x559ff735f5b9]
13: (BlueStore::_open_db(bool, bool, bool)+0x155) [0x559ff73608b5]
14: (BlueStore::_open_db_and_around(bool, bool)+0x273) [0x559ff73cba33]
15: (BlueStore::_mount()+0x204) [0x559ff73ce974]
16: (OSD::init()+0x380) [0x559ff6ea2400]
17: main()
18: __libc_start_main()
19: _start()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed
to interpret this.
The process log shows the following:
2022-02-10T19:33:31.852-0800 7f22869e8080 -1 WARNING: the following
dangerous and experimental features are enabled: bluestore,rocksdb
2022-02-10T19:33:31.852-0800 7f22869e8080 -1 WARNING: the following
dangerous and experimental features are enabled: bluestore,rocksdb
2022-02-10T19:33:31.852-0800 7f22869e8080 -1 WARNING: the following
dangerous and experimental features are enabled: bluestore,rocksdb
terminate called after throwing an instance of
'ceph::buffer::v15_2_0::malformed_input'
what(): void
bluefs_super_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) no
longer understand old encoding version 2 < 143: Malformed input
*** Caught signal (Aborted) **
in thread 7f22869e8080 thread_name:ceph-osd
ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific
(stable)
1: /lib64/libpthread.so.0(+0x12c20) [0x7f228498ac20]
2: gsignal()
3: abort()
4: /lib64/libstdc++.so.6(+0x9009b) [0x7f2283fa109b]
5: /lib64/libstdc++.so.6(+0x9653c) [0x7f2283fa753c]
6: /lib64/libstdc++.so.6(+0x96597) [0x7f2283fa7597]
7: /lib64/libstdc++.so.6(+0x967f8) [0x7f2283fa77f8]
8: ceph-osd(+0x56301f) [0x55e6faf6301f]
9: (BlueFS::_open_super()+0x18c) [0x55e6fb65f08c]
10: (BlueFS::mount()+0xeb) [0x55e6fb68085b]
11: (BlueStore::_open_bluefs(bool, bool)+0x94) [0x55e6fb55e464]
12: (BlueStore::_prepare_db_environment(bool, bool,
std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >*, std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> >*)+0x6d9) [0x55e6fb55f5b9]
13: (BlueStore::_open_db(bool, bool, bool)+0x155) [0x55e6fb5608b5]
14: (BlueStore::_open_db_and_around(bool, bool)+0x273) [0x55e6fb5cba33]
15: (BlueStore::_mount()+0x204) [0x55e6fb5ce974]
16: (OSD::init()+0x380) [0x55e6fb0a2400]
17: main()
18: __libc_start_main()
19: _start()
2022-02-10T19:33:34.620-0800 7f22869e8080 -1 *** Caught signal (Aborted) **
in thread 7f22869e8080 thread_name:ceph-osd
Does anyone have any idea what could be going on here?
Thanks,
/Chris
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx