Re: osds won't start

Can you share more detail on exactly how you upgraded? It looks like a cephadm-managed cluster. Did you install OS updates on all nodes without waiting for the first one to recover? Maybe I'm misreading, so please clarify what your update process looked like.
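
To illustrate what I mean by waiting for a node to recover: in a rolling update you would normally reboot one host, wait until all placement groups are active+clean again, and only then move on to the next host. Below is a minimal sketch of such a check. It assumes the `ceph` CLI is available on the admin host and that the JSON from `ceph status --format json` contains `pgmap.num_pgs` and `pgmap.pgs_by_state` entries with `state_name` and `count`; please verify those field names against your release. The function names are made up for illustration, and this does not diagnose the crash itself.

```python
import json
import subprocess


def all_pgs_active_clean(status: dict) -> bool:
    """Return True only if every placement group is reported as active+clean.

    `status` is the parsed JSON of `ceph status --format json`.
    The field names used here are assumptions to check against your release.
    """
    pgmap = status.get("pgmap", {})
    total = pgmap.get("num_pgs", 0)
    clean = sum(
        entry.get("count", 0)
        for entry in pgmap.get("pgs_by_state", [])
        if entry.get("state_name") == "active+clean"
    )
    # An empty or missing pgmap is treated as "not safe" rather than "clean".
    return total > 0 and clean == total


def fetch_status() -> dict:
    """Run the ceph CLI and return the cluster status as a dict."""
    result = subprocess.run(
        ["ceph", "status", "--format", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Between hosts you would call `all_pgs_active_clean(fetch_status())` in a loop with a sleep, and only reboot the next host once it returns True.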


Quoting Mazzystr <mazzystr@xxxxxxxxx>:

I applied the latest OS updates and rebooted my hosts.  Now all my OSDs fail
to start.

# cat /etc/os-release
NAME="openSUSE Tumbleweed"
# VERSION="20220207"
ID="opensuse-tumbleweed"
ID_LIKE="opensuse suse"
VERSION_ID="20220207"

# uname -a
Linux cube 5.16.5-1-default #1 SMP PREEMPT Thu Feb 3 05:26:48 UTC 2022
(1af4009) x86_64 x86_64 x86_64 GNU/Linux

container image: v16.2.7 / v16.2.7-20220201

The OSD debug log shows the following:
  -11> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1 bluefs add_block_device
bdev 0 path /var/lib/ceph/osd/ceph-0/block.wal size 50 GiB
   -10> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1  set rocksdb option
max_total_wal_size = 1073741824
    -9> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1  set rocksdb option
compaction_readahead_size = 2097152
    -8> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1  set rocksdb option
max_write_buffer_number = 4
    -7> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1  set rocksdb option
max_background_compactions = 2
    -6> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1  set rocksdb option
compression = kNoCompression
    -5> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1  set rocksdb option
writable_file_max_buffer_size = 0
    -4> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1  set rocksdb option
min_write_buffer_number_to_merge = 1
    -3> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1  set rocksdb option
recycle_log_file_num = 4
    -2> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1  set rocksdb option
write_buffer_size = 268435456
    -1> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1 bluefs mount
     0> 2022-02-10T19:14:48.387-0800 7ff1be4c3080 -1 *** Caught signal
(Aborted) **
 in thread 7ff1be4c3080 thread_name:ceph-osd

 ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific
(stable)
 1: /lib64/libpthread.so.0(+0x12c20) [0x7ff1bc465c20]
 2: gsignal()
 3: abort()
 4: /lib64/libstdc++.so.6(+0x9009b) [0x7ff1bba7c09b]
 5: /lib64/libstdc++.so.6(+0x9653c) [0x7ff1bba8253c]
 6: /lib64/libstdc++.so.6(+0x96597) [0x7ff1bba82597]
 7: /lib64/libstdc++.so.6(+0x967f8) [0x7ff1bba827f8]
 8: ceph-osd(+0x56301f) [0x559ff6d6301f]
 9: (BlueFS::_open_super()+0x18c) [0x559ff745f08c]
 10: (BlueFS::mount()+0xeb) [0x559ff748085b]
 11: (BlueStore::_open_bluefs(bool, bool)+0x94) [0x559ff735e464]
 12: (BlueStore::_prepare_db_environment(bool, bool,
std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >*, std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> >*)+0x6d9) [0x559ff735f5b9]
 13: (BlueStore::_open_db(bool, bool, bool)+0x155) [0x559ff73608b5]
 14: (BlueStore::_open_db_and_around(bool, bool)+0x273) [0x559ff73cba33]
 15: (BlueStore::_mount()+0x204) [0x559ff73ce974]
 16: (OSD::init()+0x380) [0x559ff6ea2400]
 17: main()
 18: __libc_start_main()
 19: _start()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed
to interpret this.


The process log shows the following:
2022-02-10T19:33:31.852-0800 7f22869e8080 -1 WARNING: the following
dangerous and experimental features are enabled: bluestore,rocksdb
2022-02-10T19:33:31.852-0800 7f22869e8080 -1 WARNING: the following
dangerous and experimental features are enabled: bluestore,rocksdb
2022-02-10T19:33:31.852-0800 7f22869e8080 -1 WARNING: the following
dangerous and experimental features are enabled: bluestore,rocksdb
terminate called after throwing an instance of
'ceph::buffer::v15_2_0::malformed_input'
  what():  void
bluefs_super_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) no
longer understand old encoding version 2 < 143: Malformed input
*** Caught signal (Aborted) **
 in thread 7f22869e8080 thread_name:ceph-osd
 ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific
(stable)
 1: /lib64/libpthread.so.0(+0x12c20) [0x7f228498ac20]
 2: gsignal()
 3: abort()
 4: /lib64/libstdc++.so.6(+0x9009b) [0x7f2283fa109b]
 5: /lib64/libstdc++.so.6(+0x9653c) [0x7f2283fa753c]
 6: /lib64/libstdc++.so.6(+0x96597) [0x7f2283fa7597]
 7: /lib64/libstdc++.so.6(+0x967f8) [0x7f2283fa77f8]
 8: ceph-osd(+0x56301f) [0x55e6faf6301f]
 9: (BlueFS::_open_super()+0x18c) [0x55e6fb65f08c]
 10: (BlueFS::mount()+0xeb) [0x55e6fb68085b]
 11: (BlueStore::_open_bluefs(bool, bool)+0x94) [0x55e6fb55e464]
 12: (BlueStore::_prepare_db_environment(bool, bool,
std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >*, std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> >*)+0x6d9) [0x55e6fb55f5b9]
 13: (BlueStore::_open_db(bool, bool, bool)+0x155) [0x55e6fb5608b5]
 14: (BlueStore::_open_db_and_around(bool, bool)+0x273) [0x55e6fb5cba33]
 15: (BlueStore::_mount()+0x204) [0x55e6fb5ce974]
 16: (OSD::init()+0x380) [0x55e6fb0a2400]
 17: main()
 18: __libc_start_main()
 19: _start()
2022-02-10T19:33:34.620-0800 7f22869e8080 -1 *** Caught signal (Aborted) **
 in thread 7f22869e8080 thread_name:ceph-osd


Does anyone have any idea what could be going on here?

Thanks,
/Chris
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


