Hi,
I installed a fresh cluster using cephadm:
- bootstrapped one node
- extended it to 3 monitor nodes, each running mon + mgr, using a
spec file
- added 12 OSD hosts to the spec file with the following disk rules:
~~~
service_type: osd
service_id: osd_spec_hdd
placement:
  label: osd
spec:
  data_devices:
    model: "HGST HUH721212AL" # HDDs
  db_devices:
    model: "SAMSUNG MZ7KH1T9" # SATA SSDs
---
service_type: osd
service_id: osd_spec_nvme
placement:
  label: osd
spec:
  data_devices:
    model: "SAMSUNG MZPLL1T6HAJQ-00005" # NVMEs
~~~
The HDD + SSD OSDs were deployed; the NVMe OSDs were not.
MGRs crashed, one after the other:
debug -65> 2022-07-25T17:06:36.507+0000 7f4a33f80700 5 cephsqlite: FullPathname: (client.17139) 1: /.mgr:devicehealth/main.db
debug -64> 2022-07-25T17:06:36.507+0000 7f4a34f82700 0 [dashboard INFO sso] Loading SSO DB version=1
debug -63> 2022-07-25T17:06:36.507+0000 7f4a34f82700 4 mgr get_store get_store key: mgr/dashboard/ssodb_v1
debug -62> 2022-07-25T17:06:36.507+0000 7f4a34f82700 4 ceph_store_get ssodb_v1 not found
debug -61> 2022-07-25T17:06:36.507+0000 7f4a34f82700 0 [dashboard INFO root] server: ssl=no host=:: port=8080
debug -60> 2022-07-25T17:06:36.507+0000 7f4a34f82700 0 [dashboard INFO root] Configured CherryPy, starting engine...
debug -59> 2022-07-25T17:06:36.507+0000 7f4a34f82700 4 mgr set_uri module dashboard set URI 'http://192.168.14.201:8080/'
debug -58> 2022-07-25T17:06:36.511+0000 7f4a64e91700 4 ceph_store_get active_devices not found
debug -57> 2022-07-25T17:06:36.511+0000 7f4a33f80700 -1 *** Caught signal (Aborted) **
in thread 7f4a33f80700 thread_name:devicehealth
ceph version 17.2.2 (b6e46b8939c67a6cc754abb4d0ece3c8918eccc3) quincy (stable)
1: /lib64/libpthread.so.0(+0x12ce0) [0x7f4a9b0d0ce0]
2: gsignal()
3: abort()
4: /lib64/libstdc++.so.6(+0x9009b) [0x7f4a9a4cf09b]
5: /lib64/libstdc++.so.6(+0x9653c) [0x7f4a9a4d553c]
6: /lib64/libstdc++.so.6(+0x96597) [0x7f4a9a4d5597]
7: /lib64/libstdc++.so.6(+0x967f8) [0x7f4a9a4d57f8]
8: (std::__throw_regex_error(std::regex_constants::error_type, char const*)+0x4a) [0x5607b31d5eea]
9: (bool std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_expression_term<false, false>(std::__detail::_Compiler<std::__cxx11::regex>
10: (void std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_insert_bracket_matcher<false, false>(bool)+0x146) [0x5607b31e26b6]
11: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_bracket_expression()+0x6b) [0x5607b31e663b]
12: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_atom()+0x6a) [0x5607b31e671a]
13: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_alternative()+0xd0) [0x5607b31e6ca0]
14: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_disjunction()+0x30) [0x5607b31e6df0]
15: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_atom()+0x338) [0x5607b31e69e8]
16: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_alternative()+0xd0) [0x5607b31e6ca0]
17: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_alternative()+0x42) [0x5607b31e6c12]
18: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_alternative()+0x42) [0x5607b31e6c12]
19: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_alternative()+0x42) [0x5607b31e6c12]
20: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_alternative()+0x42) [0x5607b31e6c12]
21: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_disjunction()+0x30) [0x5607b31e6df0]
22: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_Compiler(char const*, char const*, std::locale const&, std::regex_constants::syn>
23: /lib64/libcephsqlite.so(+0x1b7ca) [0x7f4a9d8ba7ca]
24: /lib64/libcephsqlite.so(+0x24486) [0x7f4a9d8c3486]
25: /lib64/libsqlite3.so.0(+0x75f1c) [0x7f4a9d600f1c]
26: /lib64/libsqlite3.so.0(+0xdd4c9) [0x7f4a9d6684c9]
27: pysqlite_connection_init()
28: /lib64/libpython3.6m.so.1.0(+0x13afc6) [0x7f4a9d182fc6]
29: PyObject_Call()
30: /lib64/python3.6/lib-dynload/_sqlite3.cpython-36m-x86_64-linux-gnu.so(+0xa1f5) [0x7f4a8bdf31f5]
31: /lib64/libpython3.6m.so.1.0(+0x19d5f1) [0x7f4a9d1e55f1]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
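
In case it helps anyone triaging a dump like the one above, here is a minimal sketch that pulls out the signal, the crashing thread name, the Ceph version, and the shared objects named in the frames. The function name `summarize_crash` and the regular expressions are my own assumptions about the log layout shown here, not a Ceph tool, and the short sample is copied from the trace above.

~~~python
import re


def summarize_crash(log_text):
    """Return signal, thread name, Ceph version and shared objects from a crash dump."""
    # Rejoin lines wrapped by the mail client so patterns can span the breaks.
    flat = " ".join(line.strip() for line in log_text.splitlines())

    signal = re.search(r"Caught\s+signal\s+\((\w+)\)", flat)
    thread = re.search(r"thread_name:(\S+)", flat)
    version = re.search(r"ceph version (\S+)", flat)
    # Shared objects show up as absolute paths ending in .so or .so.N,
    # immediately followed by "(+0x...)".
    libs = sorted(set(re.findall(r"(/\S+?\.so(?:\.\d+)*)\(", flat)))

    return {
        "signal": signal.group(1) if signal else None,
        "thread": thread.group(1) if thread else None,
        "version": version.group(1) if version else None,
        "libraries": libs,
    }


# Hypothetical usage with a few lines taken from the dump above.
sample = """debug -57> 2022-07-25T17:06:36.511+0000 7f4a33f80700 -1 *** Caught
signal (Aborted) **
in thread 7f4a33f80700 thread_name:devicehealth
ceph version 17.2.2 (b6e46b8939c67a6cc754abb4d0ece3c8918eccc3) quincy
(stable)
23: /lib64/libcephsqlite.so(+0x1b7ca) [0x7f4a9d8ba7ca]
25: /lib64/libsqlite3.so.0(+0x75f1c) [0x7f4a9d600f1c]
"""
print(summarize_crash(sample))
~~~

A summary like this only condenses what is already in the log; it does not replace the full backtrace in a bug report.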
Is there anything I can do to recover from this? Is there any
additional information I can provide to help debug it?
Thank you,
Daniel
--
Daniel Schreiber
Facharbeitsgruppe Systemsoftware
Universitaetsrechenzentrum
Technische Universität Chemnitz
Straße der Nationen 62 (Raum B303)
09111 Chemnitz
Germany
Tel: +49 371 531 35444
Fax: +49 371 531 835444