Yeah, that works if there is a working mgr to send the command to. I was
assuming here that all the mgr daemons were down, since it was a fresh
cluster, so all the mgrs would have this bugged image.

On Wed, Jul 27, 2022 at 12:07 PM Vikhyat Umrao <vikhyat@xxxxxxxxxx> wrote:

> Adam - or we could simply redeploy the daemon with the new image? At
> least this is something I did in our testing here [1].
>
> ceph orch daemon redeploy mgr.<name> quay.ceph.io/ceph-ci/ceph:f516549e3e4815795ff0343ab71b3ebf567e5531
>
> [1] https://github.com/ceph/ceph/pull/47270#issuecomment-1196062363
>
> On Wed, Jul 27, 2022 at 8:55 AM Adam King <adking@xxxxxxxxxx> wrote:
>
>> The unit.image file is just there for cephadm to look at as part of
>> gathering metadata, I think. What you'd want to edit is the unit.run
>> file (in the same directory as unit.image). It should have a really
>> long line specifying a podman/docker run command, and somewhere in
>> there will be "CONTAINER_IMAGE=<old-image-name>". You'd need to change
>> that to say
>> "CONTAINER_IMAGE=quay.ceph.io/ceph-ci/ceph:f516549e3e4815795ff0343ab71b3ebf567e5531"
>> and then restart the service.
>>
>> On Wed, Jul 27, 2022 at 11:46 AM Daniel Schreiber <
>> daniel.schreiber@xxxxxxxxxxxxxxxxxx> wrote:
>>
>> > Hi Neha,
>> >
>> > thanks for the quick response. Sorry for the stupid question: to use
>> > that image, do I pull it on the machine, change
>> > /var/lib/ceph/${clusterid}/mgr.${unit}/unit.image, and start the
>> > service?
>> >
>> > Thanks,
>> >
>> > Daniel
>> >
>> > On 27.07.22 at 17:23, Neha Ojha wrote:
>> > > Hi Daniel,
>> > >
>> > > This issue seems to be showing up in 17.2.2, details in
>> > > https://tracker.ceph.com/issues/55304. We are currently in the
>> > > process of validating the fix
>> > > https://github.com/ceph/ceph/pull/47270 and we'll try to expedite
>> > > a quick fix.
>> > >
>> > > In the meantime, we have builds/images of the dev version of the
>> > > fix, in case you want to give it a try.
>> > > https://shaman.ceph.com/builds/ceph/wip-quincy-libcephsqlite-fix/
>> > > quay.ceph.io/ceph-ci/ceph:f516549e3e4815795ff0343ab71b3ebf567e5531
>> > >
>> > > Thanks,
>> > > Neha
>> > >
>> > > On Wed, Jul 27, 2022 at 8:10 AM Daniel Schreiber
>> > > <daniel.schreiber@xxxxxxxxxxxxxxxxxx> wrote:
>> > >>
>> > >> Hi,
>> > >>
>> > >> I installed a fresh cluster using cephadm:
>> > >>
>> > >> - bootstrapped one node
>> > >> - extended it to 3 monitor nodes, each running mon + mgr, using a
>> > >> spec file
>> > >> - added 12 OSD hosts to the spec file with the following disk rules:
>> > >>
>> > >> ~~~
>> > >> service_type: osd
>> > >> service_id: osd_spec_hdd
>> > >> placement:
>> > >>   label: osd
>> > >> spec:
>> > >>   data_devices:
>> > >>     model: "HGST HUH721212AL" # HDDs
>> > >>   db_devices:
>> > >>     model: "SAMSUNG MZ7KH1T9" # SATA SSDs
>> > >>
>> > >> ---
>> > >>
>> > >> service_type: osd
>> > >> service_id: osd_spec_nvme
>> > >> placement:
>> > >>   label: osd
>> > >> spec:
>> > >>   data_devices:
>> > >>     model: "SAMSUNG MZPLL1T6HAJQ-00005" # NVMes
>> > >> ~~~
>> > >>
>> > >> OSDs on HDD + SSD were deployed, NVMe OSDs were not.
>> > >>
>> > >> MGRs crashed, one after the other:
>> > >>
>> > >> debug -65> 2022-07-25T17:06:36.507+0000 7f4a33f80700  5 cephsqlite: FullPathname: (client.17139) 1: /.mgr:devicehealth/main.db
>> > >> debug -64> 2022-07-25T17:06:36.507+0000 7f4a34f82700  0 [dashboard INFO sso] Loading SSO DB version=1
>> > >> debug -63> 2022-07-25T17:06:36.507+0000 7f4a34f82700  4 mgr get_store get_store key: mgr/dashboard/ssodb_v1
>> > >> debug -62> 2022-07-25T17:06:36.507+0000 7f4a34f82700  4 ceph_store_get ssodb_v1 not found
>> > >> debug -61> 2022-07-25T17:06:36.507+0000 7f4a34f82700  0 [dashboard INFO root] server: ssl=no host=:: port=8080
>> > >> debug -60> 2022-07-25T17:06:36.507+0000 7f4a34f82700  0 [dashboard INFO root] Configured CherryPy, starting engine...
>> > >> debug -59> 2022-07-25T17:06:36.507+0000 7f4a34f82700  4 mgr set_uri module dashboard set URI 'http://192.168.14.201:8080/'
>> > >> debug -58> 2022-07-25T17:06:36.511+0000 7f4a64e91700  4 ceph_store_get active_devices not found
>> > >> debug -57> 2022-07-25T17:06:36.511+0000 7f4a33f80700 -1 *** Caught signal (Aborted) **
>> > >> in thread 7f4a33f80700 thread_name:devicehealth
>> > >>
>> > >> ceph version 17.2.2 (b6e46b8939c67a6cc754abb4d0ece3c8918eccc3) quincy (stable)
>> > >> 1: /lib64/libpthread.so.0(+0x12ce0) [0x7f4a9b0d0ce0]
>> > >> 2: gsignal()
>> > >> 3: abort()
>> > >> 4: /lib64/libstdc++.so.6(+0x9009b) [0x7f4a9a4cf09b]
>> > >> 5: /lib64/libstdc++.so.6(+0x9653c) [0x7f4a9a4d553c]
>> > >> 6: /lib64/libstdc++.so.6(+0x96597) [0x7f4a9a4d5597]
>> > >> 7: /lib64/libstdc++.so.6(+0x967f8) [0x7f4a9a4d57f8]
>> > >> 8: (std::__throw_regex_error(std::regex_constants::error_type, char const*)+0x4a) [0x5607b31d5eea]
>> > >> 9: (bool std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_expression_term<false, false>(std::__detail::_Compiler<std::__cxx11::regex>
>> > >> 10: (void std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_insert_bracket_matcher<false, false>(bool)+0x146) [0x5607b31e26b6]
>> > >> 11: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_bracket_expression()+0x6b) [0x5607b31e663b]
>> > >> 12: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_atom()+0x6a) [0x5607b31e671a]
>> > >> 13: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_alternative()+0xd0) [0x5607b31e6ca0]
>> > >> 14: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_disjunction()+0x30) [0x5607b31e6df0]
>> > >> 15: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_atom()+0x338) [0x5607b31e69e8]
>> > >> 16: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_alternative()+0xd0) [0x5607b31e6ca0]
>> > >> 17: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_alternative()+0x42) [0x5607b31e6c12]
>> > >> 18: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_alternative()+0x42) [0x5607b31e6c12]
>> > >> 19: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_alternative()+0x42) [0x5607b31e6c12]
>> > >> 20: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_alternative()+0x42) [0x5607b31e6c12]
>> > >> 21: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_M_disjunction()+0x30) [0x5607b31e6df0]
>> > >> 22: (std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_Compiler(char const*, char const*, std::locale const&, std::regex_constants::syn>
>> > >> 23: /lib64/libcephsqlite.so(+0x1b7ca) [0x7f4a9d8ba7ca]
>> > >> 24: /lib64/libcephsqlite.so(+0x24486) [0x7f4a9d8c3486]
>> > >> 25: /lib64/libsqlite3.so.0(+0x75f1c) [0x7f4a9d600f1c]
>> > >> 26: /lib64/libsqlite3.so.0(+0xdd4c9) [0x7f4a9d6684c9]
>> > >> 27: pysqlite_connection_init()
>> > >> 28: /lib64/libpython3.6m.so.1.0(+0x13afc6) [0x7f4a9d182fc6]
>> > >> 29: PyObject_Call()
>> > >> 30: /lib64/python3.6/lib-dynload/_sqlite3.cpython-36m-x86_64-linux-gnu.so(+0xa1f5) [0x7f4a8bdf31f5]
>> > >> 31: /lib64/libpython3.6m.so.1.0(+0x19d5f1) [0x7f4a9d1e55f1]
>> > >> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>> > >>
>> > >> Is there anything I can do to recover from this? Is there anything
>> > >> I can add to help debugging this?
>> > >>
>> > >> Thank you,
>> > >>
>> > >> Daniel
>> > >> --
>> > >> Daniel Schreiber
>> > >> Facharbeitsgruppe Systemsoftware
>> > >> Universitaetsrechenzentrum
>> > >>
>> > >> Technische Universität Chemnitz
>> > >> Straße der Nationen 62 (Raum B303)
>> > >> 09111 Chemnitz
>> > >> Germany
>> > >>
>> > >> Tel: +49 371 531 35444
>> > >> Fax: +49 371 531 835444
>> > >> _______________________________________________
>> > >> ceph-users mailing list -- ceph-users@xxxxxxx
>> > >> To unsubscribe send an email to ceph-users-leave@xxxxxxx
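[Editor's note] The unit.run edit Adam describes above can be scripted. The sketch below is illustrative, not from the thread: it applies the same substitution to a throw-away copy of a shortened, made-up podman run line so it can be run anywhere. On an affected host you would instead point UNIT_RUN at /var/lib/ceph/<fsid>/mgr.<name>/unit.run and restart the daemon's systemd unit (ceph-<fsid>@mgr.<name>.service on cephadm-managed hosts).

```shell
# Sketch only: patches a throw-away stand-in for unit.run.
# On a real host, set UNIT_RUN to /var/lib/ceph/<fsid>/mgr.<name>/unit.run
# and restart ceph-<fsid>@mgr.<name>.service afterwards.
NEW_IMAGE="quay.ceph.io/ceph-ci/ceph:f516549e3e4815795ff0343ab71b3ebf567e5531"
UNIT_RUN="$(mktemp)"

# Shortened stand-in for the long run command cephadm generates.
echo "/usr/bin/podman run --rm -e CONTAINER_IMAGE=quay.io/ceph/ceph:v17.2.2 -e NODE_NAME=host1 ..." > "$UNIT_RUN"

# Swap the image reference in place (GNU sed).
sed -i "s|CONTAINER_IMAGE=[^ ]*|CONTAINER_IMAGE=${NEW_IMAGE}|" "$UNIT_RUN"
cat "$UNIT_RUN"
```

When a working mgr is still available, `ceph orch daemon redeploy mgr.<name> <image>` as Vikhyat suggests is the cleaner path; the manual unit.run edit is only needed when every mgr is down.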