I can provide any info you need, just let me know.

My original post was about 19.2.0. I have since downgraded to 18.2.4 (a full reinstall, since this is a completely new cluster I want to run), and it is the same story:

Mar 06 09:37:41 node1.ec.mts conmon[10588]: failed to collect metrics:
Mar 06 09:37:41 node1.ec.mts conmon[10588]: Traceback (most recent call last):
Mar 06 09:37:41 node1.ec.mts conmon[10588]:   File "/usr/share/ceph/mgr/prometheus/module.py", line 514, in collect
Mar 06 09:37:41 node1.ec.mts conmon[10588]:     data = self.mod.collect()
Mar 06 09:37:41 node1.ec.mts conmon[10588]:   File "/usr/share/ceph/mgr/mgr_util.py", line 862, in wrapper
Mar 06 09:37:41 node1.ec.mts conmon[10588]:     result = f(*args, **kwargs)
Mar 06 09:37:41 node1.ec.mts conmon[10588]:   File "/usr/share/ceph/mgr/prometheus/module.py", line 1719, in collect
Mar 06 09:37:41 node1.ec.mts conmon[10588]:     self.get_metadata_and_osd_status()
Mar 06 09:37:41 node1.ec.mts conmon[10588]:   File "/usr/share/ceph/mgr/mgr_util.py", line 862, in wrapper
Mar 06 09:37:41 node1.ec.mts conmon[10588]:     result = f(*args, **kwargs)
Mar 06 09:37:41 node1.ec.mts conmon[10588]:   File "/usr/share/ceph/mgr/prometheus/module.py", line 1138, in get_metadata_and_osd_status
Mar 06 09:37:41 node1.ec.mts conmon[10588]:     osd_map = self.get('osd_map')
Mar 06 09:37:41 node1.ec.mts conmon[10588]:   File "/usr/share/ceph/mgr/mgr_module.py", line 1401, in get
Mar 06 09:37:41 node1.ec.mts conmon[10588]:     obj = json.loads(obj)
Mar 06 09:37:41 node1.ec.mts conmon[10588]:   File "/lib64/python3.9/json/__init__.py", line 346, in loads
Mar 06 09:37:41 node1.ec.mts conmon[10588]:     return _default_decoder.decode(s)
Mar 06 09:37:41 node1.ec.mts conmon[10588]:   File "/lib64/python3.9/json/decoder.py", line 337, in decode
Mar 06 09:37:41 node1.ec.mts conmon[10588]:     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
Mar 06 09:37:41 node1.ec.mts conmon[10588]:   File "/lib64/python3.9/json/decoder.py", line 355, in raw_decode
Mar 06 09:37:41 node1.ec.mts conmon[10588]:     raise JSONDecodeError("Expecting value", s, err.value) from None
Mar 06 09:37:41 node1.ec.mts conmon[10588]: json.decoder.JSONDecodeError: Expecting value: line 1 column 2311 (char 2310)

A bit more info, in case it helps. This is a cluster of 6 nodes with 116 OSDs each (696 OSDs total). With 5 nodes there was no error; the error appeared once the 6th node was added. Could the high OSD density be causing it?
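For anyone trying to narrow this down: the traceback reports the failing offset (char 2310), so looking at the raw text around that offset should show which value the parser rejects. Below is a minimal Python sketch of that idea. It is not taken from Ceph; the helper name show_json_error_context and the sample string are hypothetical, and it assumes the raw osd_map JSON can be captured as a string somehow (for example from `ceph osd dump --format json`, which may or may not match exactly what the mgr module parses).

```python
import json


def show_json_error_context(raw: str, width: int = 40) -> str:
    """Return the text around the position where json.loads fails.

    Returns an empty string if `raw` parses without error.
    Hypothetical helper for illustration, not part of Ceph.
    """
    try:
        json.loads(raw)
    except json.JSONDecodeError as err:
        # err.pos is the 0-based character offset reported as "(char N)".
        start = max(err.pos - width, 0)
        end = min(err.pos + width, len(raw))
        before = raw[start:err.pos]
        after = raw[err.pos:end]
        return f"{err.msg} at char {err.pos}: {before!r} >>> {after!r}"
    return ""


if __name__ == "__main__":
    # Made-up sample, only to trigger the same "Expecting value" error;
    # it says nothing about what the real osd_map contains.
    sample = '{"osd": 1, "weight": nan}'
    print(show_json_error_context(sample))
```

Running the same helper on the real captured text would print the few dozen characters on either side of the reported offset, which might be enough to see which field breaks the parser.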