Hello,

For about a year and a half I have been supporting a Ceph cluster for my company (v15.2.3 on CentOS 8, which is already out of support). It is used only for S3. Until recently there were no serious problems that I could not deal with myself, but for the latest one, which appeared about 2 months ago, I cannot find a solution on my own.

After a firewall was added, each of the hosts was isolated from the monitoring servers for a short time (about 15-20 minutes), which led to the following error message:

ceph> health detail
HEALTH_ERR 8 hosts fail cephadm check; failed to probe daemons or devices; Module 'cephadm' has failed: cannot send (already closed?)
[WRN] CEPHADM_HOST_CHECK_FAILED: 8 hosts fail cephadm check
    host mon4 failed check: cannot send (already closed?)
    host mon5 failed check: cannot send (already closed?)
    host rgw1 failed check: cannot send (already closed?)
    host srv1 failed check: cannot send (already closed?)
    host srv2 failed check: cannot send (already closed?)
    host srv3 failed check: cannot send (already closed?)
    host srv4 failed check: cannot send (already closed?)
    host srv5 failed check: cannot send (already closed?)
[WRN] CEPHADM_REFRESH_FAILED: failed to probe daemons or devices
    host mon4 scrape failed: cannot send (already closed?)
    host mon4 ceph-volume inventory failed: cannot send (already closed?)
    host mon5 scrape failed: cannot send (already closed?)
    host mon5 ceph-volume inventory failed: cannot send (already closed?)
    host rgw1 scrape failed: cannot send (already closed?)
    host rgw1 ceph-volume inventory failed: cannot send (already closed?)
    host srv1 scrape failed: cannot send (already closed?)
    host srv1 ceph-volume inventory failed: cannot send (already closed?)
    host srv2 scrape failed: cannot send (already closed?)
    host srv2 ceph-volume inventory failed: cannot send (already closed?)
    host srv3 scrape failed: cannot send (already closed?)
    host srv3 ceph-volume inventory failed: cannot send (already closed?)
    host srv4 scrape failed: cannot send (already closed?)
    host srv4 ceph-volume inventory failed: cannot send (already closed?)
    host srv5 scrape failed: cannot send (already closed?)
    host srv5 ceph-volume inventory failed: cannot send (already closed?)

Despite these errors, the cluster is working and the data is currently accessible as normal. I have not noticed any of the services going down.

In the meantime it became necessary to add a new server, srv6. It was added to the cluster without problems and works as expected, but immediately afterwards another error appeared:

[ERR] MGR_MODULE_ERROR: Module 'cephadm' has failed: cannot send (already closed?)
    Module 'cephadm' has failed: cannot send (already closed?)

This put the cluster into the ERROR state. The hosts are alive and connected:

# ceph orch host ls
HOST       ADDR             LABELS  STATUS
adm        adm              mgr
mon1       mon1             mgr
mon2       mon2
mon3       mon3             mgr
mon4       mon4
mon5       mon5
rgw1       rgw1
rgw2-real  rgw2-real
srv1       srv1
srv2       srv2
srv3       srv3
srv4       srv4
srv5       192.168.236.215
srv6       192.168.236.216

I have read everything related to these errors that I could find in the various groups, but none of the proposed solutions has helped. Any advice is welcome.

Regards,
Kalin