Hi,

We have a Ceph 15.2.7 deployment using cephadm under podman with systemd. We've run into what we believe is:

https://github.com/ceph/ceph-container/issues/1748
https://tracker.ceph.com/issues/47875

In our case, the mgr container eventually stops emitting output/logging. We are polling with external Prometheus clusters, which is likely what triggers the issue, since it only appears some time after the container is spawned.

Unfortunately, setting limits in the systemd service file for the mgr service on the host OS doesn't work, nor does modifying the unit.run file (which is used to start the container under podman) to include the --ulimit settings as suggested.

Looking inside the container, the unit file shipped there already sets a high limit:

lib/systemd/system/ceph-mgr@.service:LimitNOFILE=1048576

This prevents us from deploying medium to large Ceph clusters, so I would argue it's a high-priority bug that should not be closed unless there is a workaround that works until EPEL 8 contains the fixed version of cheroot and the Ceph containers include it.

My understanding is that this was fixed in cheroot 8.4.0:

https://github.com/cherrypy/cheroot/issues/249
https://github.com/cherrypy/cheroot/pull/301

Thank you in advance for any suggestions,
David
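
To check whether a limit change actually reached the running mgr process, the effective open-file limit and the current descriptor count can be read from /proc on the host, since a podman container's processes are ordinary host processes. The following is only a minimal sketch under that assumption: the helper names (parse_nofile_limit, fd_usage) are hypothetical, the mgr PID has to be looked up separately, and reading another process's fd directory normally requires root.

```python
import os
from pathlib import Path


def _to_int(value):
    """Convert a limit column to int; 'unlimited' becomes None."""
    return None if value == "unlimited" else int(value)


def parse_nofile_limit(limits_text):
    """Return (soft, hard) for 'Max open files' from /proc/<pid>/limits text.

    Returns None if the row is not present.
    """
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            # Remaining columns are: soft limit, hard limit, units.
            fields = line[len(label):].split()
            return _to_int(fields[0]), _to_int(fields[1])
    return None


def fd_usage(pid, proc_root="/proc"):
    """Return (open_fds, soft_limit, hard_limit) for the given PID.

    proc_root is a parameter only so the function can be pointed at a
    fake directory tree; on a real host it stays at /proc.
    """
    base = Path(proc_root) / str(pid)
    limits = parse_nofile_limit((base / "limits").read_text())
    if limits is None:
        raise ValueError("no 'Max open files' row found for pid %s" % pid)
    # Each entry in /proc/<pid>/fd is one open file descriptor.
    open_fds = len(os.listdir(base / "fd"))
    return open_fds, limits[0], limits[1]
```

Calling fd_usage() with the PID of the ceph-mgr process returns the number of open descriptors alongside the soft and hard limits. If the soft limit is still at a low default rather than the value configured in the unit file or via --ulimit, the setting did not take effect for that process. A descriptor count that climbs toward the soft limit while Prometheus is scraping would be consistent with the cheroot issue linked above.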