Hi,

I'm writing this because I'm currently implementing memory tuning in Rook. The implementation is mostly based on the following POD properties:

* memory.limit: defines a hard cap on memory; if the container tries to allocate more memory than the specified limit, it gets terminated.
* memory.request: used for scheduling only (and for the OOM strategy when applying QoS).

If memory.requests is omitted for a container, it defaults to limits. If memory.limits is not set, it defaults to 0 (unbounded). If neither of the two is specified, we don't tune anything, because we don't really know what to do. (There's a rough sketch of this defaulting logic at the end of this mail.)

So far I've collected a couple of Ceph flags that are worth tuning:

* mds_cache_memory_limit
* osd_memory_target

These flags will be passed at instantiation time to the MDS and OSD daemons. Since most of the daemons have some cache flag, it would be nice to unify them under a new option, --{daemon}-memory-target. (The second sketch below shows how the flags could be derived.)

I'm also exposing the POD properties via environment variables (POD_{MEMORY,CPU}_LIMIT and POD_{MEMORY,CPU}_REQUEST) that Ceph can consume later for more autotuning. (See the third sketch below.)

One other nice thing would be to report, when containerized, that the daemon is getting close to its cgroup memory limit: we could surface something in "ceph -s", or Ceph could re-adjust some of its internal values.

As part of that PR I'm also implementing failures based on memory.limit per daemon, so I need to know the minimum amount of memory we want to recommend in production. That's not an easy call, but we have to start somewhere.

Thanks!

–––––––––
Sébastien Han
Principal Software Engineer, Storage Architect

"Always give 100%. Unless you're giving blood."
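
To make the defaulting rules concrete, here's a rough Go sketch. The package and helper names are made up; only the Kubernetes API types are real, and this isn't the final Rook code:

package tuning // hypothetical package name

import (
	v1 "k8s.io/api/core/v1"
)

// podMemory applies the defaulting rules above and returns the effective
// (request, limit) pair in bytes: an omitted request falls back to the
// limit, and an omitted limit is reported as 0 (unbounded).
func podMemory(res v1.ResourceRequirements) (request, limit int64) {
	if q, ok := res.Limits[v1.ResourceMemory]; ok {
		limit = q.Value()
	}
	if q, ok := res.Requests[v1.ResourceMemory]; ok {
		request = q.Value()
	} else {
		request = limit // omitted request defaults to the limit
	}
	return request, limit
}

// shouldTune reports whether we know enough to tune anything at all:
// with neither value set we leave Ceph's own defaults untouched.
func shouldTune(request, limit int64) bool {
	return request != 0 || limit != 0
}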
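
Second, a sketch of how the per-daemon flags could be derived from those values. The Ceph option names are real; the sizing heuristic (prefer the request, fall back to the limit, and keep ~20% headroom for the MDS since mds_cache_memory_limit only bounds the cache, not the whole process) is just a strawman, not a settled recommendation:

package tuning // hypothetical package name

import "fmt"

// memoryFlags derives the command-line flag to pass to a daemon at
// instantiation time, given the effective request/limit in bytes.
func memoryFlags(daemon string, request, limit int64) []string {
	target := request
	if target == 0 {
		target = limit
	}
	if target == 0 {
		return nil // nothing to tune
	}
	switch daemon {
	case "osd":
		return []string{fmt.Sprintf("--osd-memory-target=%d", target)}
	case "mds":
		// mds_cache_memory_limit only bounds the cache, so leave some
		// headroom for the rest of the process (0.8 is an assumption).
		cache := int64(float64(target) * 0.8)
		return []string{fmt.Sprintf("--mds-cache-memory-limit=%d", cache)}
	}
	return nil
}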
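
Finally, the env var exposure can be done with the Kubernetes downward API (resourceFieldRef). Again a sketch, with the variable names from above:

package tuning // hypothetical package name

import v1 "k8s.io/api/core/v1"

// resourceEnvVars exposes the pod resource values to the daemon so that
// Ceph can consume them later for more autotuning. Leaving Divisor unset
// defaults it to 1, i.e. plain bytes for memory and whole cores for CPU.
func resourceEnvVars() []v1.EnvVar {
	fields := []struct{ name, resource string }{
		{"POD_MEMORY_LIMIT", "limits.memory"},
		{"POD_MEMORY_REQUEST", "requests.memory"},
		{"POD_CPU_LIMIT", "limits.cpu"},
		{"POD_CPU_REQUEST", "requests.cpu"},
	}
	env := make([]v1.EnvVar, 0, len(fields))
	for _, f := range fields {
		env = append(env, v1.EnvVar{
			Name: f.name,
			ValueFrom: &v1.EnvVarSource{
				ResourceFieldRef: &v1.ResourceFieldSelector{
					Resource: f.resource,
				},
			},
		})
	}
	return env
}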