OSD memory (buffer_anon) grows once writing stops

Wido den Hollander <wido@xxxxxxxx> · Thu, 3 Sep 2020 04:22:35 +0200

Hi,

The cluster I'm writing about has a long history (months) of instability,
mainly related to large RocksDB databases and high memory consumption.

The use-case is RGW with an EC8+3 pool for data.

In the last months this cluster has been suffering from OSDs using much
more memory than osd_memory_target, most of it allocated in buffer_anon.

After removing a lot of data from the cluster and re-installing all OSDs 
there is one thing remaining: High memory usage when *NOT* writing data 
to the cluster.

There is a script running which keeps writing data to RADOS at a slow
pace. Once this stops, we observe the memory usage of the OSDs growing
steadily and also see the RocksDB databases of the BlueStore OSDs grow.

Once we start to write again, memory usage (buffer_anon) drops.

I think this is related to the pglogs, but even trimming all the pglogs 
does not solve this issue.
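
One way to check the pglog theory would be to sample the buffer_anon and
osd_pglog mempools side by side while writes are stopped. Below is a minimal
sketch, not what is running on this cluster. It assumes the JSON layout of
"ceph daemon osd.N dump_mempools" on Nautilus (mempool -> by_pool -> <name>
-> bytes), that it runs on the host carrying the OSD, and the OSD id,
interval and function names are placeholders.

```python
import json
import subprocess
import time


def mempool_bytes(dump, pools=("buffer_anon", "osd_pglog")):
    """Return {pool: bytes} for the requested pools.

    Assumes the layout {"mempool": {"by_pool": {name: {"items", "bytes"}}}}.
    Pools missing from the dump are skipped rather than raising.
    """
    by_pool = dump["mempool"]["by_pool"]
    return {name: by_pool[name]["bytes"] for name in pools if name in by_pool}


def read_mempools(osd_id):
    """Query one OSD over its admin socket (must run on that OSD's host)."""
    out = subprocess.check_output(
        ["ceph", "daemon", f"osd.{osd_id}", "dump_mempools"]
    )
    return json.loads(out)


def watch(osd_id, interval=60, samples=10):
    """Print a timestamped sample of the selected mempools every `interval` seconds."""
    for _ in range(samples):
        usage = mempool_bytes(read_mempools(osd_id))
        print(time.strftime("%H:%M:%S"), usage)
        time.sleep(interval)


if __name__ == "__main__":
    watch(0)  # osd.0 is a placeholder id
```

If buffer_anon keeps climbing while osd_pglog stays flat, that would point
away from the pglogs as the cause.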

Has anybody seen this before or has any clues where to start looking?

Ceph version 14.2.8

Wido
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx