A few months ago, I posted here asking why the Ceph program takes so much memory (virtual, real, and address space) for what seems to be a simple task. Nobody knew, but I have since done extensive research, have the answer now, and thought I would publish it here.

All it takes to do a Ceph "status" command is to create a TCP connection to the monitor, do a small login handshake, send a JSON document that says "status command", and receive and print the text response. This could be done in 64K, plus maybe a few megabytes of additional address space for the shared C library. Do it with the 'ceph status' command, though, and in the Hammer release the process peaks at about 700M of address space (the figure varies a lot from one run to the next) and uses 60M of real memory.

The reason is that the Ceph program uses facilities in the librados library that are meant for much more than performing a single command. These facilities are meant to be used by a full-blown server that is a Ceph client: they locate a monitor within a cluster and fail over when that monitor dies; they interpret the Ceph configuration file and adjust dynamically when that file changes; they do logging; and more. When you type 'ceph status', you are building a sophisticated command-issuing machine, having it issue one command, and then tearing it down. (A minimal librados sketch that issues the same command appears at the end of this message.)

'ceph' creates about 20 threads. They are asynchronous enough that in some runs multiple threads exist at the same time, while in other runs they come and go serially; this is why peak memory usage varies from one run to the next. In its quiescent state, ready to perform a command, the program has 13 threads standing by for various purposes. Each of these has 8M of virtual memory reserved for its stack, and most get a 64M heap arena of their own.

Finally, there is a lock auditor facility ("lockdep") that watches for locks being acquired out of order, as evidence of a bug in the code. This facility is not optional; it is always there. To keep track of all the locking, it sets up a 2000x2000 array (2000 being an upper limit on the number of locks the program might contain). At 8 bytes per entry, that's 32M of real memory. I read in a forum that this has been greatly reduced in later releases.

I was able to reduce the usage to 130M of address space and 9M of real memory, while still using most of the same librados code to do the work, by creating a stripped-down version of the librados 'MonClient' class, setting the maximum thread stack size to 1M with rlimits, making the threads share a heap with the MALLOC_ARENA_MAX environment variable, and disabling lockdep. (A small wrapper sketch showing the rlimit and MALLOC_ARENA_MAX part is also at the end of this message.)

I just thought this might be interesting to someone searching the archives for memory usage information.

--
Bryan Henderson
San Jose, California
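For reference, here is roughly what the command itself amounts to, expressed through the librados Python binding. This is a minimal sketch: it assumes a standard /etc/ceph/ceph.conf and a client.admin keyring, and of course it still drags in all the librados machinery described above, so it illustrates the protocol-level simplicity rather than the memory savings.

    import json
    import rados

    # Connect to the cluster the same way 'ceph status' does: read the
    # configuration file, then let librados find and log in to a monitor.
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()

    # The "command" is just a small JSON document sent to the monitor.
    cmd = json.dumps({"prefix": "status"})
    ret, outbuf, outs = cluster.mon_command(cmd, b'')
    print(outbuf.decode() if ret == 0 else outs)

    cluster.shutdown()

The JSON document and the text reply are the whole conversation; everything else is connection management.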
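And here is a sketch of the stack-size and heap-sharing part of the fix, written as a hypothetical wrapper that launches the stock 'ceph' binary (the stripped-down MonClient and the lockdep change require modifying the source, so they are not shown).

    import os
    import resource
    import sys

    ONE_MIB = 1 << 20

    # Threads that do not request an explicit stack size get the RLIMIT_STACK
    # soft limit as their default, so lowering it shrinks each thread's 8M
    # stack reservation to 1M.
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    resource.setrlimit(resource.RLIMIT_STACK, (ONE_MIB, hard))

    # glibc reads MALLOC_ARENA_MAX at startup; with one arena, the threads
    # share a single heap instead of reserving 64M apiece.
    env = dict(os.environ, MALLOC_ARENA_MAX='1')

    os.execvpe('ceph', ['ceph'] + sys.argv[1:], env)

The same two settings can be applied from a shell instead ('ulimit -s 1024' and 'MALLOC_ARENA_MAX=1') before running 'ceph'.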