On Wed, Oct 3, 2012 at 4:15 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
> On Wed, Oct 3, 2012 at 3:22 PM, Tren Blackburn <tren@xxxxxxxxxxxxxxx> wrote:
>> Hi List;
>>
>> I was advised to use the "mds cache size" option to limit the memory
>> that the mds process will take. I have it set to "32768". However,
>> the ceph-mds process is now at 50GB and still growing.
>>
>> fern ceph # ps wwaux | grep ceph-mds
>> root 895 4.3 26.6 53269304 52725820 ? Ssl Sep28 312:29
>> /usr/bin/ceph-mds -i fern --pid-file /var/run/ceph/mds.fern.pid -c
>> /etc/ceph/ceph.conf
>>
>> Have I specified the limit incorrectly? How far will it go?
>
> Oof. That looks correct; it sounds like we have a leak or some other
> kind of bug. I believe you're on Gentoo; did you build with tcmalloc?
> If so, can you run "ceph -w" in one window and then "ceph mds tell 0
> heap stats" and send back the output?
> If you didn't build with tcmalloc, can you do so and try again? We
> have noticed fragmentation issues with the default memory allocator,
> which is why we switched (though I can't imagine it'd balloon that far
> - but tcmalloc will give us some better options to diagnose it). Sorry
> I didn't mention this before!

Hey Greg! Good recall, I am on Gentoo, and I did build with tcmalloc.
Here is the information you requested:

2012-10-03 16:20:43.979673 mds.0 [INF] mds.ferntcmalloc heap stats:------------------------------------------------
2012-10-03 16:20:43.979676 mds.0 [INF] MALLOC: 53796808560 (51304.6 MiB) Bytes in use by application
2012-10-03 16:20:43.979679 mds.0 [INF] MALLOC: + 753664 ( 0.7 MiB) Bytes in page heap freelist
2012-10-03 16:20:43.979681 mds.0 [INF] MALLOC: + 93299048 ( 89.0 MiB) Bytes in central cache freelist
2012-10-03 16:20:43.979683 mds.0 [INF] MALLOC: + 6110720 ( 5.8 MiB) Bytes in transfer cache freelist
2012-10-03 16:20:43.979685 mds.0 [INF] MALLOC: + 84547880 ( 80.6 MiB) Bytes in thread cache freelists
2012-10-03 16:20:43.979686 mds.0 [INF] MALLOC: + 84606976 ( 80.7 MiB) Bytes in malloc metadata
2012-10-03 16:20:43.979688 mds.0 [INF] MALLOC: ------------
2012-10-03 16:20:43.979690 mds.0 [INF] MALLOC: = 54066126848 (51561.5 MiB) Actual memory used (physical + swap)
2012-10-03 16:20:43.979691 mds.0 [INF] MALLOC: + 0 ( 0.0 MiB) Bytes released to OS (aka unmapped)
2012-10-03 16:20:43.979693 mds.0 [INF] MALLOC: ------------
2012-10-03 16:20:43.979694 mds.0 [INF] MALLOC: = 54066126848 (51561.5 MiB) Virtual address space used
2012-10-03 16:20:43.979700 mds.0 [INF] MALLOC:
2012-10-03 16:20:43.979702 mds.0 [INF] MALLOC: 609757 Spans in use
2012-10-03 16:20:43.979703 mds.0 [INF] MALLOC: 395 Thread heaps in use
2012-10-03 16:20:43.979705 mds.0 [INF] MALLOC: 8192 Tcmalloc page size
2012-10-03 16:20:43.979710 mds.0 [INF] ------------------------------------------------
2012-10-03 16:20:43.979716 mds.0 [INF] Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
2012-10-03 16:20:43.979718 mds.0 [INF] Bytes released to the

It didn't print anything past the "Bytes released to the"...

Let me know if you need anything else.

t.
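
For anyone reading this in the archive, the heap stats figures can be sanity-checked with a few lines of Python. This is only a rough sketch: it assumes the lines keep the "MALLOC: [+|=] <bytes> (<n> MiB) <label>" shape seen in the output, the sample is that output with the timestamp and "mds.0 [INF]" prefix trimmed, and parse_heap_stats is a made-up helper name, not anything provided by Ceph or tcmalloc.

```python
import re

# Heap stats lines copied from the output in this message, with the
# timestamp and "mds.0 [INF]" prefix trimmed for brevity.
SAMPLE = """\
MALLOC: 53796808560 (51304.6 MiB) Bytes in use by application
MALLOC: + 753664 ( 0.7 MiB) Bytes in page heap freelist
MALLOC: + 93299048 ( 89.0 MiB) Bytes in central cache freelist
MALLOC: + 6110720 ( 5.8 MiB) Bytes in transfer cache freelist
MALLOC: + 84547880 ( 80.6 MiB) Bytes in thread cache freelists
MALLOC: + 84606976 ( 80.7 MiB) Bytes in malloc metadata
MALLOC: ------------
MALLOC: = 54066126848 (51561.5 MiB) Actual memory used (physical + swap)
"""

# Assumed line shape: optional "+" or "=", a byte count, "(<n> MiB)", a label.
LINE_RE = re.compile(r"MALLOC:\s*([+=]?)\s*(\d+)\s*\(\s*[\d.]+\s*MiB\)\s*(.+)")


def parse_heap_stats(text):
    """Hypothetical helper: return ({label: bytes}, total_bytes).

    Collects every component up to the first "=" line and returns that
    line's value as the total. Lines that do not match are skipped.
    """
    components = {}
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue  # separators, span counts, etc.
        sign, count, label = match.groups()
        if sign == "=":
            return components, int(count)
        components[label.strip()] = int(count)
    return components, None


components, total = parse_heap_stats(SAMPLE)
in_use = components["Bytes in use by application"]

# The listed components should add up to "Actual memory used".
print(sum(components.values()) == total)
# Share of the total that the application itself is holding.
print(f"{in_use / total:.1%} in use by application")
```

On these numbers the components do add up to the reported total, and nearly all of it is "Bytes in use by application" rather than freelists or allocator metadata, which seems to point at memory the MDS is actually holding rather than allocator fragmentation.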
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html