On Fri, Apr 24, 2015 at 12:38 PM, Alexandre DERUMIER <aderumier@xxxxxxxxx> wrote:
> Hi,
>
> I have finished to rebuild ceph with jemalloc,
>
> all seem to working fine.
>
> I got a constant 300k iops for the moment, so no speed regression.
>
> I'll do more long benchmark next week.
>
> Regards,
>
> Alexandre

In my experience, jemalloc is much more proactive about returning memory to the OS, whereas tcmalloc in its default setting is much greedier about keeping and reusing memory. jemalloc tends to do better if your application benefits from a large page cache. Also, jemalloc's aggressive behavior is better if you're running a lot of applications per host, because you're less likely to trigger a kernel dirty writeout when allocating space (since you're not keeping large free caches around per application).

Howard of Symas and LMDB fame did some benchmarking and comparison here: http://symas.com/mdb/inmem/malloc/ He came to somewhat similar conclusions.

It would be helpful if you could reproduce the issue with tcmalloc... Turn on tcmalloc stats logging (every 1GB allocated or so), then compare the size claimed by tcmalloc to the process RSS size. If you see a large difference, especially once it is multiplied by the number of OSDs, that may be the culprit.

I know things have gotten better in tcmalloc: they fixed a few bugs where really large allocations were never returned to the OS, and they turned down the default greediness. Sadly, distros have been slow at picking these up in the past. If this is a problem, it might be worth having an option to build tcmalloc (using a version known to be good) into Ceph at build time.

--
Milosz Tanski
CTO
16 East 34th Street, 15th floor
New York, NY 10016

p: 646-253-9055
e: milosz@xxxxxxxxx
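
A rough sketch of the comparison suggested above (size claimed by tcmalloc vs. process RSS), in case it helps. It assumes a Linux host where RSS can be read from /proc/<pid>/status, and that you copy the tcmalloc-claimed byte count by hand from the tcmalloc stats log. The function names are hypothetical, and nothing here parses tcmalloc output or talks to Ceph.

```python
import re


def parse_vmrss_bytes(status_text: str) -> int:
    """Extract VmRSS from the text of /proc/<pid>/status and return it in bytes.

    The kernel reports the value in units labelled "kB", which are 1024 bytes.
    """
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no VmRSS line found in status text")
    return int(match.group(1)) * 1024


def read_rss_bytes(pid: int) -> int:
    """Read the current RSS of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vmrss_bytes(status_file.read())


def unaccounted_bytes(rss_bytes: int, tcmalloc_claimed_bytes: int) -> int:
    """RSS that the allocator's own accounting does not explain.

    tcmalloc_claimed_bytes is an assumption of this sketch: the caller
    supplies it from the tcmalloc stats log.
    """
    return rss_bytes - tcmalloc_claimed_bytes


def host_wide_gap_bytes(per_osd_gap_bytes: int, osd_count: int) -> int:
    """Scale a per-OSD gap by the number of OSDs on the host.

    Assumes every OSD shows roughly the same gap, which is only an estimate.
    """
    return per_osd_gap_bytes * osd_count
```

For example, an OSD with 2 GiB RSS where tcmalloc claims 1.5 GiB leaves 512 MiB unexplained. On a host with 12 such OSDs, that would add up to about 6 GiB.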