Hi Sage,

On 06/29/2017 03:25 PM, Sage Weil wrote:
> On Thu, 29 Jun 2017, Mohamad Gebai wrote:
>> If the page-aligned allocations are large, and if they are sparse (i.e.
>> there are random smaller non-page-aligned allocations in between), the
>> heap is much less fragmented, and it won't seem like the memory is
>> wasted. Does that seem like a reasonable hypothesis or did I completely
>> misunderstand the bug report?
> This all sounds right.
>
> The problem is that it is common and expected for bluestore to ask for a
> 4kb page-aligned buffer. There is the 4kb aligned allocation for the
> buffer itself, and there is the small buffer::raw tracking struct
> with the ref count and so on. This should end up consuming 4kb + a little
> bit, not 8kb.

Right, 4kb for the data and a few extra bytes for the rest. So in total,
two pages are touched and accounted to the process for each page-aligned
allocation.

> First, it would be good to confirm the allocator actually does behave this
> way. (Ick.)

I was able to reproduce this quite easily outside of Ceph. If you're
interested, the code is here:
https://github.com/mogeb/utils/tree/master/mempool. It is simply a
standalone version of the attachment in the tracker. The output of the
program is as follows:

Mem before2: VmRSS: 10900 kB
Mem after2: VmRSS: 8399680 kB
Mem actually used: 8590110720 bytes
Mem that should be used: 4294967296 bytes
Difference: 4295143424 bytes, 4.00016 gb

Also, preloading libtcmalloc makes this behavior disappear (at least for
this program), which further supports the hypothesis, since tcmalloc does
larger allocations internally.

Mohamad
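
As a sanity check on the figures in the program output above, here is a minimal sketch that redoes the arithmetic from the two VmRSS readings. It is not taken from the linked repository: the helper names parse_vmrss_kb and rss_overhead are hypothetical, and the expected size of 4 GiB is simply the "should be used" figure from the output.

```python
# Sanity check of the RSS figures quoted in the program output.
# The helper names are hypothetical, not part of the linked repository.

KIB = 1024
GIB = 1024 ** 3


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status-style text."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:     10900 kB".
            return int(line.split()[1])
    raise ValueError("no VmRSS line found")


def rss_overhead(before_kb, after_kb, expected_bytes):
    """Return (bytes actually used, bytes beyond the expected amount)."""
    used = (after_kb - before_kb) * KIB
    return used, used - expected_bytes


# The two readings reported in the message.
before_kb = parse_vmrss_kb("VmRSS:\t   10900 kB")
after_kb = parse_vmrss_kb("VmRSS:\t 8399680 kB")
expected = 4 * GIB  # 4294967296 bytes, the "should be used" figure

used, extra = rss_overhead(before_kb, after_kb, expected)
print(f"Mem actually used: {used} bytes")
print(f"Difference: {extra} bytes, {extra / GIB:.5f} GiB")
print(f"Ratio used/expected: {used / expected:.4f}")
```

The RSS delta is about twice the expected 4294967296 bytes, which matches two pages being touched per 4kb page-aligned allocation. On Linux the same parser can be fed the contents of /proc/self/status to take live readings.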