RE: bufferlist allocation optimization ideas

This is a fairly low-level approach. What I was actually wondering is whether we can reduce the number of memory (de)allocations at a higher level, for example by improving the message lifecycle logic (from receiving a message, through performing the actual operation, to finishing it) so that it doesn't involve so many allocations and deallocations in the first place. Reducing memory allocations at the low level will help, no doubt about that, but we can probably also improve things at the higher level without risking breaking more than we need to.


With best regards / Pozdrawiam
Piotr Dałek


> -----Original Message-----
> From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-
> owner@xxxxxxxxxxxxxxx] On Behalf Of Sage Weil
> Sent: Monday, August 10, 2015 9:20 PM
> To: ceph-devel@xxxxxxxxxxxxxxx
> Subject: bufferlist allocation optimization ideas
> 
> Currently putting something in a bufferlist involves 3 allocations:
> 
>  1. raw buffer (posix_memalign, or new char[])
>  2. buffer::raw (this holds the refcount.  lifecycle matches the
>     raw buffer exactly)
>  3. bufferlist's STL list<> node, which embeds buffer::ptr
> 
> --- combine buffer and buffer::raw ---
> 
> This should be a pretty simple patch, and turns 2 allocations into one.  Most
> buffers are constructed/allocated via buffer::create_*() methods.  Those
> each look something like
> 
>   buffer::raw* buffer::create(unsigned len) {
>     return new raw_char(len);
>   }
> 
> where raw_char::raw_char() allocates the actual buffer.  Instead, allocate
> sizeof(raw_char_combined) + len, and use the right magic C++ syntax to call
> the constructor on that memory.  Something like
> 
>   raw_char_combined *foo = new (ptr) raw_char_combined(ptr);
> 
> where the raw_char_combined constructor is smart enough to figure out
> that data goes at ptr + sizeof(*this).
> 
> That takes us from 3 -> 2 allocations.
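
A minimal sketch of the combined allocation described above, in case it helps the discussion. The struct below is a hypothetical stand-in, not the actual buffer::raw class: the create()/destroy() helpers, the nref and len fields, and the use of malloc are all assumptions for illustration. It also ignores data alignment, which the posix_memalign path would need to handle.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical stand-in for buffer::raw; not the real Ceph class.
struct raw_char_combined {
  std::atomic<unsigned> nref{0};  // refcount lives in the same block as the data
  unsigned len;

  explicit raw_char_combined(unsigned l) : len(l) {}

  // The payload starts right after the header, i.e. at this + sizeof(*this).
  char* data() { return reinterpret_cast<char*>(this) + sizeof(*this); }

  // One malloc covers both the header and the payload.
  static raw_char_combined* create(unsigned len) {
    void* mem = std::malloc(sizeof(raw_char_combined) + len);
    if (!mem) throw std::bad_alloc();
    return new (mem) raw_char_combined(len);  // placement new on that memory
  }

  // Placement new needs a matching explicit destructor call and free().
  static void destroy(raw_char_combined* r) {
    assert(r != nullptr);
    r->~raw_char_combined();
    std::free(r);
  }
};
```

Because the object is built with placement new on malloc'd memory, it must never be released with a plain delete; a real patch would probably hide that behind the existing refcount-drop path.
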
> 
> An open question is whether this is always a good idea, or whether there are
> cases where 2 allocations are better, e.g. when len is exactly one page, and
> we're better off with a mempool allocation for raw and page separately.  Or
> maybe for very large buffers?  I'm really not sure what would be better...
> 
> 
> --- make bufferlist use boost::intrusive::list ---
> 
> Most buffers exist in only one list, so the indirection through the ptr is mostly
> wasted.
> 
> 1. embed a boost::intrusive::list node into buffer::ptr.  (Note that doing just
> this buys us nothing... we are just allocating ptr's and using the intrusive node
> instead of the list<> node with an embedded ptr.)
> 
> 2. embed a ptr in buffer::raw (or raw_char_combined)
> 
> When adding a buffer to the bufferlist, we use the raw_char_combined's
> embedded ptr if it is available.  Otherwise, we allocate one as before.
> 
> This would need some careful adjustment of the common append() paths,
> since they currently are all ptr-based.  One way to make this work well might
> be to embed N ptr's in raw_char_combined, on the assumption that the
> refcount for a buffer is never more than 2 or 3.  Only in extreme cases will we
> need to explicitly allocate ptr's.
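
A rough sketch of the embedded-ptr idea, under several assumptions. It uses hand-rolled prev/next hooks instead of boost::intrusive::list to stay dependency-free, and every name here (acquire, release, slots, N = 2) is hypothetical rather than taken from the Ceph tree. It also leaves out freeing the raw when the refcount drops to zero.

```cpp
#include <cassert>
#include <cstddef>

struct raw;

// A ptr that carries its own list hooks, so linking it needs no node allocation.
struct ptr {
  raw* r = nullptr;
  ptr* prev = nullptr;
  ptr* next = nullptr;
  bool embedded = false;  // true if the storage lives inside the raw
  bool in_use = false;
};

struct raw {
  static constexpr std::size_t N = 2;  // assumed typical maximum refcount
  unsigned nref = 0;
  ptr slots[N];                        // ptrs embedded in the raw itself

  raw() {
    for (ptr& s : slots) s.embedded = true;
  }

  // Hand out an embedded ptr if one is free; otherwise allocate as before.
  ptr* acquire() {
    ptr* p = nullptr;
    for (ptr& s : slots) {
      if (!s.in_use) { p = &s; break; }
    }
    if (!p) p = new ptr;  // rare path: more than N references
    p->r = this;
    p->in_use = true;
    ++nref;
    return p;
  }

  void release(ptr* p) {
    assert(p->r == this && nref > 0);
    --nref;
    if (p->embedded) {
      p->in_use = false;
      p->prev = p->next = nullptr;
    } else {
      delete p;
    }
  }
};

struct bufferlist {
  ptr* head = nullptr;
  ptr* tail = nullptr;

  // Appending links the ptr directly; no separate list node is allocated.
  void append(raw& r) {
    ptr* p = r.acquire();
    p->prev = tail;
    p->next = nullptr;
    if (tail) tail->next = p; else head = p;
    tail = p;
  }

  void clear() {
    for (ptr* p = head; p != nullptr;) {
      ptr* next = p->next;
      p->r->release(p);
      p = next;
    }
    head = tail = nullptr;
  }

  ~bufferlist() { clear(); }
};
```

With N = 2, the first two lists that reference a raw use its embedded slots, and only a third reference falls back to a heap-allocated ptr.
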
> 
> 
> Thoughts?
> sage