bufferlist allocation optimization ideas

Currently putting something in a bufferlist involves 3 allocations:

 1. raw buffer (posix_memalign, or new char[])
 2. buffer::raw (this holds the refcount; its lifecycle matches the
    raw buffer exactly)
 3. bufferlist's STL list<> node, which embeds buffer::ptr

--- combine buffer and buffer::raw ---

This should be a pretty simple patch, and turns 2 allocations into 
one.  Most buffers are constructed/allocated via buffer::create_*() 
methods.  Those each look something like

  buffer::raw* buffer::create(unsigned len) {
    return new raw_char(len);
  }

where raw_char::raw_char() allocates the actual buffer.  Instead, allocate 
sizeof(raw_char_combined) + len, and use placement new to call the 
constructor on that memory.  Something like

  raw_char_combined *foo = new (ptr) raw_char_combined(ptr);

where the raw_char_combined constructor is smart enough to figure out 
that data goes at ptr + sizeof(*this).

That takes us from 3 -> 2 allocations.
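
A minimal sketch of the combined allocation, under some assumptions: the struct below is a hypothetical stand-in for buffer::raw (the real class has more state), malloc() stands in for whatever allocator ends up being used, and the payload offset is rounded up to max_align_t rather than being exactly sizeof(*this). Page-aligned buffers (the posix_memalign case) are not handled here.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for buffer::raw: the refcount header and the
// payload live in one allocation, payload right after the header.
struct raw_char_combined {
  std::atomic<unsigned> nref{0};
  unsigned len;

  explicit raw_char_combined(unsigned l) : len(l) {}

  // First suitably aligned offset past the header (an assumption of this
  // sketch; the simplest variant would just use sizeof(*this)).
  static std::size_t data_offset() {
    const std::size_t a = alignof(std::max_align_t);
    return (sizeof(raw_char_combined) + a - 1) / a * a;
  }

  // The payload is found by pointer arithmetic, so no data pointer is stored.
  char* data() { return reinterpret_cast<char*>(this) + data_offset(); }

  // One allocation for header + payload, then placement new on that memory.
  static raw_char_combined* create(unsigned len) {
    void* mem = std::malloc(data_offset() + len);
    if (!mem) throw std::bad_alloc();
    return new (mem) raw_char_combined(len);
  }

  // Memory came from malloc, so run the destructor by hand and free it.
  static void destroy(raw_char_combined* r) {
    assert(r != nullptr);
    r->~raw_char_combined();
    std::free(r);
  }
};
```

Because the object was built with placement new, a plain delete would be wrong; teardown has to go through something like destroy() above.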

An open question is whether this is always a good idea, or whether there 
are cases where 2 allocations are better, e.g. when len is exactly one page, 
and we're better off with a mempool allocation for raw and page 
separately.  Or maybe for very large buffers?  I'm really not sure what 
would be better...


--- make bufferlist use boost::intrusive::list ---

Most buffers exist in only one list, so the indirection through the ptr 
is mostly wasted.

1. embed a boost::intrusive::list node into buffer::ptr.  (Note that 
doing just this buys us nothing... we are just allocating ptr's and using 
the intrusive node instead of the list<> node with an embedded ptr.)

2. embed a ptr in buffer::raw (or raw_char_combined)

When adding a buffer to the bufferlist, we use the raw_char_combined's 
embedded ptr if it is available.  Otherwise, we allocate one as before.

This would need some careful adjustment of the common append() paths, 
since they currently are all ptr-based.  One way to make this work 
well might be to embed N ptr's in raw_char_combined, on the assumption 
that the refcount for a buffer is never more than 2 or 3.  Only in extreme 
cases will we need to explicitly allocate ptr's.
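
To make the embedded-ptr idea concrete, here is a rough sketch. It is not the real bufferlist code: raw, ptr and bufferlist below are simplified hypothetical stand-ins, the prev/next fields are a hand-rolled substitute for a boost::intrusive::list hook (to keep the example free of dependencies), kEmbeddedPtrs = 2 is an assumed value for N, and the refcount is a plain unsigned rather than an atomic.

```cpp
#include <cassert>
#include <cstddef>

struct raw;

// A view onto a raw buffer that doubles as its own list node.
// prev/next stand in for a boost::intrusive::list hook.
struct ptr {
  raw* r = nullptr;
  unsigned off = 0, len = 0;
  ptr* prev = nullptr;
  ptr* next = nullptr;
  bool linked = false;          // true while this ptr sits in a list
  bool heap_allocated = false;  // true only on the fallback path
};

constexpr int kEmbeddedPtrs = 2;  // assumed "N" for this sketch

struct raw {
  unsigned nref = 0;
  unsigned len;
  ptr slots[kEmbeddedPtrs];  // ptr's embedded in the raw itself
  explicit raw(unsigned l) : len(l) {}

  // Return a free embedded ptr, or nullptr if all N are in use.
  ptr* grab_slot() {
    for (ptr& p : slots)
      if (!p.linked) return &p;
    return nullptr;
  }
};

struct bufferlist {
  ptr* head = nullptr;
  ptr* tail = nullptr;
  std::size_t heap_ptrs = 0;  // how often we fell back to new

  bufferlist() = default;
  bufferlist(const bufferlist&) = delete;
  bufferlist& operator=(const bufferlist&) = delete;

  void append(raw* r) {
    assert(r != nullptr);
    ptr* p = r->grab_slot();
    if (!p) {                 // extreme case: allocate a ptr as before
      p = new ptr;
      p->heap_allocated = true;
      ++heap_ptrs;
    }
    p->r = r;
    p->off = 0;
    p->len = r->len;
    p->linked = true;
    ++r->nref;
    p->prev = tail;           // link at the tail, no node allocation
    p->next = nullptr;
    if (tail) tail->next = p; else head = p;
    tail = p;
  }

  // Drop references; embedded ptr's are released, heap ones deleted.
  // The raw objects themselves are owned by the caller in this sketch.
  ~bufferlist() {
    for (ptr* p = head; p != nullptr; ) {
      ptr* next = p->next;
      --p->r->nref;
      if (p->heap_allocated) {
        delete p;
      } else {
        p->linked = false;
        p->prev = p->next = nullptr;
      }
      p = next;
    }
  }
};
```

In the common case (a raw referenced at most N times) append() allocates nothing; heap_ptrs only goes up once the embedded slots are exhausted.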


Thoughts?
sage
