On Tue Jan 13 2015 at 11:57:53 PM Mohammed Rafi K C <rkavunga@xxxxxxxxxx> wrote:
On 01/14/2015 12:11 AM, Anand Avati wrote:
3) Why not have a separate iobuf pool for RDMA?
Since every fop uses the default iobuf_pool, if we go with another iobuf_pool dedicated to RDMA, we would need to copy the buffer from the default pool to the RDMA pool, unless we intelligently allocate buffers based on the transport we are going to use. That is an extra copy in the I/O path.
I'm not sure what you mean by that. Not every fop uses the default iobuf_pool; only readv() and writev() do. If you really want to save on memory registration cost, your first target should be the header buffers, which are used in every fop and are currently valloc()ed and registered with ibv_reg_mr() on every call. Making headers use an iobuf pool in which every arena is registered at arena creation and deregistered at arena destruction will get you the largest overhead savings.
Coming to file data iobufs: today iobuf pools are used in a "mixed" way, i.e., they hold both data that is actively being transferred or under I/O and data that is held long term (cached by io-cache). io-cache just does an iobuf_ref() and holds on to the data, which avoids memory copies in the io-cache layer. However, that may be something we want to reconsider: io-cache could use its own iobuf pool, into which data is copied from the transfer iobuf (which is pre-registered with RDMA in bulk, etc.).
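To make the trade-off concrete, here is a minimal sketch of the two ways a cache could keep read data: pinning the transfer buffer with a reference, or copying into a cache-owned buffer so the transfer buffer is free for reuse. All names (struct buf, buf_ref, cache_store_by_ref, cache_store_by_copy) are hypothetical and simplified; this is not the real struct iobuf or the io-cache code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified refcounted buffer; not the real struct iobuf. */
struct buf {
    char  *data;
    size_t len;
    int    refs;
};

struct buf *buf_new(size_t len)
{
    struct buf *b = malloc(sizeof(*b));
    if (!b)
        return NULL;
    b->data = malloc(len);
    if (!b->data) {
        free(b);
        return NULL;
    }
    b->len = len;
    b->refs = 1;
    return b;
}

struct buf *buf_ref(struct buf *b)
{
    b->refs++;
    return b;
}

void buf_unref(struct buf *b)
{
    assert(b->refs > 0);
    if (--b->refs == 0) {
        free(b->data);
        free(b);
    }
}

/* "Mixed" use: the cache pins the transfer buffer. No copy is made, but
 * the (pre-registered) buffer stays busy for as long as it is cached. */
struct buf *cache_store_by_ref(struct buf *transfer)
{
    return buf_ref(transfer);
}

/* Alternative: copy into a cache-owned buffer. This costs one memcpy, but
 * the transfer buffer is not pinned and can go back to its pool. */
struct buf *cache_store_by_copy(const struct buf *transfer)
{
    struct buf *cached = buf_new(transfer->len);
    if (cached)
        memcpy(cached->data, transfer->data, transfer->len);
    return cached;
}
```

With the copy variant, the transfer buffer's reference count is unchanged after caching, so a pool of pre-registered buffers would not be drained by long-lived cache entries.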
Thanks
On Tue Jan 13 2015 at 6:30:09 AM Mohammed Rafi K C <rkavunga@xxxxxxxxxx> wrote:
Hi All,
When using the RDMA protocol, we need to register the buffer that is
going to be sent through RDMA with the RDMA device. This is a costly
operation, and a performance killer if it happens in the I/O path. So our
current plan is to register the pre-allocated iobuf_arenas from the
iobuf_pool with RDMA when RDMA is initialized. The problem comes when all
the iobufs are exhausted: we then need to dynamically allocate new
arenas from the libglusterfs module. Since the arenas are created in
libglusterfs, we can't make a call to RDMA from libglusterfs, so we are
forced to register each of the iobufs from the newly created arenas with
RDMA in the I/O path. If io-cache is turned on in the client stack, then
all the pre-registered arenas will be used by io-cache as cache buffers,
so we have to do the registration in RDMA for every iobuf on each I/O
call, and eventually we cannot make use of the pre-registered arenas.
To address the issue, we have two approaches in mind:
1) Register each dynamically created buffer in iobuf by bringing the
transport layer together with libglusterfs.
2) Create a separate buffer for caching and offload the data from the
read response to the cache buffer in the background.
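One possible shape for approach 1, sketched under stated assumptions: the pool exposes create/destroy hooks that the transport installs, so the pool code never links against RDMA but new arenas still get registered outside the I/O path. Every name here (struct pool, pool_set_hooks, pool_add_arena, fake_register, and so on) is hypothetical and not the libglusterfs API; the fake transport only counts registrations where a real one would call ibv_reg_mr() and ibv_dereg_mr().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not the libglusterfs API. */
struct arena {
    void         *mem;        /* backing memory for this arena's buffers */
    size_t        size;
    void         *reg_handle; /* opaque handle returned by the hook */
    struct arena *next;
};

/* Hooks installed by the transport. */
typedef void *(*arena_register_fn)(void *ctx, void *mem, size_t size);
typedef void (*arena_unregister_fn)(void *ctx, void *handle);

struct pool {
    struct arena       *arenas;
    arena_register_fn   on_create;
    arena_unregister_fn on_destroy;
    void               *hook_ctx;
};

/* Install hooks and register arenas that already exist (pre-allocated). */
void pool_set_hooks(struct pool *p, arena_register_fn reg,
                    arena_unregister_fn unreg, void *ctx)
{
    struct arena *a;

    assert(reg != NULL && unreg != NULL);
    p->on_create = reg;
    p->on_destroy = unreg;
    p->hook_ctx = ctx;
    for (a = p->arenas; a; a = a->next)
        if (!a->reg_handle)
            a->reg_handle = reg(ctx, a->mem, a->size);
}

/* Dynamically add an arena; registration happens here, once per arena. */
struct arena *pool_add_arena(struct pool *p, size_t size)
{
    struct arena *a = calloc(1, sizeof(*a));
    if (!a)
        return NULL;
    a->mem = malloc(size);
    if (!a->mem) {
        free(a);
        return NULL;
    }
    a->size = size;
    if (p->on_create)
        a->reg_handle = p->on_create(p->hook_ctx, a->mem, size);
    a->next = p->arenas;
    p->arenas = a;
    return a;
}

/* Unregister and free every arena. */
void pool_destroy(struct pool *p)
{
    struct arena *a = p->arenas;
    while (a) {
        struct arena *next = a->next;
        if (p->on_destroy && a->reg_handle)
            p->on_destroy(p->hook_ctx, a->reg_handle);
        free(a->mem);
        free(a);
        a = next;
    }
    p->arenas = NULL;
}

/* Stand-in transport: only counts live registrations. */
struct fake_rdma {
    int registered;
};

void *fake_register(void *ctx, void *mem, size_t size)
{
    struct fake_rdma *r = ctx;
    (void)size;
    r->registered++;
    return mem; /* the address doubles as the opaque handle */
}

void fake_unregister(void *ctx, void *handle)
{
    struct fake_rdma *r = ctx;
    (void)handle;
    r->registered--;
}
```

A caller would create the pool, let the transport call pool_set_hooks() at initialization, and from then on any arena added on exhaustion is registered at creation rather than per I/O call.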
If we could make use of pre-registered memory for every RDMA call, then we
would see an improvement of approximately 20% for writes and 25% for
reads.
Please share your thoughts on how to address the issue.
Thanks & Regards
Rafi KC
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel