Re: Gluster 3.6.2 On Xeon Phi

On 02/06/2015 05:31 AM, Rudra Siva wrote:
> Rafi,
>
> Sorry it took me some time - I had to merge these with some of my
> changes - the scif0 (iWARP) device does not support SRQ (max_srq : 0),
> so I have changed some of the code to use QPs instead - I can provide
> those changes if there is interest once this is stable.
>
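(Side note on the SRQ fallback above: querying the capability up front
through the verbs API would look roughly like the sketch below - a
hypothetical helper for illustration, not the actual change:)

    #include <infiniband/verbs.h>

    /* Hypothetical capability check: scif0 reports max_srq == 0,
     * so fall back to per-QP receive queues when the device has
     * no shared receive queue support. */
    static int
    device_supports_srq (struct ibv_context *ibctx)
    {
            struct ibv_device_attr attr;

            if (ibv_query_device (ibctx, &attr) != 0)
                    return 0;

            return (attr.max_srq > 0);
    }
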
> Here's the good -
>
> The performance with the patches is better than without (esp.
> http://review.gluster.org/#/c/9327/).
Good to hear. My thought was that http://review.gluster.org/#/c/9506/
will give much better performance than the others :-) . A rebase is
needed if it is applied on top of the other patches.

>
> The bad - glusterfsd crashes for large files so it's difficult to get
> some decent benchmark numbers 

Thanks for raising the bug. I tried to reproduce the problem on the
3.6.2 version plus the four patches with a simple distributed volume,
but I couldn't reproduce it and am still trying. (We are using Mellanox
IB cards.)

If possible, can you please share the volume info (the output of
"gluster volume info") and the workload used for the large files?


> - small ones look good - trying to
> understand the patch at this time. Looks like this code comes from
> 9327 as well.
>
> Can you please review the reset of mr_count?

Yes, the problem could be a wrong value in mr_count. I guess we failed
to reset the value to zero, so for some I/Os mr_count gets incremented
a couple of extra times and the variable can end up corrupted - the
mr_count = -939493456 in your gdb dump below looks like exactly that.
Can you apply the patch attached to this mail and try with it?
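
To make the failure mode concrete, here is a contrived, self-contained
illustration of the pattern (not the actual transport code): the post
is recycled for every I/O, the counter is bumped while it is in use,
and nothing resets it on the way back to the queue.

    #include <stdio.h>

    #define SEGMENTS 8  /* stands in for GF_RDMA_MAX_SEGMENTS */

    struct post { int mr_count; };

    int
    main (void)
    {
            struct post p = { 0 };
            int i;

            for (i = 0; i < 20; i++) {
                    p.mr_count++;   /* incremented per use of the post ... */
                                    /* ... and never reset when put back   */
            }
            printf ("mr_count after 20 reuses: %d (valid range is 0..%d)\n",
                    p.mr_count, SEGMENTS - 1);
            return 0;
    }

With real traffic a stale index would walk past the end of ctx->mr[]
well before the integer itself wraps, which may be what produced the
garbage value in your dump.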

>
> Info from gdb is as follows - if you need more or something jumps out
> please feel free to let me know.
>
> (gdb) p *post
> $16 = {next = 0x7fffe003b280, prev = 0x7fffe0037cc0, mr =
> 0x7fffe0037fb0, buf = 0x7fffe0096000 "\005\004", buf_size = 4096, aux
> = 0 '\000',
>   reused = 1, device = 0x7fffe00019c0, type = GF_RDMA_RECV_POST, ctx =
> {mr = {0x7fffe0003020, 0x7fffc8005f20, 0x7fffc8000aa0, 0x7fffc80030c0,
>       0x7fffc8002d70, 0x7fffc8008bb0, 0x7fffc8008bf0, 0x7fffc8002cd0},
> mr_count = -939493456, vector = {{iov_base = 0x7ffff7fd6000,
>         iov_len = 112}, {iov_base = 0x7fffbf140000, iov_len = 131072},
> {iov_base = 0x0, iov_len = 0} <repeats 14 times>}, count = 2,
>     iobref = 0x7fffc8001670, hdr_iobuf = 0x61d710, is_request = 0
> '\000', gf_rdma_reads = 1, reply_info = 0x0}, refcount = 1, lock = {
>     __data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0,
> __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
>     __size = '\000' <repeats 39 times>, __align = 0}}
>
> (gdb) bt
> #0  0x00007fffe7142681 in __gf_rdma_register_local_mr_for_rdma
> (peer=0x7fffe0001800, vector=0x7fffe003b108, count=1,
> ctx=0x7fffe003b0b0)
>     at rdma.c:2255
> #1  0x00007fffe7145acd in gf_rdma_do_reads (peer=0x7fffe0001800,
> post=0x7fffe003b070, readch=0x7fffe0096010) at rdma.c:3609
> #2  0x00007fffe714656e in gf_rdma_recv_request (peer=0x7fffe0001800,
> post=0x7fffe003b070, readch=0x7fffe0096010) at rdma.c:3859
> #3  0x00007fffe714691d in gf_rdma_process_recv (peer=0x7fffe0001800,
> wc=0x7fffceffcd20) at rdma.c:3967
> #4  0x00007fffe7146e7d in gf_rdma_recv_completion_proc
> (data=0x7fffe0002b30) at rdma.c:4114
> #5  0x00007ffff72cfdf3 in start_thread () from /lib64/libpthread.so.0
> #6  0x00007ffff6c403dd in clone () from /lib64/libc.so.6
>
> On Fri, Jan 30, 2015 at 7:11 AM, Mohammed Rafi K C <rkavunga@xxxxxxxxxx> wrote:
>> On 01/29/2015 06:13 PM, Rudra Siva wrote:
>>> Hi,
>>>
>>> Have been able to get Gluster running on Intel's MIC platform. The
>>> only code change to the Gluster source was for an unresolved yylex (I
>>> am not really sure why that was coming up - maybe someone more
>>> familiar with its use in Gluster can answer).
>>>
>>> At the step for compiling the binaries (glusterd, glusterfsd,
>>> glusterfs, glfsheal)  build breaks with an unresolved yylex error.
>>>
>>> For now I have a routine yylex that simply calls graphyylex - I don't
>>> know if this is even correct; however, mounting works.
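(For anyone who hits the same link error: the stub described above is
presumably just a forwarding definition along the lines of the sketch
below - a guess at the workaround, with prototypes that depend on how
flex was invoked, not verified here:)

    /* Hypothetical stand-in for the unresolved symbol: forward
     * yylex to the graph lexer that the parser actually defines. */
    int graphyylex (void);

    int
    yylex (void)
    {
            return graphyylex ();
    }
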
>>>
>>> GCC - 4.7 (an oddity - the latest GCC is missing the Phi patches)
>>>
>>> flex --version
>>> flex 2.5.39
>>>
>>> bison --version
>>> bison (GNU Bison) 3.0
>>>
>>> I'm still working on testing the RDMA and InfiniBand support and can
>>> make notes and numbers available when that is complete.
>> There are a couple of RDMA performance-related patches under review.
>> If you could make use of those patches, I hope they will give a
>> performance enhancement.
>>
>> [1] : http://review.gluster.org/#/c/9329/
>> [2] : http://review.gluster.org/#/c/9321/
>> [3] : http://review.gluster.org/#/c/9327/
>> [4] : http://review.gluster.org/#/c/9506/
>>
>> Let me know if you need any clarification.
>>
>> Regards!
>> Rafi KC
>
>

From 5112911cd2f70d11895e587c28cc4961814283e7 Mon Sep 17 00:00:00 2001
From: Mohammed Rafi KC <rkavunga@xxxxxxxxxx>
Date: Fri, 6 Feb 2015 16:07:29 +0530
Subject: [PATCH] rdma: rdma post variable is not reset before reuse

We are using a gf_rdma_post_t variable to transfer the local address
and other information to the remote end. Once the I/O is over, we put
this variable back in a queue to reuse it for subsequent I/Os.

So the counter variables get incremented, and we are not resetting
them when we put the post back in the queue.

There is a chance of the counter variables overflowing if we do not
reset them.

Change-Id: I97505a1020dbf40067c33311b37cd5ddb2633ffd
Signed-off-by: Mohammed Rafi KC <rkavunga@xxxxxxxxxx>
---
 rpc/rpc-transport/rdma/src/rdma.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/rpc/rpc-transport/rdma/src/rdma.c b/rpc/rpc-transport/rdma/src/rdma.c
index a1c8622..668b576 100644
--- a/rpc/rpc-transport/rdma/src/rdma.c
+++ b/rpc/rpc-transport/rdma/src/rdma.c
@@ -64,8 +64,28 @@ gf_rdma_cm_handle_connect_init (struct rdma_cm_event *event);
 static void
 gf_rdma_put_post (gf_rdma_queue_t *queue, gf_rdma_post_t *post)
 {
+        gf_rdma_post_context_t  *ctx = NULL;
+        int                     i    = 0;
+
         post->ctx.is_request = 0;
 
+        ctx = &post->ctx;
+        ctx->mr_count = 0;
+        ctx->count = 0;
+        ctx->iobref = NULL;
+        ctx->hdr_iobuf = NULL;
+        ctx->gf_rdma_reads = 0;
+        ctx->reply_info = NULL;
+
+        for (i = 0; i < GF_RDMA_MAX_SEGMENTS; i++) {
+                ctx->mr[i] = NULL;
+        }
+
+        for (i = 0; i < MAX_IOVEC; i++) {
+                ctx->vector[i].iov_base = NULL;
+                ctx->vector[i].iov_len = 0;
+        }
+
         pthread_mutex_lock (&queue->lock);
         {
                 if (post->prev) {
-- 
1.9.3
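
To test: save this mail as a plain file and apply it on top of the four
patches with git am (the file name below is just an example):

    git am 0001-rdma-reset-post.patch

git am keeps the authorship and commit message above; applying just the
diff portion with patch -p1 works as well.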

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel
