----- Original Message -----
> From: "Xavier Hernandez" <xhernandez@xxxxxxxxxx>
> To: "Jeff Darcy" <jdarcy@xxxxxxxxxx>
> Cc: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> Sent: Thursday, April 28, 2016 8:15:36 PM
> Subject: Re: Possible bug in the communications layer ?
>
> Hi Jeff,
>
> On 28.04.2016 15:20, Jeff Darcy wrote:
>
> > > This happens with Gluster 3.7.11 accessed through Ganesha and gfapi.
> > > The volume is a distributed-disperse 4*(4+2). I'm able to reproduce
> > > the problem easily with the following test:
> > >
> > >   iozone -t2 -s10g -r1024k -i0 -w -F /iozone{1..2}.dat
> > >   echo 3 >/proc/sys/vm/drop_caches
> > >   iozone -t2 -s10g -r1024k -i1 -w -F /iozone{1..2}.dat
> > >
> > > The error happens soon after starting the read test. As can be seen in
> > > the data below, client3_3_readv_cbk() is processing an iovec of 116
> > > bytes, however it should be 154 bytes (the buffer in memory really does
> > > seem to contain 154 bytes). The data on the network seems ok (at least
> > > I haven't been able to identify any problem), so this must be a
> > > processing error on the client side. The last field in the cut buffer
> > > of the serialized data corresponds to the length of the xdata field:
> > > 0x26. So at least 38 more bytes should be present.
> >
> > Nice detective work, Xavi. It would be *very* interesting to see what
> > the value of the "count" parameter is (it's unfortunately optimized out).
> > I'll bet it's two, and iov[1].iov_len is 38. I have a weak memory of
> > some problems with how this iov is put together, a couple of years ago,
> > and it looks like you might have tripped over one more.
>
> It seems you are right. The count is 2 and the first 38 bytes of the
> second vector contain the remaining data of the xdata field.

This is the bug. client3_3_readv_cbk (and, for that matter, all the actors/cbks) expects the response in at most two vectors:

1. The program header containing the request or response. This is subject to encoding/decoding. This vector should point to a buffer that contains the entire program header/response contiguously.

2. If the procedure returns a payload (like a readv response or a write request), the second vector points to the entire (contiguous) payload. Note that this payload is raw and is not subject to encoding/decoding.

In your case this _clean_ separation is broken, with part of the program header slipping into the second vector that is supposed to contain only the read data (maybe because of RPC fragmentation). I think this is a bug in the socket layer. I'll update more on this.
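To make the expected layout concrete, here is a minimal, illustrative sketch (not GlusterFS source; the helper name and the header_len parameter are my own assumptions) of the kind of defensive coalescing a receiver would need if a reply header can spill past iov[0] into the payload vector:

/*
 * Illustrative sketch only -- not GlusterFS code.  header_len is assumed
 * to be the full size of the program header as computed by the caller.
 */
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

/* Return a contiguous buffer holding header_len bytes of program header,
 * stitching bytes from the start of iov[1] onto iov[0] if the header was
 * split.  Sets *allocated when the caller must free the result. */
static void *
get_contiguous_header (struct iovec *iov, int count,
                       size_t header_len, int *allocated)
{
        *allocated = 0;

        /* Normal case: the whole program header fits in iov[0]. */
        if (iov[0].iov_len >= header_len)
                return iov[0].iov_base;

        /* Broken case seen in this trace: 116 bytes in iov[0] and the
         * remaining 38 bytes of the header (the xdata blob) sitting at
         * the head of iov[1]. */
        if (count < 2 || iov[0].iov_len + iov[1].iov_len < header_len)
                return NULL;    /* reply really is short, give up */

        char *buf = malloc (header_len);
        if (!buf)
                return NULL;

        size_t spill = header_len - iov[0].iov_len;
        memcpy (buf, iov[0].iov_base, iov[0].iov_len);
        memcpy (buf + iov[0].iov_len, iov[1].iov_base, spill);

        /* The raw payload now starts 'spill' bytes into iov[1]; the
         * caller would have to adjust iov[1] before using it. */
        *allocated = 1;
        return buf;
}

That said, the proper fix is presumably in the socket layer, so that callbacks like client3_3_readv_cbk can keep assuming a contiguous program header in iov[0] and a raw payload in iov[1]; the sketch only illustrates the invariant being violated here (116 + 38 = 154 bytes of header in this trace).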
> The rest of the data in the second vector seems to be the payload of the
> readv fop, plus 2 bytes of padding:
>
> (gdb) f 0
> #0  client3_3_readv_cbk (req=0x7fdc4051a31c, iov=0x7fdc4051a35c, count=<optimized out>, myframe=0x7fdc520d505c) at client-rpc-fops.c:3021
> 3021            gf_msg (this->name, GF_LOG_ERROR, EINVAL,
> (gdb) print *iov
> $2 = {iov_base = 0x7fdc14b0d018, iov_len = 116}
> (gdb) f 1
> #1  0x00007fdc56dafab0 in rpc_clnt_handle_reply (clnt=clnt@entry=0x7fdc3c1f4bb0, pollin=pollin@entry=0x7fdc34010f20) at rpc-clnt.c:764
> 764             req->cbkfn (req, req->rsp, req->rspcnt, saved_frame->frame);
> (gdb) print *pollin
> $3 = {vector = {{iov_base = 0x7fdc14b0d000, iov_len = 140}, {iov_base = 0x7fdc14a4d000, iov_len = 32808}, {iov_base = 0x0, iov_len = 0} <repeats 14 times>}, count = 2,
>   vectored = 1 '\001', private = 0x7fdc340106c0, iobref = 0x7fdc34006660, hdr_iobuf = 0x7fdc3c4c07c0, is_reply = 1 '\001'}
> (gdb) f 0
> #0  client3_3_readv_cbk (req=0x7fdc4051a31c, iov=0x7fdc4051a35c, count=<optimized out>, myframe=0x7fdc520d505c) at client-rpc-fops.c:3021
> 3021            gf_msg (this->name, GF_LOG_ERROR, EINVAL,
> (gdb) print iov[1]
> $4 = {iov_base = 0x7fdc14a4d000, iov_len = 32808}
> (gdb) print iov[2]
> $5 = {iov_base = 0x2, iov_len = 140583741974112}
> (gdb) x/128xb 0x7fdc14a4d000
> 0x7fdc14a4d000: 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x17
> 0x7fdc14a4d008: 0x00 0x00 0x00 0x02 0x67 0x6c 0x75 0x73
> 0x7fdc14a4d010: 0x74 0x65 0x72 0x66 0x73 0x2e 0x69 0x6e
> 0x7fdc14a4d018: 0x6f 0x64 0x65 0x6c 0x6b 0x2d 0x63 0x6f
> 0x7fdc14a4d020: 0x75 0x6e 0x74 0x00 0x31 0x00 0x00 0x00
> 0x7fdc14a4d028: 0x5c 0x5c 0x5c 0x5c 0x5c 0x5c 0x5c 0x5c
> 0x7fdc14a4d030: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
> 0x7fdc14a4d038: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
> 0x7fdc14a4d040: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
> 0x7fdc14a4d048: 0x5c 0x00 0x00 0x00 0x00 0x00 0x00 0x00
> 0x7fdc14a4d050: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
> 0x7fdc14a4d058: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
> 0x7fdc14a4d060: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
> 0x7fdc14a4d068: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
> 0x7fdc14a4d070: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
> 0x7fdc14a4d078: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
>
> > Maybe it's related to all that epoll stuff.
>
> I'm currently using 4 epoll threads (this improves ec performance). I'll
> try to repeat the tests with a single epoll thread, but I'm not sure that
> will be conclusive if the problem doesn't manifest, since ec through fuse
> with 4 epoll threads doesn't seem to trigger it.
>
> Xavi

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel