Thanks xavi. @Prasanna/Rafi/Rtalur, We had earlier seen "decoding failed" errors in quotad in another user setup. Possibly these are related. Can you please take a look? regards, Raghavendra ----- Original Message ----- > From: "Xavier Hernandez" <xhernandez@xxxxxxxxxx> > To: "Gluster Devel" <gluster-devel@xxxxxxxxxxx> > Sent: Thursday, April 28, 2016 6:13:01 PM > Subject: Possible bug in the communications layer ? > > Hi, > > I've seen what seems a bug in the communications layer. The first sign > is an "XDR decoding failed" error in the logs. > > This happens with Gluster 3.7.11 accessed through Ganesha and gfapi. The > volume is a distributed-disperse 4*(4+2). > > I'm able to reproduce the problem easily doing the following test: > > iozone -t2 -s10g -r1024k -i0 -w -F <nfs mount>/iozone{1..2}.dat > echo 3 >/proc/sys/vm/drop_caches > iozone -t2 -s10g -r1024k -i1 -w -F <nfs mount>/iozone{1..2}.dat > > The error happens soon after starting the read test. > > As can be seen in the data below, client3_3_readv_cbk() is processing an > iovec of 116 bytes, however it should be of 154 bytes (the buffer in > memory really seems to contain 154 bytes). The data on the network seems > ok (at least I haven't been able to identify any problem), so this must > be a processing error on the client side. > > The last field in cut buffer of the sequentialized data corresponds to > the length of the xdata field: 0x26. So at least 38 more byte should be > present. > > My guess is that some corner case is hit reading fragmented network > packets due to a high load. > > Debug information: > > Breakpoint 1, client3_3_readv_cbk (req=0x7f540e64106c, > iov=0x7f540e6410ac, count=<optimized out>, myframe=0x7f54259a4d54) at > client-rpc-fops.c:3021 > 3021 gf_msg (this->name, GF_LOG_ERROR, EINVAL, > (gdb) print *iov > $1 = {iov_base = 0x7f53e994e018, iov_len = 116} > (gdb) x/116xb 0x7f53e994e018 > 0x7f53e994e018: 0x00 0x00 0x80 0x00 0x00 0x00 0x00 0x00 > 0x7f53e994e020: 0xa8 0xbf 0xa3 0xe0 0x5f 0x48 0x4c 0x1e > 0x7f53e994e028: 0x80 0xa3 0x8a 0xd8 0x9d 0xa1 0x1c 0x75 > 0x7f53e994e030: 0x80 0xa3 0x8a 0xd8 0x9d 0xa1 0x1c 0x75 > 0x7f53e994e038: 0x00 0x00 0x00 0x00 0x00 0x00 0x08 0x00 > 0x7f53e994e040: 0x00 0x00 0x81 0xa0 0x00 0x00 0x00 0x01 > 0x7f53e994e048: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 > 0x7f53e994e050: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 > 0x7f53e994e058: 0x00 0x00 0x00 0x00 0xa0 0x00 0x00 0x00 > 0x7f53e994e060: 0x00 0x00 0x10 0x00 0x00 0x00 0x00 0x00 > 0x7f53e994e068: 0x00 0x50 0x00 0x00 0x57 0x22 0x04 0x1f > 0x7f53e994e070: 0x25 0x38 0x92 0x91 0x57 0x22 0x04 0xb3 > 0x7f53e994e078: 0x03 0x53 0x1b 0x13 0x57 0x22 0x04 0xb3 > 0x7f53e994e080: 0x06 0xf5 0xe1 0x99 0x00 0x00 0x80 0x00 > 0x7f53e994e088: 0x00 0x00 0x00 0x26 > (gdb) x/154xb 0x7f53e994e018 > 0x7f53e994e018: 0x00 0x00 0x80 0x00 0x00 0x00 0x00 0x00 > 0x7f53e994e020: 0xa8 0xbf 0xa3 0xe0 0x5f 0x48 0x4c 0x1e > 0x7f53e994e028: 0x80 0xa3 0x8a 0xd8 0x9d 0xa1 0x1c 0x75 > 0x7f53e994e030: 0x80 0xa3 0x8a 0xd8 0x9d 0xa1 0x1c 0x75 > 0x7f53e994e038: 0x00 0x00 0x00 0x00 0x00 0x00 0x08 0x00 > 0x7f53e994e040: 0x00 0x00 0x81 0xa0 0x00 0x00 0x00 0x01 > 0x7f53e994e048: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 > 0x7f53e994e050: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 > 0x7f53e994e058: 0x00 0x00 0x00 0x00 0xa0 0x00 0x00 0x00 > 0x7f53e994e060: 0x00 0x00 0x10 0x00 0x00 0x00 0x00 0x00 > 0x7f53e994e068: 0x00 0x50 0x00 0x00 0x57 0x22 0x04 0x1f > 0x7f53e994e070: 0x25 0x38 0x92 0x91 0x57 0x22 0x04 0xb3 > 0x7f53e994e078: 0x03 0x53 0x1b 0x13 0x57 0x22 0x04 0xb3 > 0x7f53e994e080: 0x06 0xf5 0xe1 0x99 0x00 0x00 0x80 0x00 > 0x7f53e994e088: 0x00 0x00 0x00 0x26 0x00 0x00 0x00 0x01 > 0x7f53e994e090: 0x00 0x00 0x00 0x17 0x00 0x00 0x00 0x02 > 0x7f53e994e098: 0x67 0x6c 0x75 0x73 0x74 0x65 0x72 0x66 > 0x7f53e994e0a0: 0x73 0x2e 0x69 0x6e 0x6f 0x64 0x65 0x6c > 0x7f53e994e0a8: 0x6b 0x2d 0x63 0x6f 0x75 0x6e 0x74 0x00 > 0x7f53e994e0b0: 0x31 0x00 > (gdb) bt > #0 client3_3_readv_cbk (req=0x7f540e64106c, iov=0x7f540e6410ac, > count=<optimized out>, myframe=0x7f54259a4d54) at client-rpc-fops.c:3021 > #1 0x00007f542a677ab0 in rpc_clnt_handle_reply > (clnt=clnt@entry=0x7f54101cdef0, pollin=pollin@entry=0x7f54000091f0) at > rpc-clnt.c:764 > #2 0x00007f542a677d6f in rpc_clnt_notify (trans=<optimized out>, > mydata=0x7f54101cdf20, event=<optimized out>, data=0x7f54000091f0) at > rpc-clnt.c:925 > #3 0x00007f542a673853 in rpc_transport_notify > (this=this@entry=0x7f54101ddb70, > event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, > data=data@entry=0x7f54000091f0) at rpc-transport.c:546 > #4 0x00007f541d881666 in socket_event_poll_in > (this=this@entry=0x7f54101ddb70) at socket.c:2237 > #5 0x00007f541d8842c4 in socket_event_handler (fd=fd@entry=30, > idx=idx@entry=20, data=0x7f54101ddb70, poll_in=1, poll_out=0, > poll_err=0) at socket.c:2350 > #6 0x00007f542a90aa4a in event_dispatch_epoll_handler > (event=0x7f540effc540, event_pool=0xd3d9a0) at event-epoll.c:575 > #7 event_dispatch_epoll_worker (data=0x7f54100d0890) at event-epoll.c:678 > #8 0x00007f542d0fbdc5 in start_thread () from /lib64/libpthread.so.0 > #9 0x00007f542c7bc28d in clone () from /lib64/libc.so.6 > > Xavi > _______________________________________________ > Gluster-devel mailing list > Gluster-devel@xxxxxxxxxxx > http://www.gluster.org/mailman/listinfo/gluster-devel > _______________________________________________ Gluster-devel mailing list Gluster-devel@xxxxxxxxxxx http://www.gluster.org/mailman/listinfo/gluster-devel