Fixed a similar crash in dht_getxattr_cbk here: http://review.gluster.org/#/c/10467/

Susant

----- Original Message -----
From: "Paul Guo" <bigpaulguo@xxxxxxxxxxx>
To: gluster-devel@xxxxxxxxxxx
Sent: Friday, 8 May, 2015 3:25:01 PM
Subject: gluster crashes in dht_getxattr_cbk() due to null pointer dereference.

Hi,

gdb debugging shows the root cause seems to be quite straightforward.

The gluster version is 3.4.5 and the stack:

#0  0x00007eff735fe354 in dht_getxattr_cbk (frame=0x7eff775b6360, cookie=<value optimized out>, this=<value optimized out>, op_ret=<value optimized out>, op_errno=0, xattr=<value optimized out>, xdata=0x0) at dht-common.c:2043
2043            DHT_STACK_UNWIND (getxattr, frame, local->op_ret, op_errno,
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.80.el6.x86_64 keyutils-libs-1.4-4.el6.x86_64 krb5-libs-1.9-33.el6.x86_64 libcom_err-1.41.12-12.el6.x86_64 libgcc-4.4.6-4.el6.x86_64 libselinux-2.0.94-5.3.el6.x86_64 openssl-1.0.1e-16.el6_5.14.x86_64 zlib-1.2.3-27.el6.x86_64
(gdb) bt
#0  0x00007eff735fe354 in dht_getxattr_cbk (frame=0x7eff775b6360, cookie=<value optimized out>, this=<value optimized out>, op_ret=<value optimized out>, op_errno=0, xattr=<value optimized out>, xdata=0x0) at dht-common.c:2043
#1  0x00007eff7383c168 in afr_getxattr_cbk (frame=0x7eff7756ab58, cookie=<value optimized out>, this=<value optimized out>, op_ret=0, op_errno=0, dict=0x7eff76f21dc8, xdata=0x0) at afr-inode-read.c:618
#2  0x00007eff73aaaad8 in client3_3_getxattr_cbk (req=<value optimized out>, iov=<value optimized out>, count=<value optimized out>, myframe=0x7eff77554d4c) at client-rpc-fops.c:1115
#3  0x0000003de700d6f5 in rpc_clnt_handle_reply (clnt=0xc36ad0, pollin=0x14b21560) at rpc-clnt.c:771
#4  0x0000003de700ec6f in rpc_clnt_notify (trans=<value optimized out>, mydata=0xc36b00, event=<value optimized out>, data=<value optimized out>) at rpc-clnt.c:891
#5  0x0000003de700a4e8 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>,
    data=<value optimized out>) at rpc-transport.c:497
#6  0x00007eff74af6216 in socket_event_poll_in (this=0xc46530) at socket.c:2118
#7  0x00007eff74af7c3d in socket_event_handler (fd=<value optimized out>, idx=<value optimized out>, data=0xc46530, poll_in=1, poll_out=0, poll_err=0) at socket.c:2230
#8  0x0000003de785e907 in event_dispatch_epoll_handler (event_pool=0xb70e90) at event-epoll.c:384
#9  event_dispatch_epoll (event_pool=0xb70e90) at event-epoll.c:445
#10 0x0000000000406818 in main (argc=4, argv=0x7fff24878238) at glusterfsd.c:1934

See dht_getxattr_cbk() (below). When frame->local is NULL, gluster jumps to the label "out", where it accesses local->xattr (i.e. 0->xattr) and crashes. Note that in DHT_STACK_UNWIND()->STACK_UNWIND_STRICT(), fn looks fine.

(gdb) p __local
$11 = (dht_local_t *) 0x0
(gdb) p frame->local
$12 = (void *) 0x0
(gdb) p fn
$1 = (fop_getxattr_cbk_t) 0x7eff7298c940 <mdc_readv_cbk>

I have not read much of the dht code, so I have no idea whether a NULL frame->local is normal or not, but from the code's perspective this is an obvious bug, and it still exists in the latest glusterfs workspace.

The following code change is a simple fix, but maybe there's a better one.

-        if (is_last_call (this_call_cnt)) {
+        if (is_last_call (this_call_cnt) && local != NULL) {

Similar issues exist in other functions as well, e.g. stripe_getxattr_cbk() (I did not check all the code).

int
dht_getxattr_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
                  int op_ret, int op_errno, dict_t *xattr, dict_t *xdata)
{
        int             this_call_cnt = 0;
        dht_local_t    *local = NULL;

        VALIDATE_OR_GOTO (frame, out);
        VALIDATE_OR_GOTO (frame->local, out);

        ......

out:
        if (is_last_call (this_call_cnt)) {
                DHT_STACK_UNWIND (getxattr, frame, local->op_ret, op_errno,
                                  local->xattr, NULL);
        }
        return 0;
}

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel
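
For anyone following the thread, here is a rough, self-contained sketch of the guard pattern being discussed. It is not the patch from the review linked above and it does not use the real GlusterFS types or macros: demo_frame_t, demo_local_t, demo_result_t, demo_unwind() and demo_getxattr_cbk() are all hypothetical stand-ins. It also assumes that unwinding with -1/EINVAL when local is NULL is acceptable, so that the caller still gets a reply. Whether that is the right behaviour for dht is a separate question.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for dht_local_t and call_frame_t (not the real types). */
typedef struct {
    int         op_ret;
    const char *xattr;
} demo_local_t;

typedef struct {
    demo_local_t *local;
} demo_frame_t;

/* Records what the callback handed back to its caller. */
typedef struct {
    int         unwound;
    int         op_ret;
    int         op_errno;
    const char *xattr;
} demo_result_t;

/* Hypothetical stand-in for DHT_STACK_UNWIND: it only records the values. */
static void
demo_unwind (demo_result_t *res, int op_ret, int op_errno, const char *xattr)
{
    assert (res != NULL);
    res->unwound  = 1;
    res->op_ret   = op_ret;
    res->op_errno = op_errno;
    res->xattr    = xattr;
}

static int
demo_getxattr_cbk (demo_frame_t *frame, int op_errno, demo_result_t *res)
{
    demo_local_t *local = NULL;

    /* Same early exits as in the quoted function: bail out to "out"
     * if either the frame or frame->local is missing. */
    if (frame == NULL || frame->local == NULL)
        goto out;

    local = frame->local;

out:
    if (local != NULL) {
        /* Normal path: local is valid, so its fields are safe to read. */
        demo_unwind (res, local->op_ret, op_errno, local->xattr);
    } else {
        /* Guarded path: never dereference local here. Reporting an
         * error instead of staying silent is an assumption of this sketch. */
        demo_unwind (res, -1, EINVAL, NULL);
    }
    return 0;
}
```

Calling demo_getxattr_cbk() with a frame whose local is NULL takes the guarded branch and reports an error instead of dereferencing a null pointer. With a populated local, it passes local->op_ret and local->xattr through unchanged.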