A zombie is already dead, so it cannot be killed directly. To get rid of
it, you have to kill its parent process. Find the parent with:

    ps -p 23744 -o ppid=

If the result is 1, the parent is init, which cannot be killed, so you are
stuck rebooting. Otherwise, kill that process.

Deleting a filename does not close the socket behind it, so that caused the
failure below.

Joel Young <jdy at cryregarder.com> wrote:
>On Tue, Jul 30, 2013 at 10:49 PM, Kaushal M <kshlmster at gmail.com> wrote:
>> I think I've found the problem. The problem is not with the brick port,
>> but instead with the unix domain socket used for communication between
>> glusterd and glusterfsd.
>
>Makes sense.
>
>> So this is most likely due the zombie process 23744 still listening on
>> the unix domain socket. Only one bind can be performed on a unix domain
>> socket. If another bind is tried we get an EADDRINUSE error.
>>
>> Can you kill 23744, remove /var/run/5a538b707ce5dbf525ba6d01835863bb.socket
>> and restart the brick using 'gluster volume start'. This should allow it
>> to start.
>
>It isn't possible to kill 23744 as it is zombie. fuser on the socket
>doesn't report any users. I did remove /var/run/5a53...
>
>"gluster volume start home" doesn't work as the volume is already started
>(and mounted and in use by users so I'd rather not shutdown the cluster).
>I tried a "systemctl restart glusterd.service" which did not restart the
>brick but did leave the following in
>/var/log/bricks/lhome-gluster_home.log:
>
>[2013-07-31 16:04:59.716771] I [glusterfsd.c:1910:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.4.0 (/usr/sbin/glusterfsd -s ir2 --volfile-id home.ir2.lhome-gluster_home -p /var/lib/glusterd/vols/home/run/ir2-lhome-gluster_home.pid -S /var/run/5a538b707ce5dbf525ba6d01835863bb.socket --brick-name /lhome/gluster_home -l /var/log/glusterfs/bricks/lhome-gluster_home.log --xlator-option *-posix.glusterd-uuid=9d2d74bf-9055-47a6-b3df-8c2057ea1dd9 --brick-port 49157 --xlator-option home-server.listen-port=49157)
>[2013-07-31 16:04:59.719901] I [socket.c:3480:socket_init] 0-socket.glusterfsd: SSL support is NOT enabled
>[2013-07-31 16:04:59.719936] I [socket.c:3495:socket_init] 0-socket.glusterfsd: using system polling thread
>[2013-07-31 16:04:59.720242] I [socket.c:3480:socket_init] 0-glusterfs: SSL support is NOT enabled
>[2013-07-31 16:04:59.720256] I [socket.c:3495:socket_init] 0-glusterfs: using system polling thread
>[2013-07-31 16:04:59.752491] I [graph.c:239:gf_add_cmdline_options] 0-home-server: adding option 'listen-port' for volume 'home-server' with value '49157'
>[2013-07-31 16:04:59.752514] I [graph.c:239:gf_add_cmdline_options] 0-home-posix: adding option 'glusterd-uuid' for volume 'home-posix' with value '9d2d74bf-9055-47a6-b3df-8c2057ea1dd9'
>[2013-07-31 16:04:59.753960] W [options.c:848:xl_opt_validate] 0-home-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction
>[2013-07-31 16:04:59.754000] I [socket.c:3480:socket_init] 0-tcp.home-server: SSL support is NOT enabled
>[2013-07-31 16:04:59.754025] I [socket.c:3495:socket_init] 0-tcp.home-server: using system polling thread
>[2013-07-31 16:04:59.754075] E [socket.c:695:__socket_server_bind] 0-tcp.home-server: binding to failed: Address already in use
>[2013-07-31 16:04:59.754091] E [socket.c:698:__socket_server_bind] 0-tcp.home-server: Port is already in use
>[2013-07-31 16:04:59.754108] W [rpcsvc.c:1394:rpcsvc_transport_create] 0-rpc-service: listening on transport failed
>[2013-07-31 16:04:59.754128] W [server.c:1092:init] 0-home-server: creation of listener failed
>[2013-07-31 16:04:59.754140] E [xlator.c:390:xlator_init] 0-home-server: Initialization of volume 'home-server' failed, review your volfile again
>[2013-07-31 16:04:59.754151] E [graph.c:292:glusterfs_graph_init] 0-home-server: initializing translator failed
>[2013-07-31 16:04:59.754162] E [graph.c:479:glusterfs_graph_activate] 0-graph: init failed
>[2013-07-31 16:04:59.754404] W [glusterfsd.c:1002:cleanup_and_exit] (-->/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0x90) [0x7f5794b5db10] (-->/usr/sbin/glusterfsd(mgmt_getspec_cbk+0x2fd) [0x7f5795216bcd] (-->/usr/sbin/glusterfsd(glusterfs_process_volfp+0x103) [0x7f5795212603]))) 0-: received signum (0), shutting down
>
>Which seems like it worked and then tried again and failed?
>
>Thanks!
>
>Joel
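
For anyone who wants to see the unix domain socket behaviour from this thread in isolation, here is a minimal Python sketch. It is not GlusterFS code: the function name and the demo.socket path are made up for illustration, and it assumes a Linux-like system with AF_UNIX support. It shows that a second bind on an existing socket path fails with EADDRINUSE, that removing the file only removes the name (the original listener stays open), and that a fresh bind on the same path then succeeds.

```python
import errno
import os
import socket
import tempfile


def demo_unix_socket_bind():
    """Return (second_bind_errno, rebind_ok, listener_still_open)."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "demo.socket")  # hypothetical path

    # First process role: bind and listen on the socket path.
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind(path)
    listener.listen(1)

    # A second bind on the same path fails while the file exists.
    second = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    second_errno = None
    try:
        second.bind(path)
    except OSError as exc:
        second_errno = exc.errno
    finally:
        second.close()

    # Removing the file removes the name only; the listener's
    # descriptor is still valid afterwards.
    os.unlink(path)
    listener_still_open = listener.fileno() >= 0

    # With the name gone, a new socket can bind to the same path.
    third = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    third.bind(path)
    rebind_ok = os.path.exists(path)

    # Clean up everything the sketch created.
    third.close()
    listener.close()
    os.unlink(path)
    os.rmdir(workdir)
    return second_errno, rebind_ok, listener_still_open


if __name__ == "__main__":
    err, rebind_ok, still_open = demo_unix_socket_bind()
    print("second bind errno:", errno.errorcode.get(err, err))
    print("bind after unlink succeeded:", rebind_ok)
    print("original listener still open:", still_open)
```

Note that this only covers the socket file. The errors in the log above come from the tcp.home-server listener, so something still holding the brick port would have to be tracked down separately.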