Hi,

sometimes the NetBSD regression tests hang with messages like this:

    [12:29:07] ./tests/basic/mgmt_v3-locks.t ........................................... ok    79867 ms
    No volumes present
    mount_nfs: can't access /patchy: Permission denied
    mount_nfs: can't access /patchy: Permission denied
    mount_nfs: can't access /patchy: Permission denied

Most (if not all) of these hangs are caused by a crashing Gluster/NFS
process. Once the Gluster/NFS server is no longer reachable, unmounting
fails.

The only way to recover is to reboot the VM and retrigger the test. For
rebooting, the http://build.gluster.org/job/reboot-vm job can be used, and
retriggering works by clicking the "retrigger" link in the left menu once
the test has been marked as failed/aborted.

After logging in on the NetBSD system that hangs, you can verify the cause
with these steps:

1. check if there is a /glusterfsd.core file

2. run gdb on the core:

    # cd /build/install
    # gdb --core=/glusterfsd.core sbin/glusterfs
    ...
    Program terminated with signal SIGSEGV, Segmentation fault.
    #0  0xb9b94f0b in auth_cache_lookup (cache=0xb9aa2310, fh=0xb9044bf8,
        host_addr=0xb900e400 "104.130.205.187", timestamp=0xbf7fd900,
        can_write=0xbf7fd8fc)
        at /home/jenkins/root/workspace/rackspace-netbsd7-regression-triggered/xlators/nfs/server/src/auth-cache.c:164
    164             *can_write = lookup_res->item->opts->rw;

3. verify the lookup_res structure:

    (gdb) p *lookup_res
    $1 = {timestamp = 1434284981, item = 0xb901e3b0}
    (gdb) p *lookup_res->item
    $2 = {name = 0xffffff00 <error: Cannot access memory at address 0xffffff00>,
      opts = 0xffffffff}

A fix for this has been sent; it is currently waiting for an update to the
proposed reference counting:

- http://review.gluster.org/11022
  core: add "gf_ref_t" for common refcounting structures
- http://review.gluster.org/11023
  nfs: refcount each auth_cache_entry and related data_t

Thanks,
Niels
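
P.S. For readers wondering why refcounting helps here: the garbage values
in lookup_res->item suggest (this is an inference from the gdb output, not
something confirmed above) that the item was released while a lookup still
pointed at it. The sketch below illustrates the general idea only. All
names in it (cache_item, item_ref, item_unref, refcount_demo) are
hypothetical stand-ins and are not the gf_ref_t API or the real
auth-cache structures from the patches; it is also single-threaded, so a
real implementation would need atomic counters or a lock.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins, NOT the real GlusterFS structures. */
struct export_opts {
    int rw;                     /* 1 if the export is writable */
};

struct cache_item {
    unsigned int refcount;      /* number of holders, including the cache */
    struct export_opts *opts;
};

static int items_freed = 0;     /* bookkeeping for the demo only */

/* Allocate an item; the creator owns the first reference. */
static struct cache_item *item_new(int rw)
{
    struct cache_item *item = calloc(1, sizeof(*item));
    if (!item)
        return NULL;
    item->opts = calloc(1, sizeof(*item->opts));
    if (!item->opts) {
        free(item);
        return NULL;
    }
    item->opts->rw = rw;
    item->refcount = 1;
    return item;
}

/* Take an extra reference before using the item outside the cache. */
static struct cache_item *item_ref(struct cache_item *item)
{
    item->refcount++;           /* not atomic: single-threaded sketch */
    return item;
}

/* Drop a reference; the memory is released only by the last holder. */
static void item_unref(struct cache_item *item)
{
    assert(item->refcount > 0);
    if (--item->refcount == 0) {
        free(item->opts);
        free(item);
        items_freed++;
    }
}

/* A lookup keeps its own reference, so eviction cannot free the item
 * underneath it. Returns the rw flag read after the eviction. */
int refcount_demo(void)
{
    struct cache_item *cached = item_new(1);       /* cache: ref 1 */
    if (!cached)
        return -1;

    struct cache_item *in_use = item_ref(cached);  /* lookup: ref 2 */

    item_unref(cached);         /* cache evicts the entry; item survives */

    int can_write = in_use->opts->rw;              /* still valid memory */

    item_unref(in_use);         /* last reference gone, item is freed */
    return can_write;
}
```

Calling refcount_demo() from a small main() is enough to try it. Without
the item_ref() call, the read of in_use->opts->rw would happen after the
free, which is the kind of access that can produce pointers like the ones
shown in step 3.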