concurrent "gluster volume status" crashes the command (v3.4 and v3.7)

Dear list,

Running "gluster volume status" concurrently on all 3 GlusterFS nodes (which are actually LXC containers) somehow breaks the command. Two nodes reply "Another transaction is in progress. Please try again after sometime." and on the 3rd node the command hangs forever. Aborting the hanging command and running it again also yields "Another transaction is in progress. Please try again after sometime." on that machine.
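
For reference, this is roughly how we trigger it (a minimal sketch; the hostnames node1..node3 are placeholders for our three containers):

# Fire the status command on all three peers at (roughly) the same time.
for host in node1 node2 node3; do
    ssh -n "$host" 'gluster volume status' &
done
wait    # two invocations error out, the third hangs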

An strace of the command ends like this (the "Another transaction is in progress" message near the end is the CLI's own stderr interleaved with the trace output):

[...]
connect(7, {sa_family=AF_LOCAL, sun_path="/var/run/gluster/quotad.socket"}, 110) = -1 ENOENT (No such file or directory)
fcntl(7, F_GETFL)                       = 0x802 (flags O_RDWR|O_NONBLOCK)
fcntl(7, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLIN|EPOLLPRI|EPOLLOUT|EPOLLONESHOT, {u32=1, u64=4294967297}}) = 0
pipe([8, 9])                            = 0
fcntl(9, F_SETFD, FD_CLOEXEC)           = 0
pipe([10, 11])                          = 0
fcntl(10, F_GETFL)                      = 0 (flags O_RDONLY)
fstat(10, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f67780e5000
lseek(10, 0, SEEK_CUR)                  = -1 ESPIPE (Illegal seek)
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f67780d9a50) = 28493
close(-1)                               = -1 EBADF (Bad file descriptor)
close(11)                               = 0
close(-1)                               = -1 EBADF (Bad file descriptor)
close(9)                                = 0
read(8, "", 4)                          = 0
close(8)                                = 0
read(10, "gsyncd.py 0.0.1\n", 4096)     = 16
wait4(28493, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 28493
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=28493, si_status=0, si_utime=5, si_stime=1} ---
close(10)                               = 0
munmap(0x7f67780e5000, 4096)            = 0
close(-1)                               = -1 EBADF (Bad file descriptor)
close(-2)                               = -1 EBADF (Bad file descriptor)
close(-1)                               = -1 EBADF (Bad file descriptor)
mmap(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f6773545000
mprotect(0x7f6773545000, 4096, PROT_NONE) = 0
clone(child_stack=0x7f6773d44f70, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f6773d459d0, tls=0x7f6773d45700, child_tidptr=0x7f6773d459d0) = 28496
mmap(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f6772d44000
mprotect(0x7f6772d44000, 4096, PROT_NONE) = 0
clone(child_stack=0x7f6773543f70, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f67735449d0, tls=0x7f6773544700, child_tidptr=0x7f67735449d0) = 28497
futex(0x7f67735449d0, FUTEX_WAIT, 28497, NULLAnother transaction is in progress. Please try again after sometime.
 <unfinished ...>
+++ exited with 1 +++

I had to stop all volumes and restart glusterd to resolve the problem.
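
Concretely, the recovery looked roughly like this on each node (a sketch; <volname> is a placeholder and has to be repeated for every volume, and the service name is as packaged on Ubuntu 14.04):

gluster --mode=script volume stop <volname>    # --mode=script skips the y/n prompt
service glusterfs-server restart               # restarts the glusterd management daemon
gluster volume start <volname>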

Host OS: Ubuntu 14.04 LTS
LXC OS:  Ubuntu 14.04 LTS


We first hit this issue with 3.4.2 (the official Ubuntu package) and upgraded to 3.7.5 (from Launchpad) to check whether the problem still exists. It does. Any ideas?
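
In case it helps with diagnosis, this is how we look for the stuck cluster lock on a node (a sketch; the log path is the default one for the Ubuntu packages):

grep -i 'lock' /var/log/glusterfs/etc-glusterfs-glusterd.vol.log | tail -20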

Thank you for your help,
Florian