Hi,

After our past two days of investigation, this is no longer a new/fresh bug :)

The cause is a double unref of the fd, introduced in 3.8.5 with [1].

We have investigated this thoroughly, and the fix [2] is likely to be coming in
the next gluster update.

[1] http://review.gluster.org/#/c/15585
[2] http://review.gluster.org/#/c/15768/

--
Prasanna

On Thu, Nov 3, 2016 at 4:34 PM, Radu Radutiu <rradutiu@xxxxxxxxx> wrote:
> Hi,
>
> After updating glusterfs server to 3.8.5 (from Centos-gluster-3.8.repo) the
> KVM virtual machines (qemu-kvm-ev-2.3.0-31) that access storage using
> libgfapi are no longer able to start. The libvirt log file shows:
>
> [2016-11-02 14:26:41.864024] I [MSGID: 104045] [glfs-master.c:91:notify]
> 0-gfapi: New graph 73332d32-3937-3130-2d32-3031362d3131 (0) coming up
> [2016-11-02 14:26:41.864075] I [MSGID: 114020] [client.c:2356:notify]
> 0-testvol-client-0: parent translators are ready, attempting connect on
> transport
> [2016-11-02 14:26:41.882975] I [rpc-clnt.c:1947:rpc_clnt_reconfig]
> 0-testvol-client-0: changing port to 49152 (from 0)
> [2016-11-02 14:26:41.889362] I [MSGID: 114057]
> [client-handshake.c:1446:select_server_supported_programs]
> 0-testvol-client-0: Using Program GlusterFS 3.3, Num (1298437), Version
> (330)
> [2016-11-02 14:26:41.890001] I [MSGID: 114046]
> [client-handshake.c:1222:client_setvolume_cbk] 0-testvol-client-0: Connected
> to testvol-client-0, attached to remote volume '/data/brick1/testvol'.
> [2016-11-02 14:26:41.890035] I [MSGID: 114047]
> [client-handshake.c:1233:client_setvolume_cbk] 0-testvol-client-0: Server
> and Client lk-version numbers are not same, reopening the fds
> [2016-11-02 14:26:41.917990] I [MSGID: 114035]
> [client-handshake.c:201:client_set_lk_version_cbk] 0-testvol-client-0:
> Server lk version = 1
> [2016-11-02 14:26:41.919289] I [MSGID: 104041]
> [glfs-resolve.c:885:__glfs_active_subvol] 0-testvol: switched to graph
> 73332d32-3937-3130-2d32-3031362d3131 (0)
> [2016-11-02 14:26:41.922174] I [MSGID: 114021] [client.c:2365:notify]
> 0-testvol-client-0: current graph is no longer active, destroying rpc_client
> [2016-11-02 14:26:41.922269] I [MSGID: 114018]
> [client.c:2280:client_rpc_notify] 0-testvol-client-0: disconnected from
> testvol-client-0. Client process will keep trying to connect to glusterd
> until brick's port is available
> [2016-11-02 14:26:41.922592] I [MSGID: 101053]
> [mem-pool.c:617:mem_pool_destroy] 0-gfapi: size=84 max=1 total=1
> [2016-11-02 14:26:41.923044] I [MSGID: 101053]
> [mem-pool.c:617:mem_pool_destroy] 0-gfapi: size=188 max=2 total=2
> [2016-11-02 14:26:41.923419] I [MSGID: 101053]
> [mem-pool.c:617:mem_pool_destroy] 0-gfapi: size=140 max=2 total=2
> [2016-11-02 14:26:41.923442] I [MSGID: 101053]
> [mem-pool.c:617:mem_pool_destroy] 0-testvol-client-0: size=1324 max=2
> total=5
> [2016-11-02 14:26:41.923458] I [MSGID: 101053]
> [mem-pool.c:617:mem_pool_destroy] 0-testvol-dht: size=1148 max=0 total=0
> [2016-11-02 14:26:41.923546] I [MSGID: 101053]
> [mem-pool.c:617:mem_pool_destroy] 0-testvol-dht: size=3380 max=2 total=5
> [2016-11-02 14:26:41.923815] I [MSGID: 101053]
> [mem-pool.c:617:mem_pool_destroy] 0-testvol-read-ahead: size=188 max=0
> total=0
> [2016-11-02 14:26:41.923832] I [MSGID: 101053]
> [mem-pool.c:617:mem_pool_destroy] 0-testvol-readdir-ahead: size=60 max=0
> total=0
> [2016-11-02 14:26:41.923844] I [MSGID: 101053]
> [mem-pool.c:617:mem_pool_destroy] 0-testvol-io-cache: size=68 max=0
> total=0
> [2016-11-02 14:26:41.923856] I [MSGID: 101053]
> [mem-pool.c:617:mem_pool_destroy] 0-testvol-io-cache: size=252 max=1 total=3
> [2016-11-02 14:26:41.923877] I [io-stats.c:3747:fini] 0-testvol: io-stats
> translator unloaded
> [2016-11-02 14:26:41.924191] I [MSGID: 101191]
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread with
> index 2
> [2016-11-02 14:26:41.924232] I [MSGID: 101191]
> [event-epoll.c:659:event_dispatch_epoll_worker] 0-epoll: Exited thread with
> index 1
> 2016-11-02T14:26:42.825041Z qemu-kvm: -drive
> file=gluster://s3/testvol/c7.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none:
> Could not read L1 table: Bad file descriptor
>
> The brick is available, runs on the same host, and is mounted in another
> directory using FUSE (to confirm that it is indeed fine).
> If I downgrade the gluster server to 3.8.4, everything works fine. Has anyone
> seen this, or does anyone have an idea how to debug it?
>
> Regards,
> Radu
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@xxxxxxxxxxx
> http://www.gluster.org/mailman/listinfo/gluster-users
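
For anyone unfamiliar with the term "double unref" used above, here is a
minimal sketch of the general pattern in C. This is not the GlusterFS code
and not the actual fix in [2]: demo_fd, demo_fd_ref, demo_fd_unref and the
two helpers are hypothetical names made up for illustration. It only shows
how dropping a reference one time too many can invalidate a handle that its
owner still expects to use.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reference-counted handle; NOT GlusterFS's real fd type. */
typedef struct demo_fd {
    int  refcount;   /* number of references currently held */
    bool destroyed;  /* set when the last reference is dropped */
} demo_fd;

static void demo_fd_init(demo_fd *fd)
{
    fd->refcount = 1;        /* the creator holds the first reference */
    fd->destroyed = false;
}

static demo_fd *demo_fd_ref(demo_fd *fd)
{
    fd->refcount++;
    return fd;
}

static void demo_fd_unref(demo_fd *fd)
{
    assert(fd->refcount > 0);
    if (--fd->refcount == 0)
        fd->destroyed = true;  /* real code would release resources here */
}

/* Correct helper: drops only the one reference it was handed. */
static void helper_correct(demo_fd *fd)
{
    demo_fd_unref(fd);
}

/* Buggy helper: drops its own reference and, by mistake, the owner's too. */
static void helper_double_unref(demo_fd *fd)
{
    demo_fd_unref(fd);
    demo_fd_unref(fd);
}

/* Returns true if the owner's handle is still valid after the helper ran. */
static bool owner_can_use_after(void (*helper)(demo_fd *))
{
    demo_fd fd;
    demo_fd_init(&fd);   /* owner holds 1 reference */
    demo_fd_ref(&fd);    /* a second reference is handed to the helper */
    helper(&fd);
    return !fd.destroyed;
}
```

Calling owner_can_use_after(helper_correct) leaves the owner with a valid
handle, while owner_can_use_after(helper_double_unref) does not: the count
reaches zero while the owner still believes it holds a reference. Whether
this is exactly how the "Bad file descriptor" error above arises is an
assumption on my part; the reviews [1] and [2] have the real details.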