Hi all,

I have a VM that sporadically goes down; its disk images are stored on GlusterFS. The only events I can find in the Gluster logs are these, from /var/log/glusterfs/bricks/data-brick2-brick.log:

[2017-05-08 09:51:17.661697] I [MSGID: 115036] [server.c:548:server_rpc_notify] 0-datastore2-server: disconnecting connection from srvpve2-9074-2017/05/04-14:12:53:301448-datastore2-client-0-0-0
[2017-05-08 09:51:17.661697] I [MSGID: 115036] [server.c:548:server_rpc_notify] 0-datastore2-server: disconnecting connection from srvpve2-9074-2017/05/04-14:12:53:367950-datastore2-client-0-0-0
[2017-05-08 09:51:17.661810] W [inodelk.c:399:pl_inodelk_log_cleanup] 0-datastore2-server: releasing lock on 66d9eefb-ee55-40ad-9f44-c55d1e809006 held by {client=0x7f4c7c004880, pid=0 lk-owner=5c7099efc97f0000}
[2017-05-08 09:51:17.661810] W [inodelk.c:399:pl_inodelk_log_cleanup] 0-datastore2-server: releasing lock on a8d82b3d-1cf9-45cf-9858-d8546710b49c held by {client=0x7f4c840f31d0, pid=0 lk-owner=5c7019fac97f0000}
[2017-05-08 09:51:17.661835] I [MSGID: 115013] [server-helpers.c:293:do_fd_cleanup] 0-datastore2-server: fd cleanup on /images/201/vm-201-disk-2.qcow2
[2017-05-08 09:51:17.661838] I [MSGID: 115013] [server-helpers.c:293:do_fd_cleanup] 0-datastore2-server: fd cleanup on /images/201/vm-201-disk-1.qcow2
[2017-05-08 09:51:17.661953] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-datastore2-server: Shutting down connection srvpve2-9074-2017/05/04-14:12:53:301448-datastore2-client-0-0-0
[2017-05-08 09:51:17.661953] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-datastore2-server: Shutting down connection srvpve2-9074-2017/05/04-14:12:53:367950-datastore2-client-0-0-0
[2017-05-08 10:01:06.210392] I [MSGID: 115029] [server-handshake.c:692:server_setvolume] 0-datastore2-server: accepted client from srvpve2-162483-2017/05/08-10:01:06:189720-datastore2-client-0-0-0 (version: 3.8.11)
[2017-05-08 10:01:06.237433] E [MSGID: 113107] [posix.c:1079:posix_seek] 0-datastore2-posix: seek failed on fd 18 length 42957209600 [No such device or address]
[2017-05-08 10:01:06.237463] E [MSGID: 115089] [server-rpc-fops.c:2007:server_seek_cbk] 0-datastore2-server: 18: SEEK-2 (a8d82b3d-1cf9-45cf-9858-d8546710b49c) ==> (No such device or address) [No such device or address]
[2017-05-08 10:01:07.019974] I [MSGID: 115029] [server-handshake.c:692:server_setvolume] 0-datastore2-server: accepted client from srvpve2-162483-2017/05/08-10:01:07:3687-datastore2-client-0-0-0 (version: 3.8.11)
[2017-05-08 10:01:07.041967] E [MSGID: 113107] [posix.c:1079:posix_seek] 0-datastore2-posix: seek failed on fd 19 length 859136720896 [No such device or address]
[2017-05-08 10:01:07.041992] E [MSGID: 115089] [server-rpc-fops.c:2007:server_seek_cbk] 0-datastore2-server: 18: SEEK-2 (66d9eefb-ee55-40ad-9f44-c55d1e809006) ==> (No such device or address) [No such device or address]

The strange part is that I cannot find any other errors. If I restart the VM, everything works as expected (it stopped at ~09:51 UTC and was started again at ~10:01 UTC).

This is not the first time this has happened, and I do not see any problems with networking or the hosts.

The Gluster version is 3.8.11. This is the affected volume (although the same thing has happened on a different one too):

Volume Name: datastore2
Type: Replicate
Volume ID: c95ebb5f-6e04-4f09-91b9-bbbe63d83aea
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: srvpve2g:/data/brick2/brick
Brick2: srvpve3g:/data/brick2/brick
Brick3: srvpve1g:/data/brick2/brick (arbiter)
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet

Any hint on how to dig deeper into the cause would be greatly appreciated.

Alessandro
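
For anyone who wants to line up the stop and restart times against the brick log, here is a minimal Python sketch. It assumes the log line layout shown in this post, and it treats MSGID 115036 ("disconnecting connection") as the start of an outage and MSGID 115029 ("accepted client") as its end, which is only what those two messages look like in the excerpt. The names parse_events and outage_windows are hypothetical helpers, not part of Gluster.

```python
import re
from datetime import datetime

# Leading part of a brick log line: "[timestamp] LEVEL [MSGID: n]".
# Lines without a MSGID (e.g. the inodelk warnings) are simply skipped.
LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] (\w) \[MSGID: (\d+)\]"
)

# Assumption: message IDs as they appear in the log excerpt in this post.
MSGID_DISCONNECT = 115036  # "disconnecting connection from ..."
MSGID_ACCEPT = 115029      # "accepted client from ..."


def parse_events(lines):
    """Return (timestamp, level, msgid) tuples for lines that carry a MSGID."""
    events = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        events.append((ts, match.group(2), int(match.group(3))))
    return events


def outage_windows(events):
    """Pair the first disconnect with the next accepted client after it."""
    windows = []
    first_disconnect = None
    for ts, _level, msgid in events:
        if msgid == MSGID_DISCONNECT:
            if first_disconnect is None:
                first_disconnect = ts
        elif msgid == MSGID_ACCEPT and first_disconnect is not None:
            windows.append((first_disconnect, ts))
            first_disconnect = None
    return windows
```

To use it, read the brick log with open(path).read().splitlines(), pass the result to parse_events, and print each (start, end) pair from outage_windows together with end - start. The same windows can then be compared against the client-side mount log and the hypervisor logs for the same period.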