So the test fails (intermittently) in check_fs, which tries to do a df on the mount point of a volume carved out of three bricks from 3 nodes while one node is completely down. A quick look at the mount log reveals the following:
[2016-10-10 13:58:59.279446]:++++++++++ G_LOG:./tests/bugs/glusterd/bug-913555.t: TEST: 48 0 check_fs /mnt/glusterfs/0 ++++++++++
[2016-10-10 13:58:59.287973] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 0-patchy-client-2: remote operation failed. Path: / (00000000-0000-0000-0000-000000000001) [Transport endpoint is not connected]
[2016-10-10 13:58:59.288326] I [MSGID: 109063] [dht-layout.c:713:dht_layout_normalize] 0-patchy-dht: Found anomalies in / (gfid = 00000000-0000-0000-0000-000000000001). Holes=1 overlaps=0
[2016-10-10 13:58:59.288352] W [MSGID: 109005] [dht-selfheal.c:2102:dht_selfheal_directory] 0-patchy-dht: Directory selfheal failed: 1 subvolumes down.Not fixing. path = /, gfid =
[2016-10-10 13:58:59.288643] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 0-patchy-client-2: remote operation failed. Path: / (00000000-0000-0000-0000-000000000001) [Transport endpoint is not connected]
[2016-10-10 13:58:59.288927] W [fuse-resolve.c:132:fuse_resolve_gfid_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001: failed to resolve (Stale file handle)
[2016-10-10 13:58:59.288949] W [fuse-bridge.c:2597:fuse_opendir_resume] 0-glusterfs-fuse: 7: OPENDIR (00000000-0000-0000-0000-000000000001) resolution failed
[2016-10-10 13:58:59.289505] W [fuse-resolve.c:132:fuse_resolve_gfid_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001: failed to resolve (Stale file handle)
[2016-10-10 13:58:59.289524] W [fuse-bridge.c:3137:fuse_statfs_resume] 0-glusterfs-fuse: 8: STATFS (00000000-0000-0000-0000-000000000001) resolution fail
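
For context on the "Holes=1" message: as I understand it, DHT expects the directory layout to cover the entire 32-bit hash space across its subvolumes, and with patchy-client-2 down its range is missing, which gets counted as a hole. A simplified illustration of that counting (not the actual dht_layout_normalize() code; names and ranges here are made up):

/* simplified illustration of DHT layout hole counting; ranges are
 * assumed sorted by start, and a down subvolume contributes no range */
#include <stdio.h>
#include <stdint.h>

struct range {
        uint32_t start;
        uint32_t stop;   /* inclusive; start == stop == 0 means "no range" */
};

static int
count_holes (struct range *r, int n)
{
        uint32_t expected = 0;
        int      holes    = 0;
        int      i;

        for (i = 0; i < n; i++) {
                if (r[i].start == 0 && r[i].stop == 0)
                        continue;                /* subvol down / range unset */
                if (r[i].start != expected)
                        holes++;                 /* gap before this range */
                expected = r[i].stop + 1;
        }
        if (expected != 0)
                holes++;                         /* gap at the top of the hash space;
                                                    full coverage wraps expected to 0 */
        return holes;
}

int
main (void)
{
        /* three subvolumes; the third one (think patchy-client-2) is down,
         * so its range is missing from the layout */
        struct range layout[] = {
                { 0x00000000, 0x55555554 },
                { 0x55555555, 0xaaaaaaa9 },
                { 0x00000000, 0x00000000 },
        };

        printf ("Holes=%d\n", count_holes (layout, 3));   /* prints Holes=1 */
        return 0;
}
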
DHT team - are these anomalies expected here? I also see opendir and statfs failing.
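
As for statfs: the df that check_fs runs boils down to a statvfs() on the mount point, which the FUSE client turns into the STATFS fop above; with the root gfid unresolvable the whole df fails, and the test with it. A rough sketch of the failing call (an illustration of what df does, not the actual check_fs helper from the test framework):

/* rough sketch: df <mountpoint> ends up in a statvfs() call, which the
 * FUSE client handles as the STATFS fop seen failing in the log above */
#include <stdio.h>
#include <sys/statvfs.h>

int
main (int argc, char *argv[])
{
        struct statvfs buf;

        if (argc < 2) {
                fprintf (stderr, "usage: %s <mountpoint>\n", argv[0]);
                return 2;
        }

        if (statvfs (argv[1], &buf) != 0) {
                /* with the root gfid unresolvable this is where df fails,
                   typically with ESTALE or ENOTCONN */
                perror ("statvfs");
                return 1;
        }

        printf ("blocks total=%llu free=%llu\n",
                (unsigned long long) buf.f_blocks,
                (unsigned long long) buf.f_bfree);
        return 0;
}
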
On Wed, Oct 12, 2016 at 12:18 PM, Atin Mukherjee <amukherj@xxxxxxxxxx> wrote:
I will take a look at it in some time.

On Wed, Oct 12, 2016 at 12:08 PM, Oleksandr Natalenko <oleksandr@xxxxxxxxxxxxxx> wrote:

Hello.
Vijay asked me to drop a note about the spurious failure of the ./tests/bugs/glusterd/bug-913555.t test. Here are some examples:
* https://build.gluster.org/job/centos6-regression/1069/consoleFull
* https://build.gluster.org/job/centos6-regression/1076/consoleFull
Could someone take a look at it?
Also, the last two regression runs were broken because of this:
===
Slave went offline during the build
===
See these builds for details:
* https://build.gluster.org/job/centos6-regression/1077/consoleFull
* https://build.gluster.org/job/centos6-regression/1078/consoleFull
Was that intentional?
Thanks.
Regards,
Oleksandr
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel
--
Atin