Hi Gluster team & users,
We are seeing multiple instances of the following error on our Gluster clients: "remote operation failed [No such file or directory]". It affects files hosted on the volume that clients have open or memory-mapped.
We have been seeing this error since we added a third brick to a replica 2 Gluster volume a couple of days ago, making it a volume backed by three replicated bricks. Any information on this error would be useful. If needed, we can supply any of the client or brick logs.
12447146-[2016-10-21 14:50:07.806214] I [dict.c:473:dict_get] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/debug/io-stats.so(io_stats_lookup_cbk+0x148) [0x7f68a0cc5f68] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/system/posix-acl.so(posix_acl_lookup_cbk+0x284) [0x7f68a0aada94] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_get+0xac) [0x7f68a7f30dbc] ) 6-dict: !this || key=system.posix_acl_default [Invalid argument]
12447579-[2016-10-21 14:50:07.837879] I [dict.c:473:dict_get] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/debug/io-stats.so(io_stats_lookup_cbk+0x148) [0x7f68a0cc5f68] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/system/posix-acl.so(posix_acl_lookup_cbk+0x230) [0x7f68a0aada40] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_get+0xac) [0x7f68a7f30dbc] ) 6-dict: !this || key=system.posix_acl_access [Invalid argument]
12448011-[2016-10-21 14:50:07.837928] I [dict.c:473:dict_get] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/debug/io-stats.so(io_stats_lookup_cbk+0x148) [0x7f68a0cc5f68] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/system/posix-acl.so(posix_acl_lookup_cbk+0x284) [0x7f68a0aada94] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_get+0xac) [0x7f68a7f30dbc] ) 6-dict: !this || key=system.posix_acl_default [Invalid argument]
12448444:[2016-10-21 14:50:10.784317] W [MSGID: 114031] [client-rpc-fops.c:3057:client3_3_readv_cbk] 6-volume1-client-1: remote operation failed [No such file or directory]
12448608:[2016-10-21 14:50:10.784757] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 6-volume1-client-0: remote operation failed [No such file or directory]
12448772:[2016-10-21 14:50:10.784763] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 6-volume1-client-1: remote operation failed [No such file or directory]
12448936:[2016-10-21 14:50:10.785575] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 6-volume1-client-2: remote operation failed [No such file or directory]
12449100-[2016-10-21 14:50:10.786208] W [MSGID: 108008] [afr-read-txn.c:244:afr_read_txn] 6-volume1-replicate-0: Unreadable subvolume -1 found with event generation 3 for gfid 10495074-82d0-4961-8212-5a4f32895f37. (Possible split-brain)
12449328:[2016-10-21 14:50:10.787439] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 6-volume1-client-2: remote operation failed [No such file or directory]
12449492-[2016-10-21 14:50:10.788730] E [MSGID: 109040] [dht-helper.c:1190:dht_migration_complete_check_task] 6-volume1-dht: (null): failed to lookup the file on volume1-dht [Stale file handle]
12449677-[2016-10-21 14:50:10.788778] W [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 622070230: READ => -1 gfid=10495074-82d0-4961-8212-5a4f32895f37 fd=0x7f68951a75bc (Stale file handle)
12449864:The message "W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 6-volume1-client-1: remote operation failed [No such file or directory]" repeated 2 times between [2016-10-21 14:50:10.784763] and [2016-10-21 14:50:10.789213]
12450100:[2016-10-21 14:50:10.790080] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 6-volume1-client-2: remote operation failed [No such file or directory]
12450264:The message "W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 6-volume1-client-0: remote operation failed [No such file or directory]" repeated 3 times between [2016-10-21 14:50:10.784757] and [2016-10-21 14:50:10.791118]
12450500:[2016-10-21 14:50:10.791176] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 6-volume1-client-1: remote operation failed [No such file or directory]
12450664-[2016-10-21 14:50:10.793395] W [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 622070238: READ => -1 gfid=10495074-82d0-4961-8212-5a4f32895f37 fd=0x7f68951a75bc (Stale file handle)
12450851-[2016-10-21 14:50:11.036804] I [dict.c:473:dict_get] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/debug/io-stats.so(io_stats_lookup_cbk+0x148) [0x7f68a0cc5f68] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/system/posix-acl.so(posix_acl_lookup_cbk+0x230) [0x7f68a0aada40] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_get+0xac) [0x7f68a7f30dbc] ) 6-dict: !this || key=system.posix_acl_access [Invalid argument]
12451283-[2016-10-21 14:50:11.036889] I [dict.c:473:dict_get] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/debug/io-stats.so(io_stats_lookup_cbk+0x148) [0x7f68a0cc5f68] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.15/xlator/system/posix-acl.so(posix_acl_lookup_cbk+0x284) [0x7f68a0aada94] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_get+0xac) [0x7f68a7f30dbc] ) 6-dict: !this || key=system.posix_acl_default [Invalid argument]
12451716-The message "W [MSGID: 108008] [afr-read-txn.c:244:afr_read_txn] 6-volume1-replicate-0: Unreadable subvolume -1 found with event generation 3 for gfid 10495074-82d0-4961-8212-5a4f32895f37. (Possible split-brain)" repeated 3 times between [2016-10-21 14:50:10.786208] and [2016-10-21 14:50:11.223498]
12452016:[2016-10-21 14:50:11.223949] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 6-volume1-client-0: remote operation failed [No such file or directory]
12452180:The message "W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 6-volume1-client-2: remote operation failed [No such file or directory]" repeated 2 times between [2016-10-21 14:50:10.790080] and [2016-10-21 14:50:11.224945]
12452416-[2016-10-21 14:50:11.225264] W [MSGID: 108008] [afr-read-txn.c:244:afr_read_txn] 6-volume1-replicate-0: Unreadable subvolume -1 found with event generation 3 for gfid 10495074-82d0-4961-8212-5a4f32895f37. (Possible split-brain)
12452644:The message "W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 6-volume1-client-1: remote operation failed [No such file or directory]" repeated 2 times between [2016-10-21 14:50:10.791176] and [2016-10-21 14:50:11.225783]
12452880:[2016-10-21 14:50:11.226648] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 6-volume1-client-2: remote operation failed [No such file or directory]
12453044-[2016-10-21 14:50:11.228115] W [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 622070413: READ => -1 gfid=10495074-82d0-4961-8212-5a4f32895f37 fd=0x7f68951a75bc (Stale file handle)
12453231:The message "W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 6-volume1-client-0: remote operation failed [No such file or directory]" repeated 2 times between [2016-10-21 14:50:11.223949] and [2016-10-21 14:50:11.239505]
12453467:[2016-10-21 14:50:11.239646] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 6-volume1-client-1: remote operation failed [No such file or directory]
12453631-The message "W [MSGID: 108008] [afr-read-txn.c:244:afr_read_txn] 6-volume1-replicate-0: Unreadable subvolume -1 found with event generation 3 for gfid 10495074-82d0-4961-8212-5a4f32895f37. (Possible split-brain)" repeated 2 times between [2016-10-21 14:50:11.225264] and [2016-10-21 14:50:11.241102]
12453931:[2016-10-21 14:50:11.241441] W [MSGID: 114031] [client-rpc-fops.c:1572:client3_3_fstat_cbk] 6-volume1-client-0: remote operation failed [No such file or directory]
12454095-[2016-10-21 14:50:11.243704] W [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 622070416: READ => -1 gfid=10495074-82d0-4961-8212-5a4f32895f37 fd=0x7f68951a75bc (Stale file handle)
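To see at a glance which client subvolumes report the failure, the excerpt above can be tallied with a small shell sketch. This is only an illustration: the function name count_failures_by_client is our own, it assumes the log line layout shown above, and it skips the 'The message "..." repeated N times' summary lines rather than expanding them.

```shell
#!/bin/sh
# Sketch: count "remote operation failed" warnings per client subvolume.
# Reads a client log on stdin and prints "<count> <subvolume>" lines.
# Assumes the subvolume appears as a field like "6-volume1-client-1:".
count_failures_by_client() {
    awk '
        /remote operation failed/ && !/The message "/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /-client-[0-9]+:$/) {
                    name = $i
                    sub(/^[0-9]+-/, "", name)  # drop the graph-id prefix, e.g. "6-"
                    sub(/:$/, "", name)        # drop the trailing colon
                    count[name]++
                }
        }
        END { for (n in count) print count[n], n }
    ' | sort -k2
}
```

It would be run as count_failures_by_client < mnt-repos-volume1.log.1 on the client. In the excerpt above, all three of volume1-client-0, volume1-client-1 and volume1-client-2 show up.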
Below is the current volume status/configuration:
$ sudo gluster volume status
Status of volume: volume1
Gluster process                              TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick ip-172-25-2-91.us-west-1.compute.inte
rnal:/data/glusterfs/volume1/brick1/brick    49152     0          Y       26520
ernal:/data/glusterfs/volume1/brick1/brick   49152     0          Y       17782
ernal:/data/glusterfs/volume1/brick1/brick   49152     0          Y       7225
NFS Server on localhost                      2049      0          Y       7245
Self-heal Daemon on localhost                N/A       N/A        Y       7253
NFS Server on ip-172-25-2-206.us-west-1.com
pute.internal                                2049      0          Y       17436
Self-heal Daemon on ip-172-25-2-206.us-west
-1.compute.internal                          N/A       N/A        Y       17456
NFS Server on ip-172-25-2-91.us-west-1.comp
ute.internal                                 2049      0          Y       10576
Self-heal Daemon on ip-172-25-2-91.us-west-
1.compute.internal                           N/A       N/A        Y       10610
Task Status of Volume volume1
------------------------------------------------------------------------------
There are no active volume tasks
$ sudo gluster volume info
Volume Name: volume1
Type: Replicate
Volume ID: 3bcca83e-2be5-410c-9a23-b159f570ee7e
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: ip-172-25-2-91.us-west-1.compute.internal:/data/glusterfs/volume1/brick1/brick
Brick2: ip-172-25-2-206.us-west-1.compute.internal:/data/glusterfs/volume1/brick1/brick
Brick3: ip-172-25-33-75.us-west-1.compute.internal:/data/glusterfs/volume1/brick1/brick <-- brick added a couple of days back
Options Reconfigured:
cluster.quorum-type: fixed
cluster.quorum-count: 2
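In case anyone wants to check whether the affected file exists on each brick, here is a minimal sketch that builds the backend path for the gfid from the client log. It assumes the usual .glusterfs layout on the brick (first two hex characters of the gfid, then the next two, then the full gfid). The function name gfid_backend_path is our own, not a Gluster tool.

```shell
#!/bin/sh
# Sketch: print the expected .glusterfs backend path for a gfid on a brick.
# Assumed layout: <brick>/.glusterfs/<gfid[0:2]>/<gfid[2:4]>/<gfid>
gfid_backend_path() {
    brick=$1
    gfid=$2
    first=$(printf '%s' "$gfid" | cut -c1-2)    # first two hex characters
    second=$(printf '%s' "$gfid" | cut -c3-4)   # next two hex characters
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" "$first" "$second" "$gfid"
}

# Brick path and gfid taken from the volume info and client log above.
gfid_backend_path /data/glusterfs/volume1/brick1/brick \
    10495074-82d0-4961-8212-5a4f32895f37
```

The resulting path could then be checked with ls -l, or with getfattr -d -m . -e hex, on each of the three brick hosts to see whether the entry is present and which AFR xattrs it carries.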
From the client log (mnt-repos-volume1.log.1):
volume volume1-client-0
    type protocol/client
    option clnt-lk-version 1
    option volfile-checksum 0
    option volfile-key /volume1
    option client-version 3.7.15
    option process-uuid production-collab-8-18739-2016/10/04-20:46:19:350684-volume1-client-0-6-0
    option fops-version 1298437
    option ping-timeout 42
    option remote-host ip-172-25-2-91.us-west-1.compute.internal
    option remote-subvolume /data/glusterfs/volume1/brick1/brick
    option transport-type socket
    option send-gids true
end-volume

volume volume1-client-1
    type protocol/client
    option ping-timeout 42
    option remote-host ip-172-25-2-206.us-west-1.compute.internal
    option remote-subvolume /data/glusterfs/volume1/brick1/brick
    option transport-type socket
    option send-gids true
end-volume

volume volume1-client-2
    type protocol/client
    option ping-timeout 42
    option remote-host ip-172-25-33-75.us-west-1.compute.internal
    option remote-subvolume /data/glusterfs/volume1/brick1/brick
    option transport-type socket
    option send-gids true
end-volume

volume volume1-replicate-0
    type cluster/replicate
    option quorum-type fixed
    option quorum-count 2
    subvolumes volume1-client-0 volume1-client-1 volume1-client-2
end-volume

volume volume1-dht
    type cluster/distribute
    subvolumes volume1-replicate-0
end-volume

volume volume1-write-behind
    type performance/write-behind
    subvolumes volume1-dht
end-volume

volume volume1-read-ahead
    type performance/read-ahead
    subvolumes volume1-write-behind
end-volume

volume volume1-io-cache
    type performance/io-cache
    subvolumes volume1-read-ahead
end-volume

volume volume1-quick-read
    type performance/quick-read
    subvolumes volume1-io-cache
end-volume

volume volume1-open-behind
    type performance/open-behind
    subvolumes volume1-quick-read
end-volume

volume volume1-md-cache
    type performance/md-cache
    option cache-posix-acl true
    subvolumes volume1-open-behind
end-volume

volume volume1
    type debug/io-stats
    option log-level INFO
    option latency-measurement off
    option count-fop-hits off
    subvolumes volume1-md-cache
end-volume

volume posix-acl-autoload
    type system/posix-acl
    subvolumes volume1
end-volume

volume meta-autoload
    type meta
    subvolumes posix-acl-autoload
end-volume

+------------------------------------------------------------------------------+
Thanks
Rama