Hi,

Replying to myself with some more details: the servers are 64-bit
(x86_64) whereas the clients are 32-bit (ix86). It seems like this could
be the cause of the problem:

http://oss.sgi.com/archives/xfs/2009-07/msg00044.html

But if the glusterfs client doesn't know about the original inodes of the
files, then it should be possible to fix, right?
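
For what it's worth, a quick way to confirm which entries on the export
have inode numbers that no longer fit in 32 bits (assuming GNU find is
available on the servers, since -printf is a GNU extension) would be
something like:

# find /file/data/cust -maxdepth 1 -printf '%i %p\n' | awk '$1 > 4294967295'

(Use 2147483647 as the threshold instead if something along the path
treats the inode number as signed.) On our setup this should at least
list the directories that show up with huge numbers in the listing quoted
below, such as cust1 and cust9.
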
Matthias

Matthias Saou wrote:

> Hi,
> 
> (Note: I have access to the systems referenced in the initial post)
> 
> I think I've found the problem. It's the filesystem, XFS, which has
> been mounted with the "inode64" option, as it can no longer be mounted
> without it since it was grown to 39TB. I've just checked this:
> 
> # ls -1 -ai /file/data/cust | sort -n
> 
> And the last few lines are like this:
> 
> [...]
> 2148235729 cust2
> 2148236297 cust6
> 2148236751 cust5
> 2148236974 cust7
> 2148237729 cust3
> 2148239365 cust4
> 2156210172 cust8
> 61637541899 cust1
> 96636784146 cust9
> 
> Note that "cust1" here is the one where the problem was seen initially.
> I've just checked, and the "cust9" directory is affected in exactly the
> same way.
> 
> So it seems like the glusterfs build being used has problems with
> 64-bit inodes. Is this a known limitation? Is it easy to fix or work
> around?
> 
> Matthias
> 
> Roger Torrentsgenerós wrote:
> 
> > We have 2 servers, let's name them file01 and file02. They are synced
> > very frequently, so we can assume their contents are the same. Then
> > we have lots of clients, each of which has two glusterfs mounts, one
> > against each file server.
> > 
> > Before you ask, let me say the clients are in a production
> > environment, where I can't afford any downtime. To make the migration
> > from glusterfs v1.3 to glusterfs v2.0 as smooth as possible, I
> > recompiled the packages to run under the glusterfs2 name. The servers
> > are running two instances of the glusterfs daemon, and the old one
> > will be stopped once the migration is complete. So you'll see some
> > glusterfs2 names and build dates that may look unusual, but as you'll
> > also see, this has nothing to do with the matter at hand.
> > 
> > file01 server log:
> > 
> > ================================================================================
> > Version : glusterfs 2.0.1 built on May 26 2009 05:11:51
> > TLA Revision : 5c1d9108c1529a1155963cb1911f8870a674ab5b
> > Starting Time: 2009-07-14 18:07:12
> > Command line : /usr/sbin/glusterfsd2 -p /var/run/glusterfsd2.pid
> > PID : 6337
> > System name : Linux
> > Nodename : file01
> > Kernel Release : 2.6.18-128.1.14.el5
> > Hardware Identifier: x86_64
> > 
> > Given volfile:
> > +------------------------------------------------------------------------------+
> > 1: # The data store directory to serve
> > 2: volume filedata-ds
> > 3: type storage/posix
> > 4: option directory /file/data
> > 5: end-volume
> > 6:
> > 7: # Make the data store read-only
> > 8: volume filedata-readonly
> > 9: type testing/features/filter
> > 10: option read-only on
> > 11: subvolumes filedata-ds
> > 12: end-volume
> > 13:
> > 14: # Optimize
> > 15: volume filedata-iothreads
> > 16: type performance/io-threads
> > 17: option thread-count 64
> > 18: # option autoscaling on
> > 19: # option min-threads 16
> > 20: # option max-threads 256
> > 21: subvolumes filedata-readonly
> > 22: end-volume
> > 23:
> > 24: # Add readahead feature
> > 25: volume filedata
> > 26: type performance/read-ahead # cache per file = (page-count x page-size)
> > 27: # option page-size 256kB # 256KB is the default option ?
> > 28: # option page-count 8 # 16 is default option ?
> > 29: subvolumes filedata-iothreads
> > 30: end-volume
> > 31:
> > 32: # Main server section
> > 33: volume server
> > 34: type protocol/server
> > 35: option transport-type tcp
> > 36: option transport.socket.listen-port 6997
> > 37: subvolumes filedata
> > 38: option auth.addr.filedata.allow 192.168.128.* # streamers
> > 39: option verify-volfile-checksum off # don't have clients complain
> > 40: end-volume
> > 41:
> > 
> > +------------------------------------------------------------------------------+
> > [2009-07-14 18:07:12] N [glusterfsd.c:1152:main] glusterfs: Successfully
> > started
> > 
> > file02 server log:
> > 
> > ================================================================================
> > Version : glusterfs 2.0.1 built on May 26 2009 05:11:51
> > TLA Revision : 5c1d9108c1529a1155963cb1911f8870a674ab5b
> > Starting Time: 2009-06-28 08:42:13
> > Command line : /usr/sbin/glusterfsd2 -p /var/run/glusterfsd2.pid
> > PID : 5846
> > System name : Linux
> > Nodename : file02
> > Kernel Release : 2.6.18-92.1.10.el5
> > Hardware Identifier: x86_64
> > 
> > Given volfile:
> > +------------------------------------------------------------------------------+
> > 1: # The data store directory to serve
> > 2: volume filedata-ds
> > 3: type storage/posix
> > 4: option directory /file/data
> > 5: end-volume
> > 6:
> > 7: # Make the data store read-only
> > 8: volume filedata-readonly
> > 9: type testing/features/filter
> > 10: option read-only on
> > 11: subvolumes filedata-ds
> > 12: end-volume
> > 13:
> > 14: # Optimize
> > 15: volume filedata-iothreads
> > 16: type performance/io-threads
> > 17: option thread-count 64
> > 18: # option autoscaling on
> > 19: # option min-threads 16
> > 20: # option max-threads 256
> > 21: subvolumes filedata-readonly
> > 22: end-volume
> > 23:
> > 24: # Add readahead feature
> > 25: volume filedata
> > 26: type performance/read-ahead # cache per file = (page-count x page-size)
> > 27: # option page-size 256kB # 256KB is the default option ?
> > 28: # option page-count 8 # 16 is default option ?
> > 29: subvolumes filedata-iothreads
> > 30: end-volume
> > 31:
> > 32: # Main server section
> > 33: volume server
> > 34: type protocol/server
> > 35: option transport-type tcp
> > 36: option transport.socket.listen-port 6997
> > 37: subvolumes filedata
> > 38: option auth.addr.filedata.allow 192.168.128.* # streamers
> > 39: option verify-volfile-checksum off # don't have clients complain
> > 40: end-volume
> > 41:
> > 
> > +------------------------------------------------------------------------------+
> > [2009-06-28 08:42:13] N [glusterfsd.c:1152:main] glusterfs: Successfully
> > started
> > 
> > Now let's pick a random client, for example streamer013, and see its
> > log:
> > 
> > ================================================================================
> > Version : glusterfs 2.0.1 built on May 26 2009 05:23:52
> > TLA Revision : 5c1d9108c1529a1155963cb1911f8870a674ab5b
> > Starting Time: 2009-07-22 18:34:31
> > Command line : /usr/sbin/glusterfs2 --log-level=NORMAL
> > --volfile-server=file02.priv --volfile-server-port=6997 /mnt/file02
> > PID : 14519
> > System name : Linux
> > Nodename : streamer013
> > Kernel Release : 2.6.18-92.1.10.el5PAE
> > Hardware Identifier: i686
> > 
> > Given volfile:
> > +------------------------------------------------------------------------------+
> > 1: # filedata
> > 2: volume filedata
> > 3: type protocol/client
> > 4: option transport-type tcp
> > 5: option remote-host file02.priv
> > 6: option remote-port 6997 # use non default to run in parallel
> > 7: option remote-subvolume filedata
> > 8: end-volume
> > 9:
> > 10: # Add readahead feature
> > 11: volume readahead
> > 12: type performance/read-ahead # cache per file = (page-count x page-size)
> > 13: # option page-size 256kB # 256KB is the default option ?
> > 14: # option page-count 2 # 16 is default option ?
> > 15: subvolumes filedata
> > 16: end-volume
> > 17:
> > 18: # Add threads
> > 19: volume iothreads
> > 20: type performance/io-threads
> > 21: option thread-count 8
> > 22: # option autoscaling on
> > 23: # option min-threads 16
> > 24: # option max-threads 256
> > 25: subvolumes readahead
> > 26: end-volume
> > 27:
> > 28: # Add IO-Cache feature
> > 29: volume iocache
> > 30: type performance/io-cache
> > 31: option cache-size 64MB # default is 32MB (in 1.3)
> > 32: option page-size 256KB # 128KB is default option (in 1.3)
> > 33: subvolumes iothreads
> > 34: end-volume
> > 35:
> > 
> > +------------------------------------------------------------------------------+
> > [2009-07-22 18:34:31] N [glusterfsd.c:1152:main] glusterfs: Successfully
> > started
> > [2009-07-22 18:34:31] N [client-protocol.c:5557:client_setvolume_cbk]
> > filedata: Connected to 192.168.128.232:6997, attached to remote volume
> > 'filedata'.
> > [2009-07-22 18:34:31] N [client-protocol.c:5557:client_setvolume_cbk]
> > filedata: Connected to 192.168.128.232:6997, attached to remote volume
> > 'filedata'.
> > 
> > The mounts seem OK:
> > 
> > [root@streamer013 /]# mount
> > /dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
> > proc on /proc type proc (rw)
> > sysfs on /sys type sysfs (rw)
> > devpts on /dev/pts type devpts (rw,gid=5,mode=620)
> > /dev/sda1 on /boot type ext3 (rw)
> > tmpfs on /dev/shm type tmpfs (rw)
> > none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
> > glusterfs#file01.priv on /mnt/file01 type fuse
> > (rw,max_read=131072,allow_other,default_permissions)
> > glusterfs#file02.priv on /mnt/file02 type fuse
> > (rw,max_read=131072,allow_other,default_permissions)
> > 
> > They work:
> > 
> > [root@streamer013 /]# ls /mnt/file01/
> > cust
> > [root@streamer013 /]# ls /mnt/file02/
> > cust
> > 
> > And they are seen by both servers:
> > 
> > file01:
> > 
> > [2009-07-22 18:34:19] N [server-helpers.c:723:server_connection_destroy]
> > server: destroyed connection of streamer013.
> > p4.bt.bcn.flumotion.net-14335-2009/07/22-18:34:13:210609-filedata
> > [2009-07-22 18:34:31] N [server-protocol.c:7796:notify] server:
> > 192.168.128.213:1017 disconnected
> > [2009-07-22 18:34:31] N [server-protocol.c:7796:notify] server:
> > 192.168.128.213:1018 disconnected
> > [2009-07-22 18:34:31] N [server-protocol.c:7035:mop_setvolume] server:
> > accepted client from 192.168.128.213:1017
> > [2009-07-22 18:34:31] N [server-protocol.c:7035:mop_setvolume] server:
> > accepted client from 192.168.128.213:1018
> > 
> > file02:
> > 
> > [2009-07-22 18:34:20] N [server-helpers.c:723:server_connection_destroy]
> > server: destroyed connection of streamer013.
> > p4.bt.bcn.flumotion.net-14379-2009/07/22-18:34:13:267495-filedata
> > [2009-07-22 18:34:31] N [server-protocol.c:7796:notify] server:
> > 192.168.128.213:1014 disconnected
> > [2009-07-22 18:34:31] N [server-protocol.c:7796:notify] server:
> > 192.168.128.213:1015 disconnected
> > [2009-07-22 18:34:31] N [server-protocol.c:7035:mop_setvolume] server:
> > accepted client from 192.168.128.213:1015
> > [2009-07-22 18:34:31] N [server-protocol.c:7035:mop_setvolume] server:
> > accepted client from 192.168.128.213:1014
> > 
> > Now let's get to the funny part. First, a listing of a particular
> > directory, taken locally on both servers:
> > 
> > [root@file01 ~]# ls /file/data/cust/cust1
> > configs files outgoing reports
> > 
> > [root@file02 ~]# ls /file/data/cust/cust1
> > configs files outgoing reports
> > 
> > Now let's try to see the same from the client side:
> > 
> > [root@streamer013 /]# ls /mnt/file01/cust/cust1
> > ls: /mnt/file01/cust/cust1: No such file or directory
> > [root@streamer013 /]# ls /mnt/file02/cust/cust1
> > configs files outgoing reports
> > 
> > Oops :( And the client log says:
> > 
> > [2009-07-22 18:41:22] W [fuse-bridge.c:1651:fuse_opendir]
> > glusterfs-fuse: 64: OPENDIR (null) (fuse_loc_fill() failed)
> > 
> > While neither of the server logs says anything.
> > 
> > So the files really exist on the servers, but the same client can see
> > them through one of the filers and not through the other, although both
> > are running exactly the same software.
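
If the 64-bit inode theory above holds, this asymmetry may simply come
down to which inode number each server happens to have for that
directory, since the two copies are presumably created independently by
the sync. A quick check on both servers (assuming GNU coreutils stat)
would be something like:

[root@file01 ~]# stat -c '%i %n' /file/data/cust/cust1
[root@file02 ~]# stat -c '%i %n' /file/data/cust/cust1

If file01 reports a number that doesn't fit in 32 bits and file02 reports
one that does, that would match what the client is seeing.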
> > 
> > But there's more. It seems it only happens for certain directories (I
> > can't show you the contents for privacy reasons, but I guess you'll
> > figure it out):
> > 
> > [root@streamer013 /]# ls /mnt/file01/cust/|wc -l
> > 95
> > [root@streamer013 /]# ls /mnt/file02/cust/|wc -l
> > 95
> > [root@streamer013 /]# for i in `ls /mnt/file01/cust/`; do
> > ls /mnt/file01/cust/$i; done|grep such
> > ls: /mnt/file01/cust/cust1: No such file or directory
> > ls: /mnt/file01/cust/cust2: No such file or directory
> > [root@streamer013 /]# for i in `ls /mnt/file02/cust/`; do
> > ls /mnt/file02/cust/$i; done|grep such
> > [root@streamer013 /]#
> > 
> > And of course, our client logs the error twice:
> > 
> > [2009-07-22 18:49:21] W [fuse-bridge.c:1651:fuse_opendir]
> > glusterfs-fuse: 2119: OPENDIR (null) (fuse_loc_fill() failed)
> > [2009-07-22 18:49:21] W [fuse-bridge.c:1651:fuse_opendir]
> > glusterfs-fuse: 2376: OPENDIR (null) (fuse_loc_fill() failed)
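
This is only a guess at the mechanism, but if the 64-bit inode number
gets cut down to 32 bits somewhere on the ix86 clients, the lookup would
be for a completely different inode than the one the server answered
with, which would explain entries simply "disappearing" like this. Just
to illustrate what a 32-bit truncation does to the numbers from my
listing above, with plain shell arithmetic:

$ echo $(( 61637541899 & 0xFFFFFFFF ))   # cust1
1507999755
$ echo $(( 96636784146 & 0xFFFFFFFF ))   # cust9
2147503634

Whether glusterfs or FUSE actually truncates this way is something the
developers will have to confirm.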
> > 
> > I hope I've been clear enough this time. If you need more data, just
> > let me know and I'll see what I can do.
> > 
> > And thanks again for your help.
> > 
> > Roger
> > 
> > 
> > On Wed, 2009-07-22 at 09:10 -0700, Anand Avati wrote:
> > > > I have been witnessing some strange behaviour with my GlusterFS
> > > > system. The fact is that some files exist and are completely
> > > > accessible on the server, yet they can't be accessed from a
> > > > client, while other files can.
> > > > 
> > > > To be sure, I copied the same files to another directory and I was
> > > > still unable to see them from the client. To rule out any kind of
> > > > file permission, SELinux or similar issue, I created a copy from a
> > > > working directory, and it still wasn't seen from the client. All I
> > > > get is:
> > > > 
> > > > ls: .: No such file or directory
> > > > 
> > > > And the client log says:
> > > > 
> > > > [2009-07-22 14:04:18] W [fuse-bridge.c:1651:fuse_opendir]
> > > > glusterfs-fuse: 104778: OPENDIR (null) (fuse_loc_fill() failed)
> > > > 
> > > > While the server log says nothing.
> > > > 
> > > > The funniest thing is that the same client has another GlusterFS
> > > > mount to another server, which has exactly the same contents as
> > > > the first one, and that mount does work.
> > > > 
> > > > Some data:
> > > > 
> > > > [root@streamer001 /]# ls /mnt/file01/cust/cust1/
> > > > ls: /mnt/file01/cust/cust1/: No such file or directory
> > > > 
> > > > [root@streamer001 /]# ls /mnt/file02/cust/cust1/
> > > > configs files outgoing reports
> > > > 
> > > > [root@streamer001 /]# mount
> > > > /dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
> > > > proc on /proc type proc (rw)
> > > > sysfs on /sys type sysfs (rw)
> > > > devpts on /dev/pts type devpts (rw,gid=5,mode=620)
> > > > /dev/sda1 on /boot type ext3 (rw)
> > > > tmpfs on /dev/shm type tmpfs (rw)
> > > > none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
> > > > sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
> > > > glusterfs#file01.priv on /mnt/file01 type fuse
> > > > (rw,max_read=131072,allow_other,default_permissions)
> > > > glusterfs#file02.priv on /mnt/file02 type fuse
> > > > (rw,max_read=131072,allow_other,default_permissions)
> > > > 
> > > > [root@file01 /]# ls /file/data/cust/cust1
> > > > configs files outgoing reports
> > > > 
> > > > [root@file02 /]# ls /file/data/cust/cust1
> > > > configs files outgoing reports
> > > > 
> > > > Any ideas?
> > > Can you please post all your client and server logs and volfiles? Are
> > > you quite certain that this is not a result of some misconfiguration?
> > > 
> > > Avati
> > > 
> > 
> > _______________________________________________
> > Gluster-devel mailing list
> > Gluster-devel@xxxxxxxxxx
> > http://lists.nongnu.org/mailman/listinfo/gluster-devel
> > 

-- 
Clean custom Red Hat Linux rpm packages : http://freshrpms.net/
Fedora release 10 (Cambridge) - Linux kernel 2.6.27.25-170.2.72.fc10.x86_64
Load : 0.50 3.32 2.58