So… tried the nfs.addr-namelookup option, same result. Maybe a slightly better response time, although that is subjective. On the Windows side I use the NFS client, not CIFS! When I try connecting via the IP of the server, I get the same result… I also looked a bit through the gluster config files under /var, and while I'm definitely no expert, it all kind of looks correct to me. What's going on?? So weird.
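For what it's worth, the mount by IP from the Windows NFS client looks roughly like this (drive letter and options are only an example, not necessarily exactly what I used):

mount -o anon \\192.168.151.21\caviar_data11 Z: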
From: Carlos Capriotti [mailto:capriotti.carlos@xxxxxxxxx]

I have a bad feeling about this. It sounds like you may have two problems: name resolution, and some corrupt or conflicting info in gluster's config files, the ones under /var. I have NO idea how gluster's NFS server does name resolution, but it is clear that it is wrong and not working right. There is an option that disables NFS name resolution on volumes, regarding peers, which I use to try to speed up NFS access:

nfs.addr-namelookup off

You might want to give this a try.
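The option is set per volume with the gluster CLI, roughly like this (using your caviar_data11 volume as the example):

gluster volume set caviar_data11 nfs.addr-namelookup off
gluster volume info caviar_data11    # the option should show up under "Options Reconfigured"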
Now, it does NOT help with SMB name resolution. If you try, from Windows or even Linux, to access the SMB share via IP, does it get the right volume?

On Thu, Apr 3, 2014 at 3:13 PM, VAN CAUSBROECK Wannes <Wannes.VANCAUSBROECK@xxxxxxxxxxxxxx> wrote:

Now it gets even better: if I try to open \\lpr-nas01\caviar_data11 on Windows, it automatically opens caviar_data1… which is a completely different volume! The same goes for caviar_data12. Those are the two volumes I deleted and recreated on another disk. Again, mounting as a gluster filesystem works fine, and when mounting on Linux as NFS it behaves strangely. What's also weird with the NFS mount on Linux: if I do an ls, it works fine, but the more directories I 'ls', or when I do a lot of recursive 'ls's, the listing of the directory contents slows down and even freezes all the time. Could this be related to some kind of timeout or buffer on the server side that fills up?

From: Carlos Capriotti [mailto:capriotti.carlos@xxxxxxxxx]
Wannes: It is funny the way life keeps "playing" with us. I used to live in Belgium until about 6 months ago, working for Kodak. Anyway, back to your problem: I think you have already destroyed and re-created those volumes, if memory serves me well, but I could be wrong. The fact is that gluster creates several configuration files under /var (and I hate their guts for this), and they are somewhat complex. In the past I had to change the IP of gluster nodes on multihomed servers, and I was not able to make gluster work anymore. Because it was a test structure I just scrapped the entire volume and started over. I would try to (re)create the volume if possible. If not, use the same "physical" mountpoint, create a new folder below it, and create a new volume with that, but this time use the IP addresses of the nodes instead of the names.
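Something along these lines (the brick path and the new volume name below are just placeholders for the example):

mkdir -p /export/brick1/caviar_data11_new
gluster volume create caviar_data11_new 192.168.151.21:/export/brick1/caviar_data11_new
gluster volume start caviar_data11_new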
I am a bit paranoid about name resolution, so I tend to hard-code IPs everywhere. Your case DOES ring a bell. You might want to search the list from the last couple of weeks; no further back than six weeks, which is more or less the time I have been part of it. Now, if you get the chance to update your OS to 6.5, that might be beneficial as well. Cheers.

On Thu, Apr 3, 2014 at 1:22 PM, VAN CAUSBROECK Wannes <Wannes.VANCAUSBROECK@xxxxxxxxxxxxxx> wrote:

Hi Carlos,

Belgian government, indeed :) (pension fund, to be more precise). Did:

mount -t nfs -o mountproto=tcp,vers=3 localhost:/caviar_data11 /media

When I do an ls of the directory, I still get the same weird kind of directory listing. In the logs:

[2014-04-03 11:16:58.005527] E [nfs3-helpers.c:3595:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup failed: <gfid:7553c77d-884b-4e28-a3ae-330b3a24b055>: Invalid argument
[2014-04-03 11:16:58.005559] E [acl3.c:334:acl3_getacl_resume] 0-nfs-ACL: Unable to resolve FH: (127.0.0.1:762) caviar_data11 : 7553c77d-884b-4e28-a3ae-330b3a24b055
[2014-04-03 11:16:58.005577] E [acl3.c:342:acl3_getacl_resume] 0-nfs-ACL: unable to open_and_resume
[2014-04-03 11:16:58.005814] E [dht-helper.c:429:dht_subvol_get_hashed] (-->/usr/lib64/glusterfs/3.4.2/xlator/debug/io-stats.so(io_stats_lookup+0x157) [0x7f4465bf52e7] (-->/usr/lib64/libglusterfs.so.0(default_lookup+0x6d) [0x3dfe01c03d] (-->/usr/lib64/glusterfs/3.4.2/xlator/cluster/distribute.so(dht_lookup+0xa7e) [0x7f4466037f2e]))) 0-caviar_data11-dht: invalid argument: loc->parent
[2014-04-03 11:16:58.006087] W [client-rpc-fops.c:2624:client3_3_lookup_cbk] 0-caviar_data11-client-0: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000> (00000000-0000-0000-0000-000000000000)
[2014-04-03 11:16:58.006145] E [acl3.c:334:acl3_getacl_resume] 0-nfs-ACL: Unable to resolve FH: (127.0.0.1:762) caviar_data11 : 00000000-0000-0000-0000-000000000000
[2014-04-03 11:16:58.006158] E [acl3.c:342:acl3_getacl_resume] 0-nfs-ACL: unable to open_and_resume

I'm searching some bug reports as well at the same time, but this is a bit over my head :D

From: Carlos Capriotti [mailto:capriotti.carlos@xxxxxxxxx]
Geez. Belgian Government! ;) OK. How about mounting the NFS share as localhost? I know it looks like your name resolution IS working, but the logs say otherwise; that is why I am insisting. Also, adding server names/IPs to your hosts file won't hurt for a test, but start simple: mount the NFS share as localhost and let's see how it behaves.

On Thu, Apr 3, 2014 at 12:21 PM, VAN CAUSBROECK Wannes <Wannes.VANCAUSBROECK@xxxxxxxxxxxxxx> wrote:

Hi Carlos, I've got this:

[root@lpr-nas01 ~]# cat /etc/hosts
127.0.0.1 localhost

[root@lpr-nas01 ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:50:56:BD:35:75
          inet addr:192.168.151.21  Bcast:192.168.151.255  Mask:255.255.255.0
          inet6 addr: fe80::250:56ff:febd:3575/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:902244243 errors:0 dropped:0 overruns:0 frame:0
          TX packets:892724740 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:940529143125 (875.9 GiB)  TX bytes:845739949663 (787.6 GiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:54121225 errors:0 dropped:0 overruns:0 frame:0
          TX packets:54121225 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:530181912508 (493.7 GiB)  TX bytes:530181912508 (493.7 GiB)

[root@lpr-nas01 ~]# ping lpr-nas01
PING lpr-nas01.onprvp.fgov.be (192.168.151.21) 56(84) bytes of data.
64 bytes from lpr-nas01.onprvp.fgov.be (192.168.151.21): icmp_seq=1 ttl=64 time=0.023 ms

[root@lpr-nas01 ~]# nslookup 192.168.151.21
Server:    192.168.147.31
Address:   192.168.147.31#53
21.151.168.192.in-addr.arpa   name = lpr-nas01.onprvp.fgov.be.

Regards,
Wannes

From: Carlos Capriotti [mailto:capriotti.carlos@xxxxxxxxx]
Also, if you can post the contents of your hosts file and the output of ifconfig, that would be nice. It sounds like you are facing a bad-ass name resolution issue: nodes cannot find each other.

On Thu, Apr 3, 2014 at 11:05 AM, VAN CAUSBROECK Wannes <Wannes.VANCAUSBROECK@xxxxxxxxxxxxxx> wrote:

Hello Carlos,

I created a new disk, formatted xfs with an inode size of 512.
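(The format command was along these lines; the device name here is just a placeholder:)

mkfs.xfs -i size=512 /dev/sdb1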
I created a new gluster volume and migrated the data. Again, when I mount it as a gluster volume, everything works fine. On NFS (mounted on the same server that's running the gluster volume) I get the following weirdness:

[root@lpr-nas01 /]# mount -t nfs -o mountproto=tcp,vers=3 lpr-nas01:/caviar_data11 /media
[root@lpr-nas01 /]# ll /media/*/
….
/media/2012/201206:
ls: /media/2012/201206/20120621: No such file or directory
total 0
drwxrwsr-x 3 960       1003   15 Jun 15 2011 20120621
….
drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6 2011 38
drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6 2011 39
drwxrwsr-x 2 nfsnobody 1003  138 Jun  6 2011 40
drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6 2011 42
drwxrwsr-x 2 nfsnobody 1003  171 Jun  6 2011 43
drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6 2011 45
drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6 2011 46
drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6 2011 47
drwxrwsr-x 2 nfsnobody 1003  369 Jun  6 2011 48
…
ls: cannot access /media/2011/201106/20110606/81: Invalid argument
ls: cannot access /media/2011/201106/20110606/55: Invalid argument
ls: cannot access /media/2011/201106/20110606/30: Invalid argument
ls: cannot access /media/2011/201106/20110606/90: Invalid argument
total 12
d????????? ? ? ? ? ? 00
d????????? ? ? ? ? ? 01
d????????? ? ? ? ? ? 02
d????????? ? ? ? ? ? 03
d????????? ? ? ? ? ? 04
d????????? ? ? ? ? ? 05
d????????? ? ? ? ? ? 06
d????????? ? ? ? ? ? 07
d????????? ? ? ? ? ? 08
d????????? ? ? ? ? ? 09
d????????? ? ? ? ? ? 10
….

In the nfs log, I get the following errors:
…
[2014-04-03 08:50:13.624517] E [nfs3-helpers.c:3595:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup failed: <gfid:a1acf77c-2b81-4b5f-a113-521c6ab8fd23>: Invalid argument
[2014-04-03 08:50:13.624547] E [acl3.c:334:acl3_getacl_resume] 0-nfs-ACL: Unable to resolve FH: (192.168.151.21:954) caviar_data11 : a1acf77c-2b81-4b5f-a113-521c6ab8fd23
[2014-04-03 08:50:13.624562] E [acl3.c:342:acl3_getacl_resume] 0-nfs-ACL: unable to open_and_resume
[2014-04-03 08:50:13.624914] E [nfs3-helpers.c:3595:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup failed: <gfid:980a29f7-29f4-4c88-9896-2b1a549370e2>: Invalid argument
[2014-04-03 08:50:13.624960] E [acl3.c:334:acl3_getacl_resume] 0-nfs-ACL: Unable to resolve FH: (192.168.151.21:954) caviar_data11 : 980a29f7-29f4-4c88-9896-2b1a549370e2
[2014-04-03 08:50:13.624970] E [acl3.c:342:acl3_getacl_resume] 0-nfs-ACL: unable to open_and_resume
[2014-04-03 08:50:13.625290] E [nfs3-helpers.c:3595:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup failed: <gfid:b67993d2-c647-4493-aa78-64033614dc33>: Invalid argument
[2014-04-03 08:50:13.625322] E [nfs3.c:755:nfs3_getattr_resume] 0-nfs-nfsv3: Unable to resolve FH: (192.168.151.21:954) caviar_data11 : b67993d2-c647-4493-aa78-64033614dc33
[2014-04-03 08:50:13.625335] W [nfs3-helpers.c:3380:nfs3_log_common_res] 0-nfs-nfsv3: XID: c838d715, GETATTR: NFS: 22(Invalid argument for operation), POSIX: 14(Bad address)
[2014-04-03 08:50:13.626268] I [dht-layout.c:638:dht_layout_normalize] 0-caviar_data11-dht: found anomalies in /2011/201108/20110802. holes=1 overlaps=0
[2014-04-03 08:50:13.627701] I [dht-layout.c:638:dht_layout_normalize] 0-caviar_data11-dht: found anomalies in /2011/201108/20110803. holes=1 overlaps=0
[2014-04-03 08:50:13.628839] E [nfs3-helpers.c:3595:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup failed: <gfid:bb587d95-ffa3-42f3-9b27-b1cf0c5c05cb>: Invalid argument
….
[2014-04-03 08:50:13.706106] W [nfs3-helpers.c:3380:nfs3_log_common_res] 0-nfs-nfsv3: XID: 6639d715, GETATTR: NFS: 22(Invalid argument for operation), POSIX: 14(Bad address)
[2014-04-03 08:50:13.706585] E [nfs3-helpers.c:3595:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup failed: <gfid:897bf28d-41a3-4cc2-a120-8374656be858>: Invalid argument
[2014-04-03 08:50:13.706626] E [nfs3.c:755:nfs3_getattr_resume] 0-nfs-nfsv3: Unable to resolve FH: (192.168.151.21:954) caviar_data11 : 897bf28d-41a3-4cc2-a120-8374656be858
[2014-04-03 08:50:13.706647] W [nfs3-helpers.c:3380:nfs3_log_common_res] 0-nfs-nfsv3: XID: 6739d715, GETATTR: NFS: 22(Invalid argument for operation), POSIX: 14(Bad address)
[2014-04-03 08:50:13.707113] E [nfs3-helpers.c:3595:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup failed: <gfid:ded9c3fe-171b-4104-bc0e-e64c0d3e18e2>: Invalid argument
[2014-04-03 08:50:13.707158] E [nfs3.c:755:nfs3_getattr_resume] 0-nfs-nfsv3: Unable to resolve FH: (192.168.151.21:954) caviar_data11 : ded9c3fe-171b-4104-bc0e-e64c0d3e18e2
[2014-04-03 08:50:13.707174] W [nfs3-helpers.c:3380:nfs3_log_common_res] 0-nfs-nfsv3: XID: 6839d715, GETATTR: NFS: 22(Invalid argument for operation), POSIX: 14(Bad address)
[2014-04-03 08:50:13.707597] E [nfs3-helpers.c:3595:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup failed: <gfid:c5ebeb07-ee2e-4161-9e65-65d65f614628>: Invalid argument
[2014-04-03 08:50:13.707644] E [nfs3.c:755:nfs3_getattr_resume] 0-nfs-nfsv3: Unable to resolve FH: (192.168.151.21:954) caviar_data11 : c5ebeb07-ee2e-4161-9e65-65d65f614628
[2014-04-03 08:50:13.707707] W [nfs3-helpers.c:3380:nfs3_log_common_res] 0-nfs-nfsv3: XID: 6939d715, GETATTR: NFS: 22(Invalid argument for operation), POSIX: 14(Bad address)
[2014-04-03 08:50:13.708357] E [nfs3-helpers.c:3595:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup failed: <gfid:da363aa4-6d43-47c0-90fe-edead3689064>: Invalid argument
[2014-04-03 08:50:13.708407] E [nfs3.c:755:nfs3_getattr_resume] 0-nfs-nfsv3: Unable to resolve FH: (192.168.151.21:954) caviar_data11 : da363aa4-6d43-47c0-90fe-edead3689064
[2014-04-03 08:50:13.708432] W [nfs3-helpers.c:3380:nfs3_log_common_res] 0-nfs-nfsv3: XID: 6a39d715, GETATTR: NFS: 22(Invalid argument for operation), POSIX: 14(Bad address)
[2014-04-03 08:50:13.708796] E [dht-helper.c:429:dht_subvol_get_hashed] (-->/usr/lib64/glusterfs/3.4.2/xlator/debug/io-stats.so(io_stats_lookup+0x157) [0x7f4465bf52e7] (-->/usr/lib64/libglusterfs.so.0(default_lookup+0x6d) [0x3dfe01c03d] (-->/usr/lib64/glusterfs/3.4.2/xlator/cluster/distribute.so(dht_lookup+0xa7e) [0x7f4466037f2e]))) 0-caviar_data11-dht: invalid argument: loc->parent
[2014-04-03 08:50:13.709066] W [client-rpc-fops.c:2624:client3_3_lookup_cbk] 0-caviar_data11-client-0: remote operation failed: Invalid argument. Path: <gfid:00000000-0000-0000-0000-000000000000> (00000000-0000-0000-0000-000000000000)
[2014-04-03 08:50:13.709128] E [acl3.c:334:acl3_getacl_resume] 0-nfs-ACL: Unable to resolve FH: (192.168.151.21:954) caviar_data11 : 00000000-0000-0000-0000-000000000000
[2014-04-03 08:50:13.709159] E [acl3.c:342:acl3_getacl_resume] 0-nfs-ACL: unable to open_and_resume
[2014-04-03 08:50:13.709741] E [nfs3-helpers.c:3595:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup failed: <gfid:db0eec80-e122-4751-9695-d903a6e6f29e>: Invalid argument
….

So….. any ideas?

From: Carlos Capriotti [mailto:capriotti.carlos@xxxxxxxxx]
Sent: Monday, 31 March 2014 18:03
Maybe it would be nice to see your volume info for the affected volumes. Also, on the server side, what happens if you mount the share using glusterfs instead of NFS? Any chance the native NFS server is running on your server? Are there any auto-heal processes running? There are a few name resolution messages in your logs that seem to refer to the nodes themselves. Any DNS conflicts? Maybe add the names of the servers to the hosts file? Your MS client seems to be having issues with user/group translation; it seems to create files with gid 1003 (I could be wrong). Again, are SELinux/ACLs/iptables disabled? All is very inconclusive so far.
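Off the top of my head, something like the following would answer most of those questions (volume name just as an example; the iptables/SELinux commands assume a stock RHEL/CentOS 6 box):

gluster volume info caviar_data11     # volume layout and any reconfigured options
rpcinfo -p | grep nfs                 # is a kernel/native NFS server also registered?
ps aux | grep glustershd              # any self-heal daemon / auto-heal processes running?
getenforce                            # SELinux mode
service iptables status               # firewall state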
On Mon, Mar 31, 2014 at 5:26 PM, VAN CAUSBROECK Wannes <Wannes.VANCAUSBROECK@xxxxxxxxxxxxxx> wrote:

Well, with 'client' I do actually mean the server itself. I've tried forcing Linux and Windows to NFS v3 and TCP, and on Windows I played around with the UID and GID, but the result is always the same.