Shehjar, I've turned on debug logging for the GlusterFS server and client and appended the logs below. I've replaced the IP addresses with "x.x.x" and left only the last octet, for security. It appears I'm able to lock the file when I run the program directly on the gluster mount point, just not when it's mounted via NFS. I checked, and rpc.statd is running.

One odd thing: when I run the perl locking program directly on the mount point, it appears to work but spits out the following log message:

[2010-05-03 09:16:39] D [read-ahead.c:468:ra_readv] readahead: unexpected offset (4096 != 362) resetting

You can see where I tried it several times in the log file. Running the locking program on the NFS mount doesn't produce any extra error messages; it just fails (there is a slightly more verbose variant of the test script after the console output below that I can run next). I tested this by mounting my server's NFS share on the same server (gluster02). Gluster01 (which is x.x.x.17) had the client and server running at the time of testing; I also tried this without it running. I'm able to lock files on normal NFS shares, so this may be some kind of issue with FUSE and how it interacts with NFS. The version of FUSE I'm running was compiled/installed from
http://ftp.gluster.com/pub/gluster/glusterfs/fuse/fuse-2.7.4glfs11.tar.gz

Also, since my last post I've upgraded Ubuntu to Lucid 10.04, to take advantage of the latest kernel in case there were fixes there.

[console from gluster02, the NFS server]
root at gluster02:/var/log/glusterfs# rpc.statd
root at gluster02:/var/log/glusterfs# rpc.statd -V
rpc.statd version 1.1.6
root at gluster02:/var/lib/nfs# cd /mnt/glusterfs/
root at gluster02:/mnt/glusterfs# ./locktest.pl
Opened file lockfile
Got shared lock on file lockfile
Closed file lockfile
root at gluster02:/mnt/glusterfs#   (IPs removed for security)
root at gluster02:/mnt/glusterfs# exportfs -vfa
exporting x.x.x.0/25:/mnt/test
exporting x.x.x.0/25:/mnt/glusterfs
exporting x.x.x.0/25:/exports/glusterfs
root at gluster02:/mnt/glusterfs# mount x.x.x.16:/mnt/glusterfs /mnt/test
root at gluster02:/mnt/glusterfs# cd /mnt/test
root at gluster02:/mnt/test# ./locktest.pl
Opened file lockfile
Can't get shared lock on lockfile: No locks available
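For the next round of testing, here is the chattier variant of the lock test mentioned above. It is only a sketch based on locktest.pl (the file name and the extra output wording are mine, and I have not run this version yet): it tries a non-blocking shared flock() first and prints the numeric errno, so that ENOLCK (a lock manager / NLM problem) can be told apart from EWOULDBLOCK (a genuine conflicting lock).

#!/usr/bin/perl
# locktest2.pl -- sketch of a more verbose locktest.pl (name made up, not run yet).
# Tries a non-blocking shared lock first and reports the errno, so that
# ENOLCK (lockd/NLM trouble) can be told apart from EWOULDBLOCK (a real
# conflicting lock held by another process).
use strict;
use warnings;
use Fcntl qw(:flock);
use Errno qw(ENOLCK EWOULDBLOCK);

my $lock_file = 'lockfile';

open(my $fh, '>>', $lock_file) or die "Cannot open $lock_file: $!\n";
print "Opened file $lock_file\n";

if (flock($fh, LOCK_SH | LOCK_NB)) {
    print "Got non-blocking shared lock on $lock_file\n";
} else {
    my $err = 0 + $!;    # save errno before anything else can change it
    print "Non-blocking shared lock failed: $! (errno $err)\n";
    print "  ENOLCK: looks like a lock manager (lockd/NLM) problem\n" if $err == ENOLCK;
    print "  EWOULDBLOCK: another process holds a conflicting lock\n" if $err == EWOULDBLOCK;

    # Fall back to the blocking attempt the original script makes.
    flock($fh, LOCK_SH) or die "Blocking shared lock also failed: $!\n";
    print "Got blocking shared lock on $lock_file\n";
}

sleep 2;
flock($fh, LOCK_UN);
close($fh);
print "Closed file $lock_file\n";

The blocking fall-through at the end keeps the behaviour of the original script, so the two should be directly comparable.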
[glusterfsd server logfile during above testing]

root at gluster02:/var/log/glusterfs# /usr/sbin/glusterfsd -p /var/run/glusterfsd.pid -f /etc/glusterfs/glusterfsd.vol --log-file /var/log/glusterfsd.log --debug -N
[2010-05-03 09:09:40] D [glusterfsd.c:424:_get_specfp] glusterfs: loading volume file /etc/glusterfs/glusterfsd.vol
[2010-05-03 09:09:40] D [xlator.c:740:xlator_set_type] xlator: dlsym(notify) on /usr/local/lib/glusterfs/3.0.4/xlator/features/locks.so: undefined symbol: notify -- neglecting
[2010-05-03 09:09:40] D [xlator.c:740:xlator_set_type] xlator: dlsym(notify) on /usr/local/lib/glusterfs/3.0.4/xlator/performance/io-threads.so: undefined symbol: notify -- neglecting
[2010-05-03 09:09:40] D [xlator.c:745:xlator_set_type] xlator: dlsym(dumpops) on /usr/local/lib/glusterfs/3.0.4/xlator/performance/io-threads.so: undefined symbol: dumpops -- neglecting
================================================================================
Version : glusterfs 3.0.2 built on Mar 23 2010 01:01:39
git: v3.0.2
Starting Time: 2010-05-03 09:10:20
Command line : /usr/sbin/glusterfsd -p /var/run/glusterfsd.pid -f /etc/glusterfs/glusterfsd.vol --log-file /var/log/glusterfsd.log --debug -N
PID : 2231
System name : Linux
Nodename : gluster02
Kernel Release : 2.6.32-21-generic-pae
Hardware Identifier: i686

Given volfile:
+------------------------------------------------------------------------------+
  1: volume posix
  2:   type storage/posix
  3:   option directory /data/export
  4: end-volume
  5:
  6: volume locks
  7:   type features/locks
  8:   option manditory on
  9:   subvolumes posix
 10: end-volume
 11:
 12: volume brick
 13:   type performance/io-threads
 14:   option thread-count 8
 15:   subvolumes locks
 16: end-volume
 17:
 18: volume server
 19:   type protocol/server
 20:   option transport-type tcp
 21:   option auth.addr.brick.allow *
 22:   option auth.addr.brick-ns.allow *
 23:   option transport.socket.nodelay on
 24:   option auth.ip.locks.allow *
 25:   option auth.ip.locks-afr.allow *
 26:   subvolumes brick
 27: end-volume
+------------------------------------------------------------------------------+
[2010-05-03 09:10:20] D [glusterfsd.c:1370:main] glusterfs: running in pid 2231
[2010-05-03 09:10:20] D [transport.c:123:transport_load] transport: attempt to load file /usr/local/lib/glusterfs/3.0.4/transport/socket.so
[2010-05-03 09:10:20] D [name.c:553:server_fill_address_family] server: option address-family not specified, defaulting to inet/inet6
[2010-05-03 09:10:20] W [server-protocol.c:6521:get_auth_types] server: assuming 'auth.ip' to be 'auth.addr'
[2010-05-03 09:10:20] W [server-protocol.c:6521:get_auth_types] server: assuming 'auth.ip' to be 'auth.addr'
[2010-05-03 09:10:20] D [io-threads.c:2841:init] brick: io-threads: Autoscaling: off, min_threads: 8, max_threads: 8
[2010-05-03 09:10:20] W [glusterfsd.c:540:_log_if_option_is_invalid] locks: option 'manditory' is not recognized
[2010-05-03 09:10:20] N [glusterfsd.c:1396:main] glusterfs: Successfully started
[2010-05-03 09:11:20] D [addr.c:190:gf_auth] brick: allowed = "*", received addr = "x.x.x.17"
[2010-05-03 09:11:20] N [server-protocol.c:5852:mop_setvolume] server: accepted client from x.x.x.17:1021
[2010-05-03 09:11:20] D [addr.c:190:gf_auth] brick: allowed = "*", received addr = "x.x.x.17"
[2010-05-03 09:11:20] N [server-protocol.c:5852:mop_setvolume] server: accepted client from x.x.x.17:1020
[2010-05-03 09:12:21] D [addr.c:190:gf_auth] brick: allowed = "*", received addr = "x.x.x.18"
[2010-05-03 09:12:21] N [server-protocol.c:5852:mop_setvolume] server: accepted client from x.x.x.18:1021
[2010-05-03 09:12:21] D [addr.c:190:gf_auth] brick: allowed = "*", received addr = "x.x.x.18"
[2010-05-03 09:12:21] N [server-protocol.c:5852:mop_setvolume] server: accepted client from x.x.x.18:1020
[2010-05-03 09:26:21] N [server-protocol.c:6788:notify] server: x.x.x.18:1021 disconnected
[2010-05-03 09:26:21] N [server-protocol.c:6788:notify] server: x.x.x.18:1020 disconnected
[2010-05-03 09:26:21] N [server-helpers.c:842:server_connection_destroy] server: destroyed connection of gluster02-2242-2010/05/03-09:12:21:41737-brick2
^C

[logfile from gluster client on same server]

root at gluster02:~# /usr/sbin/glusterfs --log-level=DEBUG --volfile=/etc/glusterfs/glusterfs.vol /mnt/glusterfs --debug -N
[2010-05-03 09:12:21] D [glusterfsd.c:424:_get_specfp] glusterfs: loading volume file /etc/glusterfs/glusterfs.vol
[2010-05-03 09:12:21] D [xlator.c:740:xlator_set_type] xlator: dlsym(notify) on /usr/local/lib/glusterfs/3.0.4/xlator/performance/write-behind.so: undefined symbol: notify -- neglecting
[2010-05-03 09:12:21] D [xlator.c:740:xlator_set_type] xlator: dlsym(notify) on /usr/local/lib/glusterfs/3.0.4/xlator/performance/io-cache.so: undefined symbol: notify -- neglecting
[2010-05-03 09:12:21] D [xlator.c:745:xlator_set_type] xlator: dlsym(dumpops) on /usr/local/lib/glusterfs/3.0.4/xlator/performance/io-cache.so: undefined symbol: dumpops -- neglecting
[2010-05-03 09:12:21] D [xlator.c:740:xlator_set_type] xlator: dlsym(notify) on /usr/local/lib/glusterfs/3.0.4/xlator/performance/read-ahead.so: undefined symbol: notify -- neglecting
[2010-05-03 09:12:21] D [xlator.c:740:xlator_set_type] xlator: dlsym(notify) on /usr/local/lib/glusterfs/3.0.4/xlator/performance/io-threads.so: undefined symbol: notify -- neglecting
[2010-05-03 09:12:21] D [xlator.c:745:xlator_set_type] xlator: dlsym(dumpops) on /usr/local/lib/glusterfs/3.0.4/xlator/performance/io-threads.so: undefined symbol: dumpops -- neglecting
================================================================================
Version : glusterfs 3.0.2 built on Mar 23 2010 01:01:39
git: v3.0.2
Starting Time: 2010-05-03 09:12:21
Command line : /usr/sbin/glusterfs --log-level=DEBUG --volfile=/etc/glusterfs/glusterfs.vol /mnt/glusterfs --debug -N
PID : 2242
System name : Linux
Nodename : gluster02
Kernel Release : 2.6.32-21-generic-pae
Hardware Identifier: i686

Given volfile:
+------------------------------------------------------------------------------+
  1: volume brick1
  2:   type protocol/client
  3:   option transport-type tcp/client
  4:   option remote-host x.x.x.17 # IP address of the remote brick
  5:   option remote-subvolume brick # name of the remote volume
  6: end-volume
  7:
  8: volume brick2
  9:   type protocol/client
 10:   option transport-type tcp/client
 11:   option remote-host x.x.x.18 # IP address of the remote brick
 12:   option remote-subvolume brick # name of the remote volume
 13: end-volume
 14:
 15: volume afr1
 16:   type cluster/afr
 17:   subvolumes brick1 brick2
 18: end-volume
 19:
 20: volume writebehind
 21:   type performance/write-behind
 22:   option window-size 4MB
 23:   subvolumes afr1
 24: end-volume
 25:
 26: volume cache
 27:   type performance/io-cache
 28:   option cache-size 512MB
 29:   subvolumes writebehind
 30: end-volume
 31:
 32: volume readahead
 33:   type performance/read-ahead
 34:   option page-size 128KB # unit in bytes
 35:   subvolumes cache
 36: end-volume
 37:
 38: volume iothreads
 39:   type performance/io-threads
 40:   option thread-count 4
 41:   option cache-size 64MB
 42:   subvolumes readahead
 43: end-volume
+------------------------------------------------------------------------------+
[2010-05-03 09:12:21] D [glusterfsd.c:1370:main] glusterfs: running in pid 2242
[2010-05-03 09:12:21] W [xlator.c:656:validate_xlator_volume_options] writebehind: option 'window-size' is deprecated, preferred is 'cache-size', continuing with correction
[2010-05-03 09:12:21] D [io-threads.c:2841:init] iothreads: io-threads: Autoscaling: off, min_threads: 4, max_threads: 4
[2010-05-03 09:12:21] D [write-behind.c:2526:init] writebehind: disabling write-behind for first 1 bytes
[2010-05-03 09:12:21] D [client-protocol.c:6603:init] brick2: defaulting frame-timeout to 30mins
[2010-05-03 09:12:21] D [client-protocol.c:6614:init] brick2: defaulting ping-timeout to 42
[2010-05-03 09:12:21] D [transport.c:123:transport_load] transport: attempt to load file /usr/local/lib/glusterfs/3.0.4/transport/socket.so
[2010-05-03 09:12:21] D [transport.c:123:transport_load] transport: attempt to load file /usr/local/lib/glusterfs/3.0.4/transport/socket.so
[2010-05-03 09:12:21] D [client-protocol.c:6603:init] brick1: defaulting frame-timeout to 30mins
[2010-05-03 09:12:21] D [client-protocol.c:6614:init] brick1: defaulting ping-timeout to 42
[2010-05-03 09:12:21] D [transport.c:123:transport_load] transport: attempt to load file /usr/local/lib/glusterfs/3.0.4/transport/socket.so
[2010-05-03 09:12:21] D [transport.c:123:transport_load] transport: attempt to load file /usr/local/lib/glusterfs/3.0.4/transport/socket.so
[2010-05-03 09:12:21] D [client-protocol.c:7027:notify] brick1: got GF_EVENT_PARENT_UP, attempting connect on transport
[2010-05-03 09:12:21] D [name.c:155:client_fill_address_family] brick1: address-family not specified, guessing it to be inet/inet6
[2010-05-03 09:12:21] D [client-protocol.c:7027:notify] brick1: got GF_EVENT_PARENT_UP, attempting connect on transport
[2010-05-03 09:12:21] D [name.c:155:client_fill_address_family] brick1: address-family not specified, guessing it to be inet/inet6
[2010-05-03 09:12:21] D [client-protocol.c:7027:notify] brick2: got GF_EVENT_PARENT_UP, attempting connect on transport
[2010-05-03 09:12:21] D [name.c:155:client_fill_address_family] brick2: address-family not specified, guessing it to be inet/inet6
[2010-05-03 09:12:21] D [client-protocol.c:7027:notify] brick2: got GF_EVENT_PARENT_UP, attempting connect on transport
[2010-05-03 09:12:21] D [name.c:155:client_fill_address_family] brick2: address-family not specified, guessing it to be inet/inet6
[2010-05-03 09:12:21] W [glusterfsd.c:540:_log_if_option_is_invalid] iothreads: option 'cache-size' is not recognized
[2010-05-03 09:12:21] W [glusterfsd.c:540:_log_if_option_is_invalid] readahead: option 'page-size' is not recognized
[2010-05-03 09:12:21] D [client-protocol.c:7027:notify] brick1: got GF_EVENT_PARENT_UP, attempting connect on transport
[2010-05-03 09:12:21] D [client-protocol.c:7027:notify] brick1: got GF_EVENT_PARENT_UP, attempting connect on transport
[2010-05-03 09:12:21] D [client-protocol.c:7027:notify] brick2: got GF_EVENT_PARENT_UP, attempting connect on transport
[2010-05-03 09:12:21] D [client-protocol.c:7027:notify] brick2: got GF_EVENT_PARENT_UP, attempting connect on transport
[2010-05-03 09:12:21] N [glusterfsd.c:1396:main] glusterfs: Successfully started
[2010-05-03 09:12:21] D [client-protocol.c:7041:notify] brick1: got GF_EVENT_CHILD_UP
[2010-05-03 09:12:21] D [client-protocol.c:7041:notify] brick1: got GF_EVENT_CHILD_UP
[2010-05-03 09:12:21] D [client-protocol.c:7041:notify] brick2: got GF_EVENT_CHILD_UP
[2010-05-03 09:12:21] D [client-protocol.c:7041:notify] brick2: got GF_EVENT_CHILD_UP
[2010-05-03 09:12:21] N [client-protocol.c:6246:client_setvolume_cbk] brick2: Connected to x.x.x.18:6996, attached to remote volume 'brick'.
[2010-05-03 09:12:21] N [afr.c:2632:notify] afr1: Subvolume 'brick2' came back up; going online.
[2010-05-03 09:12:21] D [fuse-bridge.c:3100:fuse_thread_proc] fuse: pthread_cond_timedout returned non zero value ret: 0 errno: 0
[2010-05-03 09:12:21] N [client-protocol.c:6246:client_setvolume_cbk] brick2: Connected to x.x.x.18:6996, attached to remote volume 'brick'.
[2010-05-03 09:12:21] N [fuse-bridge.c:2950:fuse_init] glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.13
[2010-05-03 09:12:21] N [afr.c:2632:notify] afr1: Subvolume 'brick2' came back up; going online.
[2010-05-03 09:12:21] N [client-protocol.c:6246:client_setvolume_cbk] brick1: Connected to x.x.x.17:6996, attached to remote volume 'brick'.
[2010-05-03 09:12:21] N [client-protocol.c:6246:client_setvolume_cbk] brick1: Connected to x.x.x.17:6996, attached to remote volume 'brick'.
[2010-05-03 09:16:39] D [read-ahead.c:468:ra_readv] readahead: unexpected offset (4096 != 362) resetting
[2010-05-03 09:26:04] D [read-ahead.c:468:ra_readv] readahead: unexpected offset (4096 != 362) resetting
[2010-05-03 09:26:21] N [fuse-bridge.c:3140:fuse_thread_proc] glusterfs-fuse: terminating upon getting ENODEV when reading /dev/fuse
[2010-05-03 09:26:21] N [fuse-bridge.c:3208:fuse_thread_proc] fuse: unmounting /mnt/glusterfs
[2010-05-03 09:26:21] W [glusterfsd.c:953:cleanup_and_exit] glusterfs: shutting down
^Croot at gluster02:~#
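One more idea: to separate flock() from the byte-range (NLM) path, I could also try a fcntl()-based test. The sketch below is untested and assumes the File::FcntlLock module from CPAN is installed (it is not part of core Perl); I'm following that module's documented interface, so the imports and method names may need adjusting. My understanding is that, depending on the kernel, flock() on an NFS mount is either refused outright or emulated with whole-file POSIX locks over NLM, so if this test also fails with "No locks available" the problem is probably in the lock manager path rather than in flock() itself.

#!/usr/bin/perl
# fcntltest.pl -- sketch only; file name is made up and I have not run it.
# Uses the File::FcntlLock CPAN module for a portable struct flock, so that
# a shared byte-range lock is requested via fcntl(F_SETLK) instead of flock().
use strict;
use warnings;
use File::FcntlLock;   # CPAN module; F_* and SEEK_* constants per its synopsis

my $lock_file = 'lockfile';

open(my $fh, '>>', $lock_file) or die "Cannot open $lock_file: $!\n";
print "Opened file $lock_file\n";

my $fs = File::FcntlLock->new;
$fs->l_type(F_RDLCK);      # shared (read) lock
$fs->l_whence(SEEK_SET);
$fs->l_start(0);
$fs->l_len(0);             # length 0 means lock the whole file

if ($fs->lock($fh, F_SETLK)) {
    print "Got shared fcntl lock on $lock_file\n";
    sleep 2;
    $fs->l_type(F_UNLCK);
    $fs->lock($fh, F_SETLK) or warn "Unlock failed: " . $fs->error . "\n";
} else {
    print "Can't get shared fcntl lock on $lock_file: " . $fs->error . "\n";
}

close($fh);
print "Closed file $lock_file\n";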
--
Ian Steinmetz


-----Original Message-----
From: Shehjar Tikoo [mailto:shehjart at gluster.com]
Sent: Sunday, May 02, 2010 11:22 PM
To: Steinmetz, Ian
Cc: gluster-users at gluster.org
Subject: Re: NFS file locks

There are a couple of things we can do:

-> Mail us the GlusterFS log files from the NFS server and the glusterfs servers when the lock script fails. Do file a bug if you can.

-> On the NFS client machine, before you run the mount command, make sure you run the following command:
   $ rpc.statd

-> Run the same perl script, but this time at the NFS server over the glusterfs mount point, not at the NFS client. If it runs fine, it is probably related to locking over NFS and we'll look at other places to figure it out.

-Shehjar

Steinmetz, Ian wrote:
> I'm seeing an issue where I can't lock files on an NFS-exported
> GlusterFS mount. I have two servers connected to each other doing AFR
> to provide a highly available NFS server (mirror the content, one VIP
> for NFS mounts to clients). Both of the servers have mounted
> "/mnt/glusterfs" using GlusterFS with the client pointing to both
> servers. I then export the filesystem with NFS. I grabbed a quick perl
> program that tries to lock a file for testing, which fails only on the
> glusterfs. When I export a normal directory "/mnt/test" the locking
> works.
> Any ideas appreciated. I have a feeling I've implemented the
> posix/locks option incorrectly.
>
> Both servers are running Ubuntu with identical setups; below are the
> relevant configs.
>
> root at gluster01:/mnt/glusterfs# uname -a
> Linux gluster01 2.6.31-20-generic-pae #58-Ubuntu SMP Fri Mar 12 06:25:51 UTC 2010 i686 GNU/Linux
>
> root at gluster01:/mnt/glusterfs# cat /etc/exports
> /mnt/glusterfs <ip removed for security>/25(rw,no_root_squash,no_all_squash,no_subtree_check,sync,insecure,fsid=10)
> /mnt/test <ip removed for security>/25(rw,no_root_squash,no_all_squash,no_subtree_check,sync,insecure,fsid=11)
>
> * I've tried async, rsync, removing all options except FSID.
>
> root at gluster02:/etc/glusterfs# cat glusterfs.vol
> volume brick1
>   type protocol/client
>   option transport-type tcp/client
>   option remote-host <ip removed for security> # IP address of the remote brick
>   option remote-subvolume brick # name of the remote volume
> end-volume
>
> volume brick2
>   type protocol/client
>   option transport-type tcp/client
>   option remote-host <ip removed for security> # IP address of the remote brick
>   option remote-subvolume brick # name of the remote volume
> end-volume
>
> volume afr1
>   type cluster/afr
>   subvolumes brick1 brick2
> end-volume
>
> volume writebehind
>   type performance/write-behind
>   option window-size 4MB
>   subvolumes afr1
> end-volume
>
> volume cache
>   type performance/io-cache
>   option cache-size 512MB
>   subvolumes writebehind
> end-volume
>
> volume readahead
>   type performance/read-ahead
>   option page-size 128KB # unit in bytes
>   subvolumes cache
> end-volume
>
> volume iothreads
>   type performance/io-threads
>   option thread-count 4
>   option cache-size 64MB
>   subvolumes readahead
> end-volume
>
> root at gluster02:/etc/glusterfs# cat glusterfsd.vol
> volume posix
>   type storage/posix
>   option directory /data/export
> end-volume
>
> volume locks
>   type features/posix-locks
>   option manditory on # tried with and without this, found in a search of earlier post
>   subvolumes posix
> end-volume
>
> volume brick
>   type performance/io-threads
>   option thread-count 8
>   subvolumes locks
> end-volume
>
> volume server
>   type protocol/server
>   option transport-type tcp
>   option auth.addr.brick.allow *
>   option auth.addr.brick-ns.allow *
>   option transport.socket.nodelay on
>   option auth.ip.locks.allow *
>   subvolumes brick
> end-volume
>
>
> * file to test locking...
> root at gluster02:/mnt/glusterfs# cat locktest.pl
> #!/usr/bin/perl
>
> use Fcntl qw(:flock);
>
> my $lock_file = 'lockfile';
>
> open(LOCKFILE, ">>$lock_file") or die "Cannot open $lock_file: $!\n";
> print "Opened file $lock_file\n";
> flock(LOCKFILE, LOCK_SH) or die "Can't get shared lock on $lock_file: $!\n";
> print "Got shared lock on file $lock_file\n";
> sleep 2;
> close LOCKFILE;
> print "Closed file $lock_file\n";
>
> exit;
>
> * Test run from gluster02 using normal NFS mount:
> root at gluster02:/# mount <ip removed for security>:/mnt/test /mnt/test
> root at gluster02:/# cd /mnt/test
> root at gluster02:/mnt/test# ./locktest.pl
> Opened file lockfile
> Got shared lock on file lockfile
> Closed file lockfile
>
> * Test run from gluster02 using gluster-exported NFS mount:
> root at gluster02:/# mount 74.81.128.17:/mnt/glusterfs /mnt/test
> root at gluster02:/# cd /mnt/test
> root at gluster02:/mnt/test# ./locktest.pl
> Opened file lockfile
> Can't get shared lock on lockfile: No locks available
>
> --
> Ian Steinmetz
> Comcast Engineering - Houston
> 713-375-7866
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users