On Fri, Sep 23, 2011 at 6:19 PM, Brandon Simmons <bsimmons at labarchives.com> wrote:
> On Fri, Sep 23, 2011 at 4:11 PM, Anand Babu Periasamy <ab at gluster.com> wrote:
>> This is a known issue. Gluster NFS doesn't support NLM (locking) yet. 3.4
>> may implement this. Did you try a GlusterFS native mount?
>
> Thanks for that information.
>
> I did test with the native fuse mount, but the results were difficult
> to interpret. We have a rails application that writes to multiple
> sqlite databases, and a test script that simulates a bunch of random
> writes to a specified DB, retrying if it fails.
>
> On NFS this test runs reasonably well: both clients take turns, there
> are a couple of retries, and all writes complete without failures.
>
> But mounted over gluster (same machines and underlying disk as above),
> one client always runs while the other gets locked out (which client
> gets locked out depends on which was started first). At some point
> during this test the client that was locked out from writing to the DB
> actually gets disconnected from gluster and I have to remount:
>
>     $ ls /mnt/gluster
>     ls: cannot access /websites/: Transport endpoint is not connected
>
> One client is consistently locked out even if they are writing to
> DIFFERENT DBs altogether.

Just a follow-up: my clients for the test above were 32-bit machines.
Gluster seemed to compile fine and basically work, but after reading
this I wonder whether that is the source of the issue:

http://community.gluster.org/q/is-there-a-32bit-version-of-gluster/

I did a simplified test of concurrent writing to two different sqlite
databases from two different 32-bit machines. I hoped they would
exhibit the bad behavior I outlined above, pointing to a problem with
32-bit clients, but sadly that simplified test ran fine.
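In case the details matter, that simplified test amounts to roughly the
following. This is an illustrative Python stand-in for the actual script
(the path, row count, and sleep interval here are made up), run
simultaneously from both clients:

    import random
    import sqlite3
    import sys
    import time

    # Illustrative path, e.g. /mnt/gluster/test1.sqlite3; in the
    # simplified test each client points at a different database file.
    db_path = sys.argv[1]

    # timeout=0 makes sqlite fail immediately on contention instead of
    # blocking, so every retry is visible in the output.
    conn = sqlite3.connect(db_path, timeout=0)
    conn.execute("create table if not exists memos(text, priority INTEGER)")
    conn.commit()

    for i in range(1000):
        while True:
            try:
                conn.execute("insert into memos values (?, ?)",
                             ("memo %d" % i, random.randint(0, 9)))
                conn.commit()
                break  # this write succeeded, move on to the next one
            except sqlite3.OperationalError as e:  # e.g. "database is locked"
                print("retrying: %s" % e)
                time.sleep(random.random() / 10)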
Brandon

>
> The breakage of the mountpoint happened every time the test was run
> concurrently against the SAME DB, but did not seem to occur when
> clients were running against different DBs.
>
> But like I said, this was a very high-level test with many moving
> parts, so I'm not sure how useful the above details are for you to
> know.
>
> Happy to hear any ideas for testing,
> Brandon
>
> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log:
> [2011-09-16 19:32:38.122196] W
> [socket.c:1494:__socket_proto_state_machine] 0-socket.management:
> reading from socket failed. Error (Transport endpoint is not
> connected), peer (127.0.0.1:1017)
>
>>
>> --AB
>>
>> On Sep 23, 2011 10:00 AM, "Brandon Simmons" <bsimmons at labarchives.com>
>> wrote:
>>> I am able to successfully mount a gluster volume using the NFS client
>>> on my test servers. Simple reading and writing seems to work, but
>>> trying to work with sqlite databases causes the sqlite client
>>> and libraries to freeze. I have to send KILL to stop the process.
>>>
>>> Here is an example; server 1 and server 2 are clients mounting the
>>> gluster volume over NFS:
>>>
>>> server1# echo "working" > /mnt/gluster/test_simple
>>> server2# echo "working" >> /mnt/gluster/test_simple
>>> server1# cat /mnt/gluster/test_simple
>>> working
>>> working
>>> server1# sqlite3 /websites/new.sqlite3
>>> SQLite version 3.6.10
>>> Enter ".help" for instructions
>>> Enter SQL statements terminated with a ";"
>>> sqlite> create table memos(text, priority INTEGER);
>>> (...hangs forever, have to detach screen and do kill -9)
>>>
>>> The gluster volume was created and NFS-mounted as per the
>>> instructions here:
>>>
>>> http://www.gluster.com/community/documentation/index.php/Gluster_3.2_Filesystem_Administration_Guide
>>>
>>> If I mount the volume using the nolock option, then things work:
>>>
>>> mount -t nfs -o nolock server:/test-vol /mnt/gluster
>>>
>>> So I assume this has something to do with the locking RPC service
>>> stuff, which I don't know much about. Here's the output from rpcinfo:
>>>
>>> server# rpcinfo -p
>>>    program vers proto   port
>>>     100000    2   tcp    111  portmapper
>>>     100000    2   udp    111  portmapper
>>>     100024    1   udp  56286  status
>>>     100024    1   tcp  40356  status
>>>     100005    3   tcp  38465  mountd
>>>     100005    1   tcp  38466  mountd
>>>     100003    3   tcp  38467  nfs
>>>
>>> client1# rpcinfo -p server
>>>    program vers proto   port
>>>     100000    2   tcp    111  portmapper
>>>     100000    2   udp    111  portmapper
>>>     100024    1   udp  56286  status
>>>     100024    1   tcp  40356  status
>>>     100005    3   tcp  38465  mountd
>>>     100005    1   tcp  38466  mountd
>>>     100003    3   tcp  38467  nfs
>>>
>>> client1# rpcinfo -p
>>>    program vers proto   port
>>>     100000    2   tcp    111  portmapper
>>>     100000    2   udp    111  portmapper
>>>     100024    1   udp  32768  status
>>>     100024    1   tcp  58368  status
>>>
>>> Thanks for any help,
>>> Brandon
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>>
>
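P.S. One more data point from the rpcinfo listings quoted above: there is
no nlockmgr (RPC program 100021) registered on the server, which lines up
with AB's note that Gluster NFS doesn't support NLM yet, and would explain
why lock calls hang unless the volume is mounted with nolock. A quick
client-side check along these lines should confirm it (a sketch; assumes
rpcinfo is on the PATH and Python 2.7+ for check_output):

    import subprocess
    import sys

    server = sys.argv[1]  # the gluster NFS server to probe

    # 100021 is the well-known RPC program number for nlockmgr (NLM),
    # the service NFS clients need for fcntl()-style locking.
    out = subprocess.check_output(["rpcinfo", "-p", server])
    if "100021" in out.decode("ascii"):
        print("nlockmgr is registered; NFS locking should be available")
    else:
        print("no nlockmgr; lock calls will hang unless mounted with -o nolock")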