Hi,

Since no one replies to this, I'll reply to myself :)

I just realized that I assumed it is possible to replicate distributed
volumes. Am I wrong? In my setup below I was trying to build
"Replicated Distributed Storage", the inverse of what is described in
http://www.gluster.com/community/documentation/index.php/Distributed_Replicated_Storage.

Trying to draw a picture:

           replicated
-------------|------------    <----> 3 replicas presented as one volume
replica1  replica2  replica3
---|---------|---------|---   <----> 4 volumes, distributed, to make up
 4vols     4vols     4vols           each of the 3 volumes to be replicated

Is this dumb or is there a better way?

thanks,
José Canelas
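PS: to make the question more concrete, here is how I read the
distribute-over-replicate layout from that wiki page, applied to our 16
bricks. This is only a sketch I have not tested; the volume names (afr1
... afr5, dist) are made up, each replica set spans three different
nodes, and since 16 bricks do not divide evenly into sets of 3, one
brick (node04-4 here) is simply left out:

volume afr1
  type cluster/replicate
  # each set keeps its 3 copies on 3 different nodes
  subvolumes node01-1 node02-1 node03-1
end-volume

volume afr2
  type cluster/replicate
  subvolumes node04-1 node01-2 node02-2
end-volume

[afr3, afr4 and afr5 built the same way from the remaining bricks]

volume dist
  type cluster/distribute
  # distribute over the replica sets, not over the raw bricks
  subvolumes afr1 afr2 afr3 afr4 afr5
end-volume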
On 02/26/2010 03:55 PM, José Manuel Canelas wrote:
> Hello, everyone.
>
> We're setting up GlusterFS for some testing and are having some
> trouble with the configuration.
>
> We have 4 nodes acting as both clients and servers, with 4 disks each.
> I'm trying to set up 3 replicas across all those 16 disks, configured
> on the client side, for high availability and optimal performance, in
> a way that makes it easy to add new disks and nodes.
>
> The best way I could think of doing it was to group disks from
> different nodes into 3 distributed volumes and then use each of those
> as a replica of the top volume. I'd like your input on this too, so if
> you look at the configuration and something looks wrong or dumb, it
> probably is, so please let me know :)
>
> Now the server config looks like this:
>
> volume posix1
>   type storage/posix
>   option directory /srv/gdisk01
> end-volume
>
> volume locks1
>   type features/locks
>   subvolumes posix1
> end-volume
>
> volume brick1
>   type performance/io-threads
>   option thread-count 8
>   subvolumes locks1
> end-volume
>
> [3 more identical bricks and...]
>
> volume server-tcp
>   type protocol/server
>   option transport-type tcp
>   option auth.addr.brick1.allow *
>   option auth.addr.brick2.allow *
>   option auth.addr.brick3.allow *
>   option auth.addr.brick4.allow *
>   option transport.socket.listen-port 6996
>   option transport.socket.nodelay on
>   subvolumes brick1 brick2 brick3 brick4
> end-volume
>
> The client config:
>
> volume node01-1
>   type protocol/client
>   option transport-type tcp
>   option remote-host node01
>   option transport.socket.nodelay on
>   option transport.remote-port 6996
>   option remote-subvolume brick1
> end-volume
>
> [repeated for every brick, up to node04-4]
>
> ### Our 3 replicas
> volume repstore1
>   type cluster/distribute
>   subvolumes node01-1 node02-1 node03-1 node04-1 node04-4
> end-volume
>
> volume repstore2
>   type cluster/distribute
>   subvolumes node01-2 node02-2 node03-2 node04-2 node02-2
> end-volume
>
> volume repstore3
>   type cluster/distribute
>   subvolumes node01-3 node02-3 node03-3 node04-3 node03-3
> end-volume
>
> volume replicate
>   type cluster/replicate
>   subvolumes repstore1 repstore2 repstore3
> end-volume
>
> [and then the performance bits]
>
> When starting the glusterfs server, everything looks fine. I then
> mount the filesystem with
>
> node01:~# glusterfs --debug -f /etc/glusterfs/glusterfs.vol /srv/gluster-export
>
> and it does not complain; the volume shows up as properly mounted. But
> accessing the mount point returns a "Transport endpoint is not
> connected" error, and the log has a "Stale NFS file handle" warning.
> See below:
>
> [...]
> [2010-02-26 14:56:01] D [dht-common.c:274:dht_revalidate_cbk] repstore3:
> mismatching layouts for /
> [2010-02-26 14:56:01] W [fuse-bridge.c:722:fuse_attr_cbk]
> glusterfs-fuse: 9: LOOKUP() / => -1 (Stale NFS file handle)
>
> node01:~# mount
> /dev/cciss/c0d0p1 on / type ext3 (rw,errors=remount-ro)
> tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
> proc on /proc type proc (rw,noexec,nosuid,nodev)
> sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
> procbususb on /proc/bus/usb type usbfs (rw)
> udev on /dev type tmpfs (rw,mode=0755)
> tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
> devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
> fusectl on /sys/fs/fuse/connections type fusectl (rw)
> /dev/cciss/c0d1 on /srv/gdisk01 type ext3 (rw,errors=remount-ro)
> /dev/cciss/c0d2 on /srv/gdisk02 type ext3 (rw,errors=remount-ro)
> /dev/cciss/c0d3 on /srv/gdisk03 type ext3 (rw,errors=remount-ro)
> /dev/cciss/c0d4 on /srv/gdisk04 type ext3 (rw,errors=remount-ro)
> /etc/glusterfs/glusterfs.vol on /srv/gluster-export type fuse.glusterfs
> (rw,allow_other,default_permissions,max_read=131072)
> node01:~# ls /srv/gluster-export
> ls: cannot access /srv/gluster-export: Transport endpoint is not connected
> node01:~#
>
> The complete debug log and configuration files are attached.
>
> Thank you in advance,
> José Canelas
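PPS: while re-reading the client config above I also noticed that the
three distribute sets are not symmetric: repstore1 lists node04-4 as a
fifth brick, repstore2 lists node02-2 twice, and repstore3 lists
node03-3 twice. I don't know whether that alone explains the
"mismatching layouts" messages, but if replicate-over-distribute is
workable at all, I'd guess the three sets need at least the same number
of distinct bricks, something like:

volume repstore1
  type cluster/distribute
  # 4 distinct bricks, one per node, no repeats
  subvolumes node01-1 node02-1 node03-1 node04-1
end-volume

[repstore2 and repstore3 the same, with the *-2 and *-3 bricks]

volume replicate
  type cluster/replicate
  subvolumes repstore1 repstore2 repstore3
end-volume

(That leaves the four *-4 bricks unused, which is part of why I suspect
the layout itself, and not just the subvolume lists, is the problem.)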