Ah! You must be mounting it wrong. Please mount it from a server (not
using a volfile):

  mount -t glusterfs SERVER:/vol /mnt

or

  glusterfs -s SERVER --volfile-id vol /mnt

That should fix it.

Avati

On Thu, Feb 3, 2011 at 7:07 PM, phil cryer <phil at cryer.us> wrote:

> Avati - thanks for your reply, my comments below
>
> >> [name.c:251:af_inet_client_get_remote_sockaddr] glusterfs: DNS
> >> resolution failed on host /etc/glusterfs/glusterfs.vol
>
> > Please make sure you are able to resolve hostnames as given in
> > volume info in all of your servers via 'dig'. The logs clearly show
> > that host resolution seems to be failing.
>
> Agreed, but that doesn't seem to be the issue, because I can dig the
> host named clustr-02 (they're all defined in my hosts file too, so it
> doesn't have to look them up), and in fact there are 23 other bricks
> on that host that are working fine:
>
> # gluster volume info | grep clustr-02
> Brick2: clustr-02:/mnt/data01
> Brick8: clustr-02:/mnt/data02
> Brick14: clustr-02:/mnt/data03
> Brick20: clustr-02:/mnt/data04
> Brick26: clustr-02:/mnt/data05
> Brick32: clustr-02:/mnt/data06
> Brick38: clustr-02:/mnt/data07
> Brick44: clustr-02:/mnt/data08
> Brick50: clustr-02:/mnt/data09
> Brick56: clustr-02:/mnt/data10
> Brick62: clustr-02:/mnt/data11
> Brick68: clustr-02:/mnt/data12
> Brick74: clustr-02:/mnt/data13
> Brick80: clustr-02:/mnt/data14
> Brick86: clustr-02:/mnt/data15
> Brick92: clustr-02:/mnt/data16
> Brick98: clustr-02:/mnt/data17
> Brick104: clustr-02:/mnt/data18
> Brick110: clustr-02:/mnt/data19
> Brick116: clustr-02:/mnt/data20
> Brick122: clustr-02:/mnt/data21
> Brick128: clustr-02:/mnt/data22
> Brick134: clustr-02:/mnt/data23
> Brick140: clustr-02:/mnt/data24
>
> I logged into that host, unmounted that mount, and ran fsck.ext4 on
> it, but it came back clean.
>
> Another thing: the log says "glusterfs: DNS resolution failed on host
> /etc/glusterfs/glusterfs.vol" - but there is obviously no host named
> /etc/glusterfs/glusterfs.vol. Does this point to an issue?
>
> And lastly, I don't even have a file named /etc/glusterfs/glusterfs.vol:
>
> ls -l /etc/glusterfs
> -rw-r--r-- 1 root root  229 Jan 16 21:15 glusterd.vol
> -rw-r--r-- 1 root root 1908 Jan 16 21:15 glusterfsd.vol.sample
> -rw-r--r-- 1 root root 2005 Jan 16 21:15 glusterfs.vol.sample
>
> I created all of the configs via the gluster> command-line tool.
>
> Thanks
>
> P
>
> On Thu, Feb 3, 2011 at 6:39 PM, Anand Avati <anand.avati at gmail.com> wrote:
> > Please make sure you are able to resolve hostnames as given in
> > volume info in all of your servers via 'dig'. The logs clearly show
> > that host resolution seems to be failing.
> >
> > Avati
> >
> > On Thu, Feb 3, 2011 at 1:08 PM, phil cryer <phil at cryer.us> wrote:
> >>
> >> This wasn't my issue, but I'm still having it. Today I purged
> >> glusterfs 3.1.1 and installed 3.1.2 fresh from deb.
> >> I recreated my volume, started it, and everything was going fine;
> >> I mounted the share, then ran df -h to see it, and now every few
> >> seconds my log posts this:
> >>
> >> ==> /var/log/glusterfs/nfs.log <==
> >> [2011-02-03 15:55:57.145626] E
> >> [client-handshake.c:1079:client_query_portmap_cbk]
> >> bhl-volume-client-98: failed to get the port number for remote
> >> subvolume
> >> [2011-02-03 15:55:57.145694] I [client.c:1590:client_rpc_notify]
> >> bhl-volume-client-98: disconnected
> >>
> >> ==> /var/log/glusterfs/mnt-glusterfs.log <==
> >> [2011-02-03 15:55:57.605802] E [common-utils.c:124:gf_resolve_ip6]
> >> resolver: getaddrinfo failed (Name or service not known)
> >> [2011-02-03 15:55:57.605834] E
> >> [name.c:251:af_inet_client_get_remote_sockaddr] glusterfs: DNS
> >> resolution failed on host /etc/glusterfs/glusterfs.vol
> >>
> >> over and over. Any clues as to how I can fix this? This one issue
> >> has made our entire 100TB store unusable.
> >>
> >> And again, 'gluster volume info' shows all the bricks are OK,
> >> including 98:
> >>
> >> gluster> volume info
> >>
> >> Volume Name: bhl-volume
> >> Type: Distributed-Replicate
> >> Status: Started
> >> Number of Bricks: 72 x 2 = 144
> >> Transport-type: tcp
> >> Bricks:
> >> [...]
> >> Brick92: clustr-02:/mnt/data16
> >> Brick93: clustr-03:/mnt/data16
> >> Brick94: clustr-04:/mnt/data16
> >> Brick95: clustr-05:/mnt/data16
> >> Brick96: clustr-06:/mnt/data16
> >> Brick97: clustr-01:/mnt/data17
> >> Brick98: clustr-02:/mnt/data17
> >> Brick99: clustr-03:/mnt/data17
> >> Brick100: clustr-04:/mnt/data17
> >> Brick101: clustr-05:/mnt/data17
> >> Brick102: clustr-06:/mnt/data17
> >> Brick103: clustr-01:/mnt/data18
> >> Brick104: clustr-02:/mnt/data18
> >> Brick105: clustr-03:/mnt/data18
> >> [...]
> >>
> >> P
> >>
> >> On Mon, Jan 31, 2011 at 4:26 PM, Anand Avati <anand.avati at gmail.com>
> >> wrote:
> >> > Can you post your server logs? What happens if you run 'df -k' on
> >> > your backend export filesystems?
> >> >
> >> > Thanks
> >> > Avati
> >> >
> >> > On Mon, Jan 17, 2011 at 5:27 AM, Joe Warren-Meeks
> >> > <joe at encoretickets.co.uk> wrote:
> >> >
> >> >> (Sorry about top-posting.)
> >> >>
> >> >> Just changing the timeout would only mask the problem. The real
> >> >> issue is that running 'df' on either node causes a hang.
> >> >>
> >> >> All other operations seem fine: files can be created and deleted
> >> >> as normal, with the results showing up on both nodes.
> >> >>
> >> >> I'd like to work out why it's hanging on df, so I can fix it and
> >> >> get my monitoring and cron scripts running again :)
> >> >>
> >> >> -- joe.
> >> >>
> >> >> -----Original Message-----
> >> >> From: gluster-users-bounces at gluster.org
> >> >> [mailto:gluster-users-bounces at gluster.org] On Behalf Of Daniel Maher
> >> >> Sent: 17 January 2011 12:48
> >> >> To: gluster-users at gluster.org
> >> >> Subject: Re: df causes hang
> >> >>
> >> >> On 01/17/2011 10:47 AM, Joe Warren-Meeks wrote:
> >> >> > Hey chaps,
> >> >> >
> >> >> > Anyone got any pointers as to what this might be? This is still
> >> >> > causing a lot of problems for us whenever we attempt to do df.
> >> >> >
> >> >> > -- joe.
> >> >> >
> >> >> > -----Original Message-----
> >> >>
> >> >> > However, for some reason, they've got into a bit of a state
> >> >> > such that typing 'df -k' causes both to hang, resulting in a
> >> >> > loss of service for 42 seconds.
> >> >> > I see the following messages in the log files:
> >> >>
> >> >> 42 seconds is the default TCP timeout for any given node - you
> >> >> could try tuning that down and seeing how it works for you.
> >> >>
> >> >> http://www.gluster.com/community/documentation/index.php/Gluster_3.1:_Setting_Volume_Options
> >> >>
> >> >> --
> >> >> Daniel Maher <dma+gluster AT witbe DOT net>
> >> >> _______________________________________________
> >> >> Gluster-users mailing list
> >> >> Gluster-users at gluster.org
> >> >> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
> >> >>
> >> >>
> >> >> _______________________________________________
> >> >> Gluster-users mailing list
> >> >> Gluster-users at gluster.org
> >> >> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
> >> >
> >> > _______________________________________________
> >> > Gluster-users mailing list
> >> > Gluster-users at gluster.org
> >> > http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
> >>
> >>
> >> --
> >> http://philcryer.com
>
>
> --
> http://philcryer.com
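
For anyone landing on this thread with the same error: the log line "DNS
resolution failed on host /etc/glusterfs/glusterfs.vol" suggests the
client was handed a volfile path where it expected a hostname, which is
consistent with Avati's advice at the top of the thread to mount from a
server rather than a volfile. A minimal sketch of the invocations, using
the server and volume names that appear in this thread (clustr-01,
bhl-volume) and an assumed mount point of /mnt/glusterfs:

  # Likely the broken form (hedged reconstruction, not confirmed in the
  # thread): the first argument is a local volfile path, which the
  # client then tries to resolve as a host.
  #mount -t glusterfs /etc/glusterfs/glusterfs.vol /mnt/glusterfs

  # Server-based mount, per the advice above:
  mount -t glusterfs clustr-01:/bhl-volume /mnt/glusterfs

  # Or, equivalently, fetching the volfile by volume id:
  glusterfs -s clustr-01 --volfile-id bhl-volume /mnt/glusterfs

To make the mount persistent, an /etc/fstab entry of roughly this shape
is the usual convention (assumption: _netdev so the mount waits for
networking):

  clustr-01:/bhl-volume  /mnt/glusterfs  glusterfs  defaults,_netdev  0 0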