Avati - thanks for your reply, my comments below

>> [name.c:251:af_inet_client_get_remote_sockaddr] glusterfs: DNS
>> resolution failed on host /etc/glusterfs/glusterfs.vol

> Please make sure you are able to resolve hostnames as given in volume info
> in all of your servers via 'dig'. The logs clearly show that host resolution
> seems to be failing.

Agreed, however that doesn't seem to be the issue, because I can dig the
host named clustr-02 (the hosts are all defined in my hosts file too, so it
doesn't have to look them up), and in fact there are 23 other 'bricks' on
that host that are working fine:

# gluster volume info | grep clustr-02
Brick2: clustr-02:/mnt/data01
Brick8: clustr-02:/mnt/data02
Brick14: clustr-02:/mnt/data03
Brick20: clustr-02:/mnt/data04
Brick26: clustr-02:/mnt/data05
Brick32: clustr-02:/mnt/data06
Brick38: clustr-02:/mnt/data07
Brick44: clustr-02:/mnt/data08
Brick50: clustr-02:/mnt/data09
Brick56: clustr-02:/mnt/data10
Brick62: clustr-02:/mnt/data11
Brick68: clustr-02:/mnt/data12
Brick74: clustr-02:/mnt/data13
Brick80: clustr-02:/mnt/data14
Brick86: clustr-02:/mnt/data15
Brick92: clustr-02:/mnt/data16
Brick98: clustr-02:/mnt/data17
Brick104: clustr-02:/mnt/data18
Brick110: clustr-02:/mnt/data19
Brick116: clustr-02:/mnt/data20
Brick122: clustr-02:/mnt/data21
Brick128: clustr-02:/mnt/data22
Brick134: clustr-02:/mnt/data23
Brick140: clustr-02:/mnt/data24

I logged into that host, unmounted that mount, and ran fsck.ext4 on it, but
it came back clean.

Another thing: the log says "glusterfs: DNS resolution failed on host
/etc/glusterfs/glusterfs.vol" - however, there is obviously no host named
/etc/glusterfs/glusterfs.vol - does this point to an issue?
And lastly, I don't even have a file named /etc/glusterfs/glusterfs.vol:

ls -ls /etc/glusterfs
-rw-r--r-- 1 root root  229 Jan 16 21:15 glusterd.vol
-rw-r--r-- 1 root root 1908 Jan 16 21:15 glusterfsd.vol.sample
-rw-r--r-- 1 root root 2005 Jan 16 21:15 glusterfs.vol.sample

I created all of the configs via the gluster> commandline tool.

Thanks

P

On Thu, Feb 3, 2011 at 6:39 PM, Anand Avati <anand.avati at gmail.com> wrote:
> Please make sure you are able to resolve hostnames as given in volume info
> in all of your servers via 'dig'. The logs clearly show that host resolution
> seems to be failing.
> Avati
>
> On Thu, Feb 3, 2011 at 1:08 PM, phil cryer <phil at cryer.us> wrote:
>>
>> This wasn't my issue, but I'm still having the issue. Today I purged
>> glusterfs 3.1.1 and installed 3.1.2 fresh from deb. I recreated my
>> volume, started it, everything was going fine, mounted the share, then
>> ran df -h to see it, now every few seconds my logs posts this:
>>
>> ==> /var/log/glusterfs/nfs.log <==
>> [2011-02-03 15:55:57.145626] E
>> [client-handshake.c:1079:client_query_portmap_cbk]
>> bhl-volume-client-98: failed to get the port number for remote
>> subvolume
>> [2011-02-03 15:55:57.145694] I [client.c:1590:client_rpc_notify]
>> bhl-volume-client-98: disconnected
>>
>> ==> /var/log/glusterfs/mnt-glusterfs.log <==
>> [2011-02-03 15:55:57.605802] E [common-utils.c:124:gf_resolve_ip6]
>> resolver: getaddrinfo failed (Name or service not known)
>> [2011-02-03 15:55:57.605834] E
>> [name.c:251:af_inet_client_get_remote_sockaddr] glusterfs: DNS
>> resolution failed on host /etc/glusterfs/glusterfs.vol
>>
>> over and over. Any clues as to how I can fix this? This one issue has
>> made our entire 100TB store unusable.
>>
>> and again, gluster volume info shows all the bricks are OK, including 98:
>>
>> gluster> volume info
>>
>> Volume Name: bhl-volume
>> Type: Distributed-Replicate
>> Status: Started
>> Number of Bricks: 72 x 2 = 144
>> Transport-type: tcp
>> Bricks:
>> [...]
>> Brick92: clustr-02:/mnt/data16
>> Brick93: clustr-03:/mnt/data16
>> Brick94: clustr-04:/mnt/data16
>> Brick95: clustr-05:/mnt/data16
>> Brick96: clustr-06:/mnt/data16
>> Brick97: clustr-01:/mnt/data17
>> Brick98: clustr-02:/mnt/data17
>> Brick99: clustr-03:/mnt/data17
>> Brick100: clustr-04:/mnt/data17
>> Brick101: clustr-05:/mnt/data17
>> Brick102: clustr-06:/mnt/data17
>> Brick103: clustr-01:/mnt/data18
>> Brick104: clustr-02:/mnt/data18
>> Brick105: clustr-03:/mnt/data18
>> [...]
>>
>>
>> P
>>
>>
>> On Mon, Jan 31, 2011 at 4:26 PM, Anand Avati <anand.avati at gmail.com>
>> wrote:
>> > Can you post your server logs? What happens if you run 'df -k' on your
>> > backend export filesystems?
>> >
>> > Thanks
>> > Avati
>> >
>> > On Mon, Jan 17, 2011 at 5:27 AM, Joe Warren-Meeks
>> > <joe at encoretickets.co.uk> wrote:
>> >
>> >>
>> >> (sorry about topposting.)
>> >>
>> >> Just changing the timeout would only mask the problem. The real issue is
>> >> that running 'df' on either node causes a hang.
>> >>
>> >> All other operations seem fine, files can be created and deleted as
>> >> normal with the results showing up on both.
>> >>
>> >> I'd like to work out why it's hanging on df so I can fix it and get my
>> >> monitoring and cron scripts running again :)
>> >>
>> >> -- joe.
>> >>
>> >> -----Original Message-----
>> >> From: gluster-users-bounces at gluster.org
>> >> [mailto:gluster-users-bounces at gluster.org] On Behalf Of Daniel Maher
>> >> Sent: 17 January 2011 12:48
>> >> To: gluster-users at gluster.org
>> >> Subject: Re: df causes hang
>> >>
>> >> On 01/17/2011 10:47 AM, Joe Warren-Meeks wrote:
>> >> > Hey chaps,
>> >> >
>> >> > Anyone got any pointers as to what this might be? This is still causing
>> >> > a lot of problems for us whenever we attempt to do df.
>> >> >
>> >> > -- joe.
>> >> >
>> >> > -----Original Message-----
>> >>
>> >> > However, for some reason, they've got into a bit of a state such that
>> >> > typing 'df -k' causes both to hang, resulting in a loss of service
>> >> > for 42 seconds. I see the following messages in the log files:
>> >> >
>> >> >
>> >>
>> >> 42 seconds is the default tcp timeout time for any given node - you
>> >> could try tuning that down and seeing how it works for you.
>> >>
>> >>
>> >> http://www.gluster.com/community/documentation/index.php/Gluster_3.1:_Setting_Volume_Options
>> >>
>> >>
>> >> --
>> >> Daniel Maher <dma+gluster AT witbe DOT net>
>> >> _______________________________________________
>> >> Gluster-users mailing list
>> >> Gluster-users at gluster.org
>> >> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>> >>
>> >
>> >
>>
>>
>>
>> --
>> http://philcryer.com
>
>

--
http://philcryer.com