>>> However, if I do a gluster volume info I see that it's listed:
>>> # gluster volume info | grep 98
>>> Brick98: clustr-02:/mnt/data17

But now I'm thinking this is wrong, because while it says clustr-02,
the error stops occurring when I stop clustr-03. So how do I really
know not only which host it's on, but which brick each mount is on
(/mnt/data* in my case)? In other words, does

bhl-volume-client-98 != Brick98: clustr-02:/mnt/data17

? And if they don't match, how can I tell which brick
bhl-volume-client-98 actually is?

P
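(A note on the numbering, which may explain the clustr-02/clustr-03
confusion: the client translators in the generated volfile are normally
numbered from zero, while gluster volume info lists bricks starting at
Brick1 - so bhl-volume-client-98 should correspond to the 99th brick,
Brick99, not Brick98. A sketch of how to confirm the mapping directly,
assuming the volume is named bhl-volume and the 3.1-era layout where
glusterd keeps its generated volfiles under /etc/glusterd/vols/ - later
releases moved this to /var/lib/glusterd:

# gluster volume info | grep 'Brick99'
# grep -A 3 'volume bhl-volume-client-98' \
    /etc/glusterd/vols/bhl-volume/bhl-volume-fuse.vol
volume bhl-volume-client-98
    type protocol/client
    option remote-host <host>            # the server that really backs this subvolume
    option remote-subvolume <brick dir>  # e.g. one of the /mnt/data* paths

The <host> and <brick dir> values above are placeholders; the grep
output on a real system shows the actual host and brick directory.)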
On Fri, Feb 4, 2011 at 1:49 PM, phil cryer <phil at cryer.us> wrote:
> On Fri, Feb 4, 2011 at 12:33 PM, Anand Avati <anand.avati at gmail.com> wrote:
>> It is very likely the brick process is failing to start. Please look at the
>> brick log on that server (in /var/log/glusterfs/bricks/*).
>> Avati
>
> Thanks. So if I'm reading it right, 'bhl-volume-client-98' is
> really Brick98: clustr-02:/mnt/data17 - I'm inferring that from this:
>
>>> [2011-02-04 13:09:28.407300] I [client.c:1590:client_rpc_notify]
>>> bhl-volume-client-98: disconnected
>>>
>>> However, if I do a gluster volume info I see that it's listed:
>>> # gluster volume info | grep 98
>>> Brick98: clustr-02:/mnt/data17
>
> But on that server I don't see any issues with that brick starting:
>
> # head mnt-data17.log -n50
> [2011-02-03 23:29:24.235648] W [graph.c:274:gf_add_cmdline_options]
> bhl-volume-server: adding option 'listen-port' for volume
> 'bhl-volume-server' with value '24025'
> [2011-02-03 23:29:24.236017] W
> [rpc-transport.c:566:validate_volume_options] tcp.bhl-volume-server:
> option 'listen-port' is deprecated, preferred is
> 'transport.socket.listen-port', continuing with correction
> Given volfile:
> +------------------------------------------------------------------------------+
>   1: volume bhl-volume-posix
>   2:     type storage/posix
>   3:     option directory /mnt/data17
>   4: end-volume
>   5:
>   6: volume bhl-volume-access-control
>   7:     type features/access-control
>   8:     subvolumes bhl-volume-posix
>   9: end-volume
>  10:
>  11: volume bhl-volume-locks
>  12:     type features/locks
>  13:     subvolumes bhl-volume-access-control
>  14: end-volume
>  15:
>  16: volume bhl-volume-io-threads
>  17:     type performance/io-threads
>  18:     subvolumes bhl-volume-locks
>  19: end-volume
>  20:
>  21: volume /mnt/data17
>  22:     type debug/io-stats
>  23:     subvolumes bhl-volume-io-threads
>  24: end-volume
>  25:
>  26: volume bhl-volume-server
>  27:     type protocol/server
>  28:     option transport-type tcp
>  29:     option auth.addr./mnt/data17.allow *
>  30:     subvolumes /mnt/data17
>  31: end-volume
> +------------------------------------------------------------------------------+
> [2011-02-03 23:29:28.575630] I
> [server-handshake.c:535:server_setvolume] bhl-volume-server: accepted
> client from 128.128.164.219:724
> [2011-02-03 23:29:28.583169] I
> [server-handshake.c:535:server_setvolume] bhl-volume-server: accepted
> client from 127.0.1.1:985
> [2011-02-03 23:29:28.603357] I
> [server-handshake.c:535:server_setvolume] bhl-volume-server: accepted
> client from 128.128.164.218:726
> [2011-02-03 23:29:28.605650] I
> [server-handshake.c:535:server_setvolume] bhl-volume-server: accepted
> client from 128.128.164.217:725
> [2011-02-03 23:29:28.608033] I
> [server-handshake.c:535:server_setvolume] bhl-volume-server: accepted
> client from 128.128.164.215:725
> [2011-02-03 23:29:31.161985] I
> [server-handshake.c:535:server_setvolume] bhl-volume-server: accepted
> client from 128.128.164.74:697
> [2011-02-04 00:40:11.600314] I
> [server-handshake.c:535:server_setvolume] bhl-volume-server: accepted
> client from 128.128.164.74:805
>
> Plus, looking at the tail of this log, it's still working; these latest
> messages (from 4 seconds ago) came in as I was moving some things on the
> cluster:
>
> [2011-02-04 23:13:35.53685] W [server-resolve.c:565:server_resolve]
> bhl-volume-server: pure path resolution for
> /www/d/dasobstdertropen00schrrich (INODELK)
> [2011-02-04 23:13:35.57107] W [server-resolve.c:565:server_resolve]
> bhl-volume-server: pure path resolution for
> /www/d/dasobstdertropen00schrrich (SETXATTR)
> [2011-02-04 23:13:35.59699] W [server-resolve.c:565:server_resolve]
> bhl-volume-server: pure path resolution for
> /www/d/dasobstdertropen00schrrich (INODELK)
>
> Thanks!
>
> P
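(Following up on Avati's suggestion, here is a quick way to double-check
on clustr-02 that the brick process really is up and listening - a
sketch only; port 24025 is taken from the 'listen-port' line in the log
excerpt above, and the grep patterns are illustrative:

# ps ax | grep '[g]lusterfsd' | grep data17
# netstat -ltnp | grep 24025

If a glusterfsd process for /mnt/data17 shows up and is listening on its
advertised port, then the failing subvolume is more likely whichever
brick actually backs client-98 rather than this one.)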
>> On Fri, Feb 4, 2011 at 10:19 AM, phil cryer <phil at cryer.us> wrote:
>>>
>>> I have glusterfs 3.1.2 running on Debian. I'm able to start the volume,
>>> and I can now mount it via mount -t glusterfs and see everything. But I am
>>> still seeing the following error in /var/log/glusterfs/nfs.log:
>>>
>>> [2011-02-04 13:09:16.404851] E
>>> [client-handshake.c:1079:client_query_portmap_cbk]
>>> bhl-volume-client-98: failed to get the port number for remote
>>> subvolume
>>> [2011-02-04 13:09:16.404909] I [client.c:1590:client_rpc_notify]
>>> bhl-volume-client-98: disconnected
>>> [2011-02-04 13:09:20.405843] E
>>> [client-handshake.c:1079:client_query_portmap_cbk]
>>> bhl-volume-client-98: failed to get the port number for remote
>>> subvolume
>>> [2011-02-04 13:09:20.405938] I [client.c:1590:client_rpc_notify]
>>> bhl-volume-client-98: disconnected
>>> [2011-02-04 13:09:24.406634] E
>>> [client-handshake.c:1079:client_query_portmap_cbk]
>>> bhl-volume-client-98: failed to get the port number for remote
>>> subvolume
>>> [2011-02-04 13:09:24.406711] I [client.c:1590:client_rpc_notify]
>>> bhl-volume-client-98: disconnected
>>> [2011-02-04 13:09:28.407249] E
>>> [client-handshake.c:1079:client_query_portmap_cbk]
>>> bhl-volume-client-98: failed to get the port number for remote
>>> subvolume
>>> [2011-02-04 13:09:28.407300] I [client.c:1590:client_rpc_notify]
>>> bhl-volume-client-98: disconnected
>>>
>>> However, if I do a gluster volume info I see that it's listed:
>>> # gluster volume info | grep 98
>>> Brick98: clustr-02:/mnt/data17
>>>
>>> I've gone to that host, unmounted the specific drive, and run fsck.ext4
>>> on it, and it came back clean. Remounting and then restarting gluster on
>>> all the nodes hasn't changed anything; I keep getting that error.
>>> Also, I don't understand why it can't get the port number when it's
>>> working fine on the 23 other bricks (drives) on that server, which
>>> leads me to believe it's not an accurate error.
>>>
>>> I searched the mailing lists and bug tracker, and only found this
>>> similar bug:
>>> http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1640
>>>
>>> Any idea what's going on? Is this just a benign error, since the
>>> cluster still seems to be working, or ?
>>>
>>> Thanks
>>>
>>> P
>>> --
>>> http://philcryer.com
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>
> --
> http://philcryer.com

--
http://philcryer.com
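(One last sanity check worth running from the machine that logs the
portmap errors: "failed to get the port number for remote subvolume"
means the client asked glusterd on the brick's host to map that
subvolume to a port and got no answer, so basic TCP reachability
matters. A sketch, with the host and brick port as assumptions -
glusterd's management port defaults to 24007, 24025 is just the brick
port from the clustr-02 log above, and the brick that really backs
client-98 will have its own port and possibly its own host:

# nc -zv clustr-02 24007
# nc -zv clustr-02 24025

If 24007 answers but the portmap query still fails, glusterd most
likely has no port registered for that brick - i.e. the brick process
never came up, which is consistent with Avati's suggestion to check the
brick logs.)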