On Fri, Feb 4, 2011 at 12:33 PM, Anand Avati <anand.avati at gmail.com> wrote:
> It is very likely the brick process is failing to start. Please look at the
> brick log on that server. (in /var/log/glusterfs/bricks/* )
> Avati

Thanks, so if I'm looking at it right, the 'bhl-volume-client-98' is
really Brick98: clustr-02:/mnt/data17 - I'm figuring that from this:

>> [2011-02-04 13:09:28.407300] I [client.c:1590:client_rpc_notify] bhl-volume-client-98: disconnected
>>
>> However, if I do a gluster volume info I see that it's listed:
>> # gluster volume info | grep 98
>> Brick98: clustr-02:/mnt/data17

But on that server I don't see any issues with that brick starting:

# head mnt-data17.log -n50
[2011-02-03 23:29:24.235648] W [graph.c:274:gf_add_cmdline_options] bhl-volume-server: adding option 'listen-port' for volume 'bhl-volume-server' with value '24025'
[2011-02-03 23:29:24.236017] W [rpc-transport.c:566:validate_volume_options] tcp.bhl-volume-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction
Given volfile:
+------------------------------------------------------------------------------+
  1: volume bhl-volume-posix
  2:     type storage/posix
  3:     option directory /mnt/data17
  4: end-volume
  5:
  6: volume bhl-volume-access-control
  7:     type features/access-control
  8:     subvolumes bhl-volume-posix
  9: end-volume
 10:
 11: volume bhl-volume-locks
 12:     type features/locks
 13:     subvolumes bhl-volume-access-control
 14: end-volume
 15:
 16: volume bhl-volume-io-threads
 17:     type performance/io-threads
 18:     subvolumes bhl-volume-locks
 19: end-volume
 20:
 21: volume /mnt/data17
 22:     type debug/io-stats
 23:     subvolumes bhl-volume-io-threads
 24: end-volume
 25:
 26: volume bhl-volume-server
 27:     type protocol/server
 28:     option transport-type tcp
 29:     option auth.addr./mnt/data17.allow *
 30:     subvolumes /mnt/data17
 31: end-volume
+------------------------------------------------------------------------------+
[2011-02-03 23:29:28.575630] I
[server-handshake.c:535:server_setvolume] bhl-volume-server: accepted client from 128.128.164.219:724
[2011-02-03 23:29:28.583169] I [server-handshake.c:535:server_setvolume] bhl-volume-server: accepted client from 127.0.1.1:985
[2011-02-03 23:29:28.603357] I [server-handshake.c:535:server_setvolume] bhl-volume-server: accepted client from 128.128.164.218:726
[2011-02-03 23:29:28.605650] I [server-handshake.c:535:server_setvolume] bhl-volume-server: accepted client from 128.128.164.217:725
[2011-02-03 23:29:28.608033] I [server-handshake.c:535:server_setvolume] bhl-volume-server: accepted client from 128.128.164.215:725
[2011-02-03 23:29:31.161985] I [server-handshake.c:535:server_setvolume] bhl-volume-server: accepted client from 128.128.164.74:697
[2011-02-04 00:40:11.600314] I [server-handshake.c:535:server_setvolume] bhl-volume-server: accepted client from 128.128.164.74:805

Plus, looking at the tail of this log, it's still working, latest
messages (from 4 seconds before) as I'm moving some things on the
cluster

[2011-02-04 23:13:35.53685] W [server-resolve.c:565:server_resolve] bhl-volume-server: pure path resolution for /www/d/dasobstdertropen00schrrich (INODELK)
[2011-02-04 23:13:35.57107] W [server-resolve.c:565:server_resolve] bhl-volume-server: pure path resolution for /www/d/dasobstdertropen00schrrich (SETXATTR)
[2011-02-04 23:13:35.59699] W [server-resolve.c:565:server_resolve] bhl-volume-server: pure path resolution for /www/d/dasobstdertropen00schrrich (INODELK)

Thanks!

P

>
> On Fri, Feb 4, 2011 at 10:19 AM, phil cryer <phil at cryer.us> wrote:
>>
>> I have glusterfs 3.1.2 running on Debian, I'm able to start the volume
>> and now mount it via mount -t gluster and I can see everything.
>> I am still seeing the following error in /var/log/glusterfs/nfs.log
>>
>> [2011-02-04 13:09:16.404851] E [client-handshake.c:1079:client_query_portmap_cbk] bhl-volume-client-98: failed to get the port number for remote subvolume
>> [2011-02-04 13:09:16.404909] I [client.c:1590:client_rpc_notify] bhl-volume-client-98: disconnected
>> [2011-02-04 13:09:20.405843] E [client-handshake.c:1079:client_query_portmap_cbk] bhl-volume-client-98: failed to get the port number for remote subvolume
>> [2011-02-04 13:09:20.405938] I [client.c:1590:client_rpc_notify] bhl-volume-client-98: disconnected
>> [2011-02-04 13:09:24.406634] E [client-handshake.c:1079:client_query_portmap_cbk] bhl-volume-client-98: failed to get the port number for remote subvolume
>> [2011-02-04 13:09:24.406711] I [client.c:1590:client_rpc_notify] bhl-volume-client-98: disconnected
>> [2011-02-04 13:09:28.407249] E [client-handshake.c:1079:client_query_portmap_cbk] bhl-volume-client-98: failed to get the port number for remote subvolume
>> [2011-02-04 13:09:28.407300] I [client.c:1590:client_rpc_notify] bhl-volume-client-98: disconnected
>>
>> However, if I do a gluster volume info I see that it's listed:
>> # gluster volume info | grep 98
>> Brick98: clustr-02:/mnt/data17
>>
>> I've gone to that host, unmounted the specific drive, ran fsck.ext4 on
>> it, and it came back clean. Remounting and then restarting gluster on
>> all the nodes hasn't changed anything, I keep getting that error.
>> Also, I don't understand why it can't get the port number since it's
>> working fine on 23 other bricks (drives) on that server; leads me to
>> believe that it's not an accurate error.
>>
>> I searched the mailing lists and bug-tracker, and only found this similar
>> bug:
>> http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1640
>>
>> Any idea what's going on? Is this just a benign error since the
>> cluster still seems to be working, or ?
>>
>> Thanks
>>
>> P
>> --
>> http://philcryer.com

--
http://philcryer.com
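
A possible check on the client-to-brick mapping used above. GlusterFS appears to number client subvolumes from 0 (bhl-volume-client-0, bhl-volume-client-1, ...), while gluster volume info numbers its Brick entries from 1. If that holds for this volume, bhl-volume-client-98 would be Brick99 rather than Brick98, and the brick log to inspect would be a different one. The sketch below is only an illustration of that lookup: brick_for_client is a hypothetical helper, not a gluster command, and the default offset of 1 is the assumption being tested (pass 0 to get the mapping assumed in the message above).

```shell
# Hypothetical helper, not part of the gluster CLI.
# Reads `gluster volume info` output on stdin and prints the "BrickN:" line
# that corresponds to a client subvolume name such as bhl-volume-client-98.
# ASSUMPTION: client subvolumes count from 0 and BrickN entries count from 1,
# hence the default offset of 1. Pass 0 as the second argument to apply the
# other interpretation (client-98 == Brick98).
brick_for_client() {
    client_name=$1
    offset=${2:-1}
    idx=${client_name##*-client-}      # keep only what follows "-client-"
    case $idx in
        ''|*[!0-9]*)                   # reject anything that is not a plain number
            echo "not a client subvolume name: $client_name" >&2
            return 1
            ;;
    esac
    grep "^Brick$((idx + offset)):"    # e.g. ^Brick99: for client-98
}

# Usage, run on any node of the cluster:
#   gluster volume info | brick_for_client bhl-volume-client-98
#   gluster volume info | brick_for_client bhl-volume-client-98 0
```

Anchoring the pattern on "^BrickN:" avoids the false matches that a bare grep 98 can produce, for example on an IP address or a port number that happens to contain 98.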