I had some difficulty getting OFED 1.3 working on kernel 2.6.27 about 6 months back. It took some patching, but I did find that you need to have SRQ (shared receive queue) enabled for it to work. The ibv_srq_pingpong test app was a good test for whether it would work with gluster or not. I also had to upgrade the firmware on the Mellanox cards I have to enable SRQ.

-Mic

Nathan Stratton wrote:
>
> Hate to post again, but anyone have any ideas on this?
>
> -Nathan
>
> On Fri, 18 Sep 2009, Nathan Stratton wrote:
>
>>
>> Has anyone been able to get Infiniband working with 2.6.31 kernel and
>> fuse 2.8.0? My config works fine on my Centos 2.6.18 box, so I know
>> that is ok.
>>
>> Infiniband looks good:
>>
>> [root@xen1 src]# lsmod |grep ib
>> ib_ucm 13752 0
>> ib_uverbs 32256 2 rdma_ucm,ib_ucm
>> ib_ipoib 68880 0
>> ib_mthca 123700 0
>>
>> [root@xen1 src]# ibv_devices
>> device node GUID
>> ------ ----------------
>> mthca0 0005ad00000327e8
>>
>> Gluster looks like it starts OK, but I can't touch the mount and
>> after a while it times out.
Debug logs: >> >> >> [2009-09-18 19:36:17] D [glusterfsd.c:354:_get_specfp] glusterfs: >> loading volume file /usr/local/etc/glusterfs/glusterfs.vol >> ================================================================================ >> >> Version : glusterfs 2.0.6 built on Sep 18 2009 09:54:43 >> TLA Revision : v2.0.6 >> Starting Time: 2009-09-18 19:36:17 >> Command line : glusterfs -L DEBUG -l /var/log/glusterfs.log >> --disable-direct-io-mode /share >> PID : 8303 >> System name : Linux >> Nodename : xen1.hou.blinkmind.com >> Kernel Release : 2.6.31 >> Hardware Identifier: x86_64 >> >> Given volfile: >> +------------------------------------------------------------------------------+ >> >> 1: volume brick0 >> 2: type protocol/client >> 3: option transport-type ib-verbs/client >> 4: option remote-host 172.16.0.200 >> 5: option remote-port 6997 >> 6: option transport.address-family inet/inet6 >> 7: option remote-subvolume brick >> 8: end-volume >> 9: >> 10: volume mirror0 >> 11: type protocol/client >> 12: option transport-type ib-verbs/client >> 13: option remote-host 172.16.0.201 >> 14: option remote-port 6997 >> 15: option transport.address-family inet/inet6 >> 16: option remote-subvolume brick >> 17: end-volume >> 18: >> 19: volume brick1 >> 20: type protocol/client >> 21: option transport-type ib-verbs/client >> 22: option remote-host 172.16.0.202 >> 23: option remote-port 6997 >> 24: option transport.address-family inet/inet6 >> 25: option remote-subvolume brick >> 26: end-volume >> 27: >> 28: volume mirror1 >> 29: type protocol/client >> 30: option transport-type ib-verbs/client >> 31: option remote-host 172.16.0.203 >> 32: option remote-port 6997 >> 33: option transport.address-family inet/inet6 >> 34: option remote-subvolume brick >> 35: end-volume >> 36: >> 37: volume brick2 >> 38: type protocol/client >> 39: option transport-type ib-verbs/client >> 40: option remote-host 172.16.0.204 >> 41: option remote-port 6997 >> 42: option transport.address-family 
inet/inet6 >> 43: option remote-subvolume brick >> 44: end-volume >> 45: >> 46: volume mirror2 >> 47: type protocol/client >> 48: option transport-type ib-verbs/client >> 49: option remote-host 172.16.0.205 >> 50: option remote-port 6997 >> 51: option transport.address-family inet/inet6 >> 52: option remote-subvolume brick >> 53: end-volume >> 54: >> 55: volume block0 >> 56: type cluster/replicate >> 57: subvolumes brick0 mirror0 >> 58: end-volume >> 59: >> 60: volume block1 >> 61: type cluster/replicate >> 62: subvolumes brick1 mirror1 >> 63: end-volume >> 64: >> 65: volume block2 >> 66: type cluster/replicate >> 67: subvolumes brick2 mirror2 >> 68: end-volume >> 69: >> 70: volume unify >> 71: type cluster/distribute >> 72: subvolumes block0 block1 block2 >> 73: end-volume >> 74: >> >> +------------------------------------------------------------------------------+ >> >> [2009-09-18 19:36:17] D [glusterfsd.c:1205:main] glusterfs: running >> in pid 8303 >> [2009-09-18 19:36:17] D [client-protocol.c:5952:init] brick0: >> defaulting frame-timeout to 30mins >> [2009-09-18 19:36:17] D [client-protocol.c:5963:init] brick0: >> defaulting ping-timeout to 10 >> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: >> attempt to load file >> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so >> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] >> brick0: no range check required for 'option remote-port 6997' >> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: >> attempt to load file >> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so >> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] >> brick0: no range check required for 'option remote-port 6997' >> [2009-09-18 19:36:17] D [client-protocol.c:5952:init] mirror0: >> defaulting frame-timeout to 30mins >> [2009-09-18 19:36:17] D [client-protocol.c:5963:init] mirror0: >> defaulting ping-timeout to 10 >> [2009-09-18 19:36:17] D 
[transport.c:141:transport_load] transport: >> attempt to load file >> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so >> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] >> mirror0: no range check required for 'option remote-port 6997' >> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: >> attempt to load file >> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so >> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] >> mirror0: no range check required for 'option remote-port 6997' >> [2009-09-18 19:36:17] D [client-protocol.c:5952:init] brick1: >> defaulting frame-timeout to 30mins >> [2009-09-18 19:36:17] D [client-protocol.c:5963:init] brick1: >> defaulting ping-timeout to 10 >> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: >> attempt to load file >> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so >> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] >> brick1: no range check required for 'option remote-port 6997' >> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: >> attempt to load file >> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so >> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] >> brick1: no range check required for 'option remote-port 6997' >> [2009-09-18 19:36:17] D [client-protocol.c:5952:init] mirror1: >> defaulting frame-timeout to 30mins >> [2009-09-18 19:36:17] D [client-protocol.c:5963:init] mirror1: >> defaulting ping-timeout to 10 >> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: >> attempt to load file >> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so >> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] >> mirror1: no range check required for 'option remote-port 6997' >> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: >> attempt to load file >> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so >> [2009-09-18 19:36:17] D 
[xlator.c:276:_volume_option_value_validate] >> mirror1: no range check required for 'option remote-port 6997' >> [2009-09-18 19:36:17] D [client-protocol.c:5952:init] brick2: >> defaulting frame-timeout to 30mins >> [2009-09-18 19:36:17] D [client-protocol.c:5963:init] brick2: >> defaulting ping-timeout to 10 >> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: >> attempt to load file >> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so >> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] >> brick2: no range check required for 'option remote-port 6997' >> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: >> attempt to load file >> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so >> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] >> brick2: no range check required for 'option remote-port 6997' >> [2009-09-18 19:36:17] D [client-protocol.c:5952:init] mirror2: >> defaulting frame-timeout to 30mins >> [2009-09-18 19:36:17] D [client-protocol.c:5963:init] mirror2: >> defaulting ping-timeout to 10 >> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: >> attempt to load file >> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so >> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] >> mirror2: no range check required for 'option remote-port 6997' >> [2009-09-18 19:36:17] D [transport.c:141:transport_load] transport: >> attempt to load file >> /usr/local/lib/glusterfs/2.0.6/transport/ib-verbs.so >> [2009-09-18 19:36:17] D [xlator.c:276:_volume_option_value_validate] >> mirror2: no range check required for 'option remote-port 6997' >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick0: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick0: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror0: got >> 
GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror0: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick1: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick1: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror1: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror1: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick2: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick2: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror2: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror2: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick0: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick0: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror0: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror0: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick1: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick1: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D 
[client-protocol.c:6280:notify] mirror1: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror1: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick2: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] brick2: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror2: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] D [client-protocol.c:6280:notify] mirror2: got >> GF_EVENT_PARENT_UP, attempting connect on transport >> [2009-09-18 19:36:17] N [glusterfsd.c:1224:main] glusterfs: >> Successfully started >> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] brick0: got >> GF_EVENT_CHILD_UP >> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] brick0: got >> GF_EVENT_CHILD_UP >> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] mirror0: got >> GF_EVENT_CHILD_UP >> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] mirror0: got >> GF_EVENT_CHILD_UP >> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] brick1: got >> GF_EVENT_CHILD_UP >> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] brick1: got >> GF_EVENT_CHILD_UP >> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] mirror1: got >> GF_EVENT_CHILD_UP >> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] mirror1: got >> GF_EVENT_CHILD_UP >> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] brick2: got >> GF_EVENT_CHILD_UP >> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] brick2: got >> GF_EVENT_CHILD_UP >> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] mirror2: got >> GF_EVENT_CHILD_UP >> [2009-09-18 19:36:17] D [client-protocol.c:6294:notify] mirror2: got >> GF_EVENT_CHILD_UP >> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] brick0: 
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. >> frame-timeout = 1800 >> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] >> brick0: setvolume failed (Transport endpoint is not connected) >> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] brick0: >> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. >> frame-timeout = 1800 >> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] >> brick0: setvolume failed (Transport endpoint is not connected) >> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] mirror0: >> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. >> frame-timeout = 1800 >> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] >> mirror0: setvolume failed (Transport endpoint is not connected) >> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] mirror0: >> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. >> frame-timeout = 1800 >> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] >> mirror0: setvolume failed (Transport endpoint is not connected) >> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] brick1: >> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. >> frame-timeout = 1800 >> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] >> brick1: setvolume failed (Transport endpoint is not connected) >> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] brick1: >> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. >> frame-timeout = 1800 >> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] >> brick1: setvolume failed (Transport endpoint is not connected) >> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] mirror1: >> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. 
>> frame-timeout = 1800 >> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] >> mirror1: setvolume failed (Transport endpoint is not connected) >> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] mirror1: >> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. >> frame-timeout = 1800 >> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] >> mirror1: setvolume failed (Transport endpoint is not connected) >> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] brick2: >> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. >> frame-timeout = 1800 >> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] >> brick2: setvolume failed (Transport endpoint is not connected) >> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] brick2: >> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. >> frame-timeout = 1800 >> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] >> brick2: setvolume failed (Transport endpoint is not connected) >> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] mirror2: >> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. >> frame-timeout = 1800 >> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] >> mirror2: setvolume failed (Transport endpoint is not connected) >> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] mirror2: >> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17. 
>> frame-timeout = 1800 >> [2009-09-18 20:06:18] D [client-protocol.c:5491:client_setvolume_cbk] >> mirror2: setvolume failed (Transport endpoint is not connected) >> [2009-09-18 20:06:18] D [dht-common.c:820:dht_lookup] unify: no >> subvolume in layout for path=/, checking on all the subvols to see if >> it is a directory >> [2009-09-18 20:06:18] D [dht-common.c:113:dht_lookup_dir_cbk] unify: >> lookup of / on block0 returned error (Transport endpoint is not >> connected) >> [2009-09-18 20:06:18] D [dht-common.c:113:dht_lookup_dir_cbk] unify: >> lookup of / on block1 returned error (Transport endpoint is not >> connected) >> [2009-09-18 20:06:18] D [dht-common.c:113:dht_lookup_dir_cbk] unify: >> lookup of / on block2 returned error (Transport endpoint is not >> connected) >> [2009-09-18 20:06:18] D [fuse-bridge.c:2385:fuse_root_lookup_cbk] >> fuse: first lookup on root failed. >> [2009-09-18 20:06:18] W [fuse-bridge.c:1841:fuse_statfs_cbk] >> glusterfs-fuse: 2: ERR => -1 (Transport endpoint is not connected) >> >> >> >>> <> >> Nathan Stratton CTO, BlinkMind, Inc. >> nathan at robotics.net nathan at blinkmind.com >> http://www.robotics.net http://www.blinkmind.com >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users >>
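
For reading a quoted debug log like the one above, here is a rough Python sketch that lists which client volumes bailed out and how long each handshake waited. It is only an illustration: the function names (`unquote`, `summarize_bails`) are made up for this example, nothing here ships with glusterfs, and the regular expression assumes the 2.0.x `call_bail` message format exactly as it appears in this thread.

```python
import re
from datetime import datetime

# Pattern for a call_bail entry after the mail quoting has been stripped and
# the wrapped lines joined. Assumes the glusterfs 2.0.x wording seen above.
BAIL_RE = re.compile(
    r"\[(?P<when>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] E "
    r"\[client-protocol\.c:\d+:call_bail\] (?P<vol>[^\s:]+):\s+"
    r"bailing out frame (?P<op>\w+)\(\d+\) "
    r"frame sent = (?P<sent>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\s+"
    r"frame-timeout = (?P<timeout>\d+)"
)

TIME_FMT = "%Y-%m-%d %H:%M:%S"


def unquote(text):
    """Drop tokens made only of '>' (mail quote markers) and join the rest."""
    tokens = [t for t in text.split() if set(t) != {">"}]
    return " ".join(tokens)


def summarize_bails(raw_log):
    """Map each volume to (operation, seconds waited, configured timeout)."""
    result = {}
    for m in BAIL_RE.finditer(unquote(raw_log)):
        waited = (datetime.strptime(m["when"], TIME_FMT)
                  - datetime.strptime(m["sent"], TIME_FMT))
        result[m["vol"]] = (m["op"], int(waited.total_seconds()),
                            int(m["timeout"]))
    return result


# One entry copied from the quoted log, still carrying its '>>' markers.
sample = """
>> [2009-09-18 20:06:18] E [client-protocol.c:289:call_bail] brick0:
>> bailing out frame SETVOLUME(0) frame sent = 2009-09-18 19:36:17.
>> frame-timeout = 1800
"""

for vol, (op, waited, timeout) in summarize_bails(sample).items():
    print(f"{vol}: {op} gave up after {waited}s (frame-timeout {timeout}s)")
```

On the entry above this reports a wait of 1801 seconds against a frame-timeout of 1800, so the SETVOLUME handshake appears to have gone unanswered until the timer fired rather than being refused outright. That would fit a transport that loads but never passes traffic, which is what I saw before SRQ was enabled.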