yum install libibverbs tells me

[root@storage0 ~]# yum install libibverbs
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * addons: mirror.umoss.org
 * base: mirror.vcu.edu
 * extras: mirror.atlanticmetro.net
 * updates: holmes.umflint.edu
Setting up Install Process
Package libibverbs-1.1.3-2.el5.x86_64 already installed and latest version
Package libibverbs-1.1.3-2.el5.i386 already installed and latest version
Nothing to do

Perhaps it is installed but turned off somehow?

.. Lana (lana.deere at gmail.com)


On Tue, Oct 19, 2010 at 7:15 PM, Craig Carl <craig at gluster.com> wrote:
> Lana -
>    Looks like you have the IPoIB stack installed, but not support for
> ibverbs. Let's try this -
>
> # yum install libibverbs
> # service glusterd restart
>
> Thanks,
>
> Craig
>
> --
> Craig Carl
> Senior Systems Engineer; Gluster, Inc.
> Cell - (408) 829-9953 (California, USA)
> Office - (408) 770-1884
> Gtalk - craig.carl at gmail.com
> Twitter - @gluster
> Installing Gluster Storage Platform, the movie!
> http://rackerhacker.com/2010/08/11/one-month-with-glusterfs-in-production/
>
>
> ________________________________
> From: "Lana Deere" <lana.deere at gmail.com>
> To: "Craig Carl" <craig at gluster.com>
> Cc: gluster-users at gluster.org, landman at scalableinformatics.com
> Sent: Tuesday, October 19, 2010 4:02:11 PM
> Subject: Re: hanging "df" (3.1, infiniband)
>
> They show up in ibhosts and I can ping or ssh via IPoIB to them, but
> perhaps they are not completely configured properly.  Or perhaps I
> have mixed some references to the regular Ethernet into the
> configuration for rdma?  Anyway, here are the outputs you requested:
>
> [root@storage0 ~]# lsmod
> Module                  Size  Used by
> iptable_filter         36161  0
> ip_tables              55201  1 iptable_filter
> x_tables               50505  1 ip_tables
> fuse                   83057  1
> autofs4                63049  3
> hidp                   83521  2
> rfcomm                104937  0
> l2cap                  89409  10 hidp,rfcomm
> bluetooth             118853  5 hidp,rfcomm,l2cap
> lockd                 101553  0
> sunrpc                199945  2 lockd
> cpufreq_ondemand       42449  8
> acpi_cpufreq           47937  0
> freq_table             38977  2 cpufreq_ondemand,acpi_cpufreq
> ib_iser                69569  0
> libiscsi2              77765  1 ib_iser
> scsi_transport_iscsi2  74073  2 ib_iser,libiscsi2
> scsi_transport_iscsi   35017  1 scsi_transport_iscsi2
> ib_srp                 67465  0
> rds                   401393  0
> ib_sdp                144285  0
> ib_ipoib              113057  0
> ipoib_helper           35537  2 ib_ipoib
> ipv6                  435489  77 ib_ipoib
> xfrm_nalgo             43333  1 ipv6
> crypto_api             42945  1 xfrm_nalgo
> rdma_ucm               47681  0
> rdma_cm                68437  4 ib_iser,rds,ib_sdp,rdma_ucm
> ib_ucm                 50121  0
> ib_uverbs              68720  2 rdma_ucm,ib_ucm
> ib_umad                50153  0
> ib_cm                  72809  4 ib_srp,ib_ipoib,rdma_cm,ib_ucm
> iw_cm                  43465  1 rdma_cm
> ib_addr                41929  1 rdma_cm
> ib_sa                  74953  4 ib_srp,ib_ipoib,rdma_cm,ib_cm
> mlx4_ib                94461  0
> ib_mad                 70629  4 ib_umad,ib_cm,ib_sa,mlx4_ib
> ib_core               104901  15 ib_iser,ib_srp,rds,ib_sdp,ib_ipoib,rdma_ucm,rdma_cm,ib_ucm,ib_uverbs,ib_umad,ib_cm,iw_cm,ib_sa,mlx4_ib,ib_mad
> xfs                   508625  1
> loop                   48721  0
> dm_mirror              54737  0
> dm_multipath           56921  0
> scsi_dh                42177  1 dm_multipath
> raid456               152417  1
> xor                    38865  1 raid456
> video                  53197  0
> backlight              39873  1 video
> sbs                    49921  0
> power_meter            47053  0
> hwmon                  36553  1 power_meter
> i2c_ec                 38593  1 sbs
> dell_wmi               37601  0
> wmi                    41985  1 dell_wmi
> button                 40545  0
> battery                43849  0
> asus_acpi              50917  0
> acpi_memhotplug        40517  0
> ac                     38729  0
> parport_pc             62313  0
> lp                     47121  0
> parport                73165  2 parport_pc,lp
> mlx4_en               107985  0
> joydev                 43969  0
> i2c_i801               41813  0
> igb                   122709  0
> i2c_core               56641  2 i2c_ec,i2c_i801
> 8021q                  57425  1 igb
> shpchp                 70893  0
> mlx4_core             152773  2 mlx4_ib,mlx4_en
> serio_raw              40517  0
> dca                    41221  1 igb
> sg                     70377  0
> pcspkr                 36289  0
> dm_raid45              99657  0
> dm_message             36289  1 dm_raid45
> dm_region_hash         46145  1 dm_raid45
> dm_log                 44993  3 dm_mirror,dm_raid45,dm_region_hash
> dm_mod                101649  4 dm_mirror,dm_multipath,dm_raid45,dm_log
> dm_mem_cache           38977  1 dm_raid45
> mpt2sas               159337  12
> scsi_transport_sas     66753  1 mpt2sas
> ahci                   69705  6
> libata                209489  1 ahci
> sd_mod                 56513  32
> scsi_mod              196953  10 ib_iser,libiscsi2,scsi_transport_iscsi2,ib_srp,scsi_dh,sg,mpt2sas,scsi_transport_sas,libata,sd_mod
> raid1                  56001  3
> ext3                  168913  2
> jbd                    94769  1 ext3
> uhci_hcd               57433  0
> ohci_hcd               56309  0
> ehci_hcd               66125  0
> [root@storage0 ~]# ibv_devinfo
> libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0
> No IB devices found
> [root@storage0 ~]# lspci
> 00:00.0 Host bridge: Intel Corporation 5520 I/O Hub to ESI Port (rev 22)
> 00:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1 (rev 22)
> 00:03.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 3 (rev 22)
> 00:05.0 PCI bridge: Intel Corporation 5520/X58 I/O Hub PCI Express Root Port 5 (rev 22)
> 00:07.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7 (rev 22)
> 00:09.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 9 (rev 22)
> 00:13.0 PIC: Intel Corporation 5520/5500/X58 I/O Hub I/OxAPIC Interrupt Controller (rev 22)
> 00:14.0 PIC: Intel Corporation 5520/5500/X58 I/O Hub System Management Registers (rev 22)
> 00:14.1 PIC: Intel Corporation 5520/5500/X58 I/O Hub GPIO and Scratch Pad Registers (rev 22)
> 00:14.2 PIC: Intel Corporation 5520/5500/X58 I/O Hub Control Status and RAS Registers (rev 22)
> 00:14.3 PIC: Intel Corporation 5520/5500/X58 I/O Hub Throttle Registers (rev 22)
> 00:16.0 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
> 00:16.1 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
> 00:16.2 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
> 00:16.3 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
> 00:16.4 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
> 00:16.5 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
> 00:16.6 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
> 00:16.7 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
> 00:1a.0 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #4
> 00:1a.1 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #5
> 00:1a.2 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #6
> 00:1a.7 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #2
> 00:1d.0 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #1
> 00:1d.1 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #2
> 00:1d.2 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #3
> 00:1d.7 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #1
> 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90)
> 00:1f.0 ISA bridge: Intel Corporation 82801JIR (ICH10R) LPC Interface Controller
> 00:1f.2 SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI Controller
> 00:1f.3 SMBus: Intel Corporation 82801JI (ICH10 Family) SMBus Controller
> 01:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
> 01:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
> 02:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 02)
> 05:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
> 06:01.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200eW WPCM450 (rev 0a)
> [root@storage0 ~]# /etc/init.d/openibd status
> Low level hardware support loaded:
>         mlx4_ib
>
> Upper layer protocol modules:
>         ib_iser ib_srp rds ib_sdp ib_ipoib
>
> User space access modules:
>         rdma_ucm ib_ucm ib_uverbs ib_umad
>
> Connection management modules:
>         rdma_cm ib_cm iw_cm
>
> Configured IPoIB interfaces: ib0
> Currently active IPoIB interfaces: ib0
> [root@storage0 ~]#
>
>
> .. Lana (lana.deere at gmail.com)
>
>
> On Tue, Oct 19, 2010 at 6:48 PM, Craig Carl <craig at gluster.com> wrote:
>> Lana -
>>  The first couple of lines of the log identify our problem -
>>
>> [2010-10-19 07:47:49.315416] C [rdma.c:3817:rdma_init] rpc-transport/rdma: No IB devices found
>> [2010-10-19 07:47:49.315438] E [rdma.c:4744:init] rdma.management: Failed to initialize IB Device
>> [2010-10-19 07:47:49.315452] E [rpc-transport.c:965:rpc_transport_load] rpc-transport: 'rdma' initialization failed
>>
>> Are you sure your IB cards are working? Can you send the output of -
>>
>> # lsmod
>> # ibv_devinfo
>> # lspci
>> # /etc/init.d/openibd status
>>
>>
>> Thanks,
>>
>> Craig
>>
>> --
>> Craig Carl
>> Senior Systems Engineer; Gluster, Inc.
>> Cell - (408) 829-9953 (California, USA)
>> Office - (408) 770-1884
>> Gtalk - craig.carl at gmail.com
>> Twitter - @gluster
>> Installing Gluster Storage Platform, the movie!
>> http://rackerhacker.com/2010/08/11/one-month-with-glusterfs-in-production/
>>
>>
>> ________________________________
>> From: "Lana Deere" <lana.deere at gmail.com>
>> To: "Craig Carl" <craig at gluster.com>
>> Cc: gluster-users at gluster.org, landman at scalableinformatics.com
>> Sent: Tuesday, October 19, 2010 3:29:41 PM
>> Subject: Re: hanging "df" (3.1, infiniband)
>>
>> For the last little while I've been using storage0 as both client and
>> server, so those files are both client and server files at the same
>> time.  If it would be helpful, I could go back to using a different
>> host as client (but then 'df' will hang instead of reporting the
>> Transport message).
>>
>> [root@storage0 ~]# cat /etc/glusterd/.cmd_log_history
>> [2010-10-19 07:54:36.244333] peer probe :  on host storage1:24007
>> [2010-10-19 07:54:36.249891] peer probe : on host storage1:24007 FAILED
>> [2010-10-19 07:54:43.745558] peer probe :  on host storage2:24007
>> [2010-10-19 07:54:43.750752] peer probe : on host storage2:24007 FAILED
>> [2010-10-19 07:54:48.915378] peer probe :  on host storage3:24007
>> [2010-10-19 07:54:48.920595] peer probe : on host storage3:24007 FAILED
>> [2010-10-19 07:59:49.737251] Volume create : on volname: RaidData attempted
>> [2010-10-19 07:59:49.737314] Volume create : on volname: RaidData type:DEFAULT count:4 bricks: storage0:/data storage1:/data storage2:/data storage3:/data
>> [2010-10-19 07:59:49.737631] Volume create : on volname: RaidData SUCCESS
>> [2010-10-19 08:01:36.909963] volume start : on volname: RaidData SUCCESS
>>
>> The /var/log file was pretty big, so I put it on pastebin:
>>     http://pastebin.com/m6WbHPUp
>>
>>
>> .. Lana (lana.deere at gmail.com)
>>
>>
>> On Tue, Oct 19, 2010 at 6:10 PM, Craig Carl <craig at gluster.com> wrote:
>>> Lana -
>>>    Can you also post the contents of
>>>
>>> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
>>> and
>>> /etc/glusterd/.cmd_log_history
>>>
>>> on both the client and server to the list?
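
Side note for anyone reading this thread later: the three rdma lines Craig quoted from the glusterd log can be picked out of a large log mechanically. The following is only a rough Python sketch, not anything shipped with Gluster. It assumes the line layout seen in this thread ([timestamp] LEVEL [file:line:function] component: message) and that wrapped lines have been rejoined first. The names LOG_LINE and find_rdma_failures are made up for illustration.

```python
import re

# Assumed layout, taken from the log lines quoted in this thread:
# [timestamp] LEVEL [file.c:line:function] component: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<src>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]\s+"
    r"(?P<component>[^:]+):\s*(?P<msg>.*)$"
)


def find_rdma_failures(lines):
    """Return parsed C (critical) or E (error) entries that mention rdma."""
    hits = []
    for raw in lines:
        match = LOG_LINE.match(raw.strip())
        if match is None:
            continue  # not in the assumed layout; skip it
        if match.group("level") in ("C", "E") and "rdma" in raw.lower():
            hits.append(match.groupdict())
    return hits


# The three lines quoted earlier in this thread, with the mail wrapping undone.
SAMPLE = [
    "[2010-10-19 07:47:49.315416] C [rdma.c:3817:rdma_init] "
    "rpc-transport/rdma: No IB devices found",
    "[2010-10-19 07:47:49.315438] E [rdma.c:4744:init] "
    "rdma.management: Failed to initialize IB Device",
    "[2010-10-19 07:47:49.315452] E [rpc-transport.c:965:rpc_transport_load] "
    "rpc-transport: 'rdma' initialization failed",
]

if __name__ == "__main__":
    for hit in find_rdma_failures(SAMPLE):
        print(hit["ts"], hit["level"], hit["component"], "-", hit["msg"])
```

To run it against a real file, pass an open file object instead of SAMPLE, for example find_rdma_failures(open(path)) with path pointing at the glusterd log named above. Lines that do not fit the assumed layout are skipped silently.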