hanging "df" (3.1, infiniband)

Lana - 
Looks like you have the IPoIB stack installed, but not userspace support for ibverbs. Let's try this: 

# yum install libibverbs 
# service glusterd restart 
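
A minimal sketch for checking the result after the restart. It only classifies the text that ibv_devinfo prints; the function name classify_ibv_devinfo and its labels are hypothetical, and the matched strings are the ones from the ibv_devinfo output quoted in this thread. If the "no userspace device-specific driver" warning persists, the device-specific userspace library may be what is missing (for Mellanox ConnectX cards that is commonly packaged as libmlx4, which is an assumption not confirmed in this thread).

```shell
#!/bin/sh
# Classify ibv_devinfo output read from stdin.
# Function name and result labels are illustrative, not part of any tool.
classify_ibv_devinfo() {
    out=$(cat)
    case "$out" in
        *"no userspace device-specific driver found"*)
            # Kernel side (ib_uverbs) is present, but libibverbs has no
            # matching userspace provider library for the device.
            echo "missing-userspace-driver" ;;
        *"No IB devices found"*)
            echo "no-devices" ;;
        *hca_id:*)
            # ibv_devinfo lists each usable device under an hca_id: line.
            echo "ok" ;;
        *)
            echo "unknown" ;;
    esac
}

# Usage on a storage node (ibv_devinfo must be on PATH; warnings go to
# stderr, so merge the streams):
#   ibv_devinfo 2>&1 | classify_ibv_devinfo
```

Anything other than "ok" means glusterd will most likely keep logging the rdma initialization failure.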



Thanks, 

Craig 

-- 
Craig Carl 
Senior Systems Engineer; Gluster, Inc. 
Cell - (408) 829-9953 (California, USA) 
Office - (408) 770-1884 
Gtalk - craig.carl at gmail.com 
Twitter - @gluster 
Installing Gluster Storage Platform, the movie! 
http://rackerhacker.com/2010/08/11/one-month-with-glusterfs-in-production/ 



From: "Lana Deere" <lana.deere at gmail.com> 
To: "Craig Carl" <craig at gluster.com> 
Cc: gluster-users at gluster.org, landman at scalableinformatics.com 
Sent: Tuesday, October 19, 2010 4:02:11 PM 
Subject: Re: hanging "df" (3.1, infiniband) 

They show up in ibhosts and I can ping or ssh via IPoIB to them, but 
perhaps they are not completely configured properly. Or perhaps I 
have mixed some references to the regular Ethernet into the 
configuration for rdma? Anyway, here are the outputs you requested: 

[root at storage0 ~]# lsmod 
Module Size Used by 
iptable_filter 36161 0 
ip_tables 55201 1 iptable_filter 
x_tables 50505 1 ip_tables 
fuse 83057 1 
autofs4 63049 3 
hidp 83521 2 
rfcomm 104937 0 
l2cap 89409 10 hidp,rfcomm 
bluetooth 118853 5 hidp,rfcomm,l2cap 
lockd 101553 0 
sunrpc 199945 2 lockd 
cpufreq_ondemand 42449 8 
acpi_cpufreq 47937 0 
freq_table 38977 2 cpufreq_ondemand,acpi_cpufreq 
ib_iser 69569 0 
libiscsi2 77765 1 ib_iser 
scsi_transport_iscsi2 74073 2 ib_iser,libiscsi2 
scsi_transport_iscsi 35017 1 scsi_transport_iscsi2 
ib_srp 67465 0 
rds 401393 0 
ib_sdp 144285 0 
ib_ipoib 113057 0 
ipoib_helper 35537 2 ib_ipoib 
ipv6 435489 77 ib_ipoib 
xfrm_nalgo 43333 1 ipv6 
crypto_api 42945 1 xfrm_nalgo 
rdma_ucm 47681 0 
rdma_cm 68437 4 ib_iser,rds,ib_sdp,rdma_ucm 
ib_ucm 50121 0 
ib_uverbs 68720 2 rdma_ucm,ib_ucm 
ib_umad 50153 0 
ib_cm 72809 4 ib_srp,ib_ipoib,rdma_cm,ib_ucm 
iw_cm 43465 1 rdma_cm 
ib_addr 41929 1 rdma_cm 
ib_sa 74953 4 ib_srp,ib_ipoib,rdma_cm,ib_cm 
mlx4_ib 94461 0 
ib_mad 70629 4 ib_umad,ib_cm,ib_sa,mlx4_ib 
ib_core 104901 15 ib_iser,ib_srp,rds,ib_sdp,ib_ipoib,rdma_ucm,rdma_cm,ib_ucm,ib_uverbs,ib_umad,ib_cm,iw_cm,ib_sa,mlx4_ib,ib_mad 
xfs 508625 1 
loop 48721 0 
dm_mirror 54737 0 
dm_multipath 56921 0 
scsi_dh 42177 1 dm_multipath 
raid456 152417 1 
xor 38865 1 raid456 
video 53197 0 
backlight 39873 1 video 
sbs 49921 0 
power_meter 47053 0 
hwmon 36553 1 power_meter 
i2c_ec 38593 1 sbs 
dell_wmi 37601 0 
wmi 41985 1 dell_wmi 
button 40545 0 
battery 43849 0 
asus_acpi 50917 0 
acpi_memhotplug 40517 0 
ac 38729 0 
parport_pc 62313 0 
lp 47121 0 
parport 73165 2 parport_pc,lp 
mlx4_en 107985 0 
joydev 43969 0 
i2c_i801 41813 0 
igb 122709 0 
i2c_core 56641 2 i2c_ec,i2c_i801 
8021q 57425 1 igb 
shpchp 70893 0 
mlx4_core 152773 2 mlx4_ib,mlx4_en 
serio_raw 40517 0 
dca 41221 1 igb 
sg 70377 0 
pcspkr 36289 0 
dm_raid45 99657 0 
dm_message 36289 1 dm_raid45 
dm_region_hash 46145 1 dm_raid45 
dm_log 44993 3 dm_mirror,dm_raid45,dm_region_hash 
dm_mod 101649 4 dm_mirror,dm_multipath,dm_raid45,dm_log 
dm_mem_cache 38977 1 dm_raid45 
mpt2sas 159337 12 
scsi_transport_sas 66753 1 mpt2sas 
ahci 69705 6 
libata 209489 1 ahci 
sd_mod 56513 32 
scsi_mod 196953 10 ib_iser,libiscsi2,scsi_transport_iscsi2,ib_srp,scsi_dh,sg,mpt2sas,scsi_transport_sas,libata,sd_mod 
raid1 56001 3 
ext3 168913 2 
jbd 94769 1 ext3 
uhci_hcd 57433 0 
ohci_hcd 56309 0 
ehci_hcd 66125 0 
[root at storage0 ~]# ibv_devinfo 
libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0 
No IB devices found 
[root at storage0 ~]# lspci 
00:00.0 Host bridge: Intel Corporation 5520 I/O Hub to ESI Port (rev 22) 
00:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1 (rev 22) 
00:03.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 3 (rev 22) 
00:05.0 PCI bridge: Intel Corporation 5520/X58 I/O Hub PCI Express Root Port 5 (rev 22) 
00:07.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7 (rev 22) 
00:09.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 9 (rev 22) 
00:13.0 PIC: Intel Corporation 5520/5500/X58 I/O Hub I/OxAPIC Interrupt Controller (rev 22) 
00:14.0 PIC: Intel Corporation 5520/5500/X58 I/O Hub System Management Registers (rev 22) 
00:14.1 PIC: Intel Corporation 5520/5500/X58 I/O Hub GPIO and Scratch Pad Registers (rev 22) 
00:14.2 PIC: Intel Corporation 5520/5500/X58 I/O Hub Control Status and RAS Registers (rev 22) 
00:14.3 PIC: Intel Corporation 5520/5500/X58 I/O Hub Throttle Registers (rev 22) 
00:16.0 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22) 
00:16.1 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22) 
00:16.2 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22) 
00:16.3 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22) 
00:16.4 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22) 
00:16.5 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22) 
00:16.6 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22) 
00:16.7 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22) 
00:1a.0 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #4 
00:1a.1 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #5 
00:1a.2 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #6 
00:1a.7 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #2 
00:1d.0 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #1 
00:1d.1 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #2 
00:1d.2 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #3 
00:1d.7 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #1 
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90) 
00:1f.0 ISA bridge: Intel Corporation 82801JIR (ICH10R) LPC Interface Controller 
00:1f.2 SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI Controller 
00:1f.3 SMBus: Intel Corporation 82801JI (ICH10 Family) SMBus Controller 
01:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01) 
01:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01) 
02:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 02) 
05:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0) 
06:01.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200eW WPCM450 (rev 0a) 
[root at storage0 ~]# /etc/init.d/openibd status 
Low level hardware support loaded: 
mlx4_ib 

Upper layer protocol modules: 
ib_iser ib_srp rds ib_sdp ib_ipoib 

User space access modules: 
rdma_ucm ib_ucm ib_uverbs ib_umad 

Connection management modules: 
rdma_cm ib_cm iw_cm 

Configured IPoIB interfaces: ib0 
Currently active IPoIB interfaces: ib0 
[root at storage0 ~]# 



.. Lana (lana.deere at gmail.com) 






On Tue, Oct 19, 2010 at 6:48 PM, Craig Carl <craig at gluster.com> wrote: 
> Lana - 
> The first couple of lines of the log identify our problem - 
> 
> [2010-10-19 07:47:49.315416] C [rdma.c:3817:rdma_init] rpc-transport/rdma: No IB devices found 
> [2010-10-19 07:47:49.315438] E [rdma.c:4744:init] rdma.management: Failed to initialize IB Device 
> [2010-10-19 07:47:49.315452] E [rpc-transport.c:965:rpc_transport_load] rpc-transport: 'rdma' initialization failed 
> 
> Are you sure your IB cards are working? Can you send the output of - 
> 
> # lsmod 
> # ibv_devinfo 
> # lspci 
> # /etc/init.d/openibd status 
> 
> 
> 
> Thanks, 
> 
> Craig 
> 
> -- 
> Craig Carl 
> Senior Systems Engineer; Gluster, Inc. 
> Cell - (408) 829-9953 (California, USA) 
> Office - (408) 770-1884 
> Gtalk - craig.carl at gmail.com 
> Twitter - @gluster 
> Installing Gluster Storage Platform, the movie! 
> http://rackerhacker.com/2010/08/11/one-month-with-glusterfs-in-production/ 
> 
> 
> ________________________________ 
> From: "Lana Deere" <lana.deere at gmail.com> 
> To: "Craig Carl" <craig at gluster.com> 
> Cc: gluster-users at gluster.org, landman at scalableinformatics.com 
> Sent: Tuesday, October 19, 2010 3:29:41 PM 
> Subject: Re: hanging "df" (3.1, infiniband) 
> 
> For the last little while I've been using storage0 as both client and 
> server, so those files are both client and server files at the same 
> time. If it would be helpful, I could go back to using a different 
> host as client (but then 'df' will hang instead of reporting the 
> Transport message). 
> 
> [root at storage0 ~]# cat /etc/glusterd/.cmd_log_history 
> [2010-10-19 07:54:36.244333] peer probe : on host storage1:24007 
> [2010-10-19 07:54:36.249891] peer probe : on host storage1:24007 FAILED 
> [2010-10-19 07:54:43.745558] peer probe : on host storage2:24007 
> [2010-10-19 07:54:43.750752] peer probe : on host storage2:24007 FAILED 
> [2010-10-19 07:54:48.915378] peer probe : on host storage3:24007 
> [2010-10-19 07:54:48.920595] peer probe : on host storage3:24007 FAILED 
> [2010-10-19 07:59:49.737251] Volume create : on volname: RaidData attempted 
> [2010-10-19 07:59:49.737314] Volume create : on volname: RaidData type:DEFAULT count:4 bricks: storage0:/data storage1:/data storage2:/data storage3:/data 
> [2010-10-19 07:59:49.737631] Volume create : on volname: RaidData SUCCESS 
> [2010-10-19 08:01:36.909963] volume start : on volname: RaidData SUCCESS 
> 
> The /var/log file was pretty big, so I put it on pastebin: 
> http://pastebin.com/m6WbHPUp 
> 
> 
> .. Lana (lana.deere at gmail.com) 
> 
> 
> 
> 
> 
> 
> On Tue, Oct 19, 2010 at 6:10 PM, Craig Carl <craig at gluster.com> wrote: 
>> Lana - 
>> Can you also post the contents of 
>> 
>> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log 
>> and 
>> /etc/glusterd/.cmd_log_history 
>> 
>> on both the client and server to the list? 
> 

