hanging "df" (3.1, infiniband)

Lana - 
The first few lines of the log identify our problem - 


[2010-10-19 07:47:49.315416] C [rdma.c:3817:rdma_init] rpc-transport/rdma: No IB devices found 
[2010-10-19 07:47:49.315438] E [rdma.c:4744:init] rdma.management: Failed to initialize IB Device 
[2010-10-19 07:47:49.315452] E [rpc-transport.c:965:rpc_transport_load] rpc-transport: 'rdma' initialization failed 

Are you sure your IB cards are working? Can you send the output of - 

# lsmod 
# ibv_devinfo 
# lspci 
# /etc/init.d/openibd status 
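
If it's easier to capture everything in one pass, a rough sketch along these lines should work on a typical OFED install; the output path and the lspci filter are just illustrative, not required:

# Dump the same diagnostics into one file (the path is only an example).
{
  echo '== lsmod ==';            lsmod
  echo '== ibv_devinfo ==';      ibv_devinfo
  echo '== lspci (IB/HCA) ==';   lspci | grep -i -e infiniband -e mellanox
  echo '== openibd status ==';   /etc/init.d/openibd status
} > /tmp/ib-diag.txt 2>&1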

Thanks, 

Craig 

-- 
Craig Carl 
Senior Systems Engineer; Gluster, Inc. 
Cell - (408) 829-9953 (California, USA) 
Office - (408) 770-1884 
Gtalk - craig.carl at gmail.com 
Twitter - @gluster 
Installing Gluster Storage Platform, the movie! 
http://rackerhacker.com/2010/08/11/one-month-with-glusterfs-in-production/ 



From: "Lana Deere" <lana.deere at gmail.com> 
To: "Craig Carl" <craig at gluster.com> 
Cc: gluster-users at gluster.org, landman at scalableinformatics.com 
Sent: Tuesday, October 19, 2010 3:29:41 PM 
Subject: Re: hanging "df" (3.1, infiniband) 

For the last little while I've been using storage0 as both client and 
server, so those files are both client and server files at the same 
time. If it would be helpful, I could go back to using a different 
host as client (but then 'df' will hang instead of reporting the 
Transport message). 

[root at storage0 ~]# cat /etc/glusterd/.cmd_log_history 
[2010-10-19 07:54:36.244333] peer probe : on host storage1:24007 
[2010-10-19 07:54:36.249891] peer probe : on host storage1:24007 FAILED 
[2010-10-19 07:54:43.745558] peer probe : on host storage2:24007 
[2010-10-19 07:54:43.750752] peer probe : on host storage2:24007 FAILED 
[2010-10-19 07:54:48.915378] peer probe : on host storage3:24007 
[2010-10-19 07:54:48.920595] peer probe : on host storage3:24007 FAILED 
[2010-10-19 07:59:49.737251] Volume create : on volname: RaidData attempted 
[2010-10-19 07:59:49.737314] Volume create : on volname: RaidData type:DEFAULT count:4 bricks: storage0:/data storage1:/data storage2:/data storage3:/data 
[2010-10-19 07:59:49.737631] Volume create : on volname: RaidData SUCCESS 
[2010-10-19 08:01:36.909963] volume start : on volname: RaidData SUCCESS 
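
For completeness, the peer and volume state after those failed probes can also be checked with the stock gluster CLI (RaidData is just the volume name from the log above), in case that output is useful too: 

# gluster peer status 
# gluster volume info RaidData 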

The /var/log/glusterfs/etc-glusterfs-glusterd.vol.log file was pretty big, so I put it on pastebin: 
http://pastebin.com/m6WbHPUp 


.. Lana (lana.deere at gmail.com) 

On Tue, Oct 19, 2010 at 6:10 PM, Craig Carl <craig at gluster.com> wrote: 
> Lana - 
> Can you also post the contents of 
> 
> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log 
> and 
> /etc/glusterd/.cmd_log_history 
> 
> on both the client and server to the list? 

