Gluster Volume hangs (version 3.2.5)

mohitanchlia at gmail.com (Mohit Anchlia) · Thu, 15 Mar 2012 07:54:18 -0700

Can you break your CSV into small chunks and try again? It appears that the
network is somehow getting overwhelmed. Have you checked the switches for
any errors?
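
One way to do the chunking is sketched below in Python. It is only a sketch
under some assumptions: the CSV has a single header row, the chunk size of
100,000 rows is an arbitrary starting point, and the function name and the
chunk_NNNNN.csv file names are made up for illustration.

```python
import csv
import itertools
from pathlib import Path


def split_csv(src, out_dir, rows_per_chunk=100_000):
    """Split src into numbered CSV files of at most rows_per_chunk data rows.

    The header row is repeated at the top of every chunk so that each file
    can be imported on its own. Returns the list of chunk paths written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunks = []
    with open(src, newline="") as fh:
        reader = csv.reader(fh)
        header = next(reader, None)
        if header is None:
            # Empty input file: nothing to split.
            return chunks
        for index in itertools.count(1):
            # Pull the next batch of rows without loading the whole file.
            rows = list(itertools.islice(reader, rows_per_chunk))
            if not rows:
                break
            path = out_dir / f"chunk_{index:05d}.csv"
            with open(path, "w", newline="") as out:
                writer = csv.writer(out)
                writer.writerow(header)
                writer.writerows(rows)
            chunks.append(path)
    return chunks
```

Each chunk can then be imported one at a time (for example with
mongoimport --type csv --headerline --file <chunk>), which should show
whether the hang depends on the amount of data pushed in a single run.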

On Wed, Mar 14, 2012 at 11:39 PM, Alessio Checcucci <
alessio.checcucci at gmail.com> wrote:

>  Dear Mohit,
> thanks for your answer. The setup is fairly new: we configured it about one
> month ago, and the iozone tests we performed never revealed any problem.
> The servers are SGI machines based on Supermicro hardware, each one
> features:
> 2 Xeon X5650 6-core CPUs
> 96GB of RAM
> two Intel Gigabit interfaces
> 1 Mellanox ConnectX-2 IB HCA
> 1 LSI 1068E SATA RAID controller
> 6 Seagate ST32000644NS 2TB HDDs
>
> The Gluster nodes work quite smoothly; they act both as bricks and as
> clients, mounting the Gluster filesystem through the FUSE driver.
> Unfortunately, a few minutes after we start the mongo import (from a huge
> CSV file), all the mounts freeze completely and the FUSE error (with the
> related timeout) that I reported in my first message is logged.
> In the volume log on the Gluster bricks we can see the following
> messages:
>
>  [2012-03-15 04:45:07.352455] E
> [rdma.c:3415:rdma_handle_failed_send_completion] 0-rpc-transport/rdma: send
> work request on `mlx4_0' returned error wc.status = 12, wc.vendor_err =
> 129, post->buf = 0x2b08000, wc.byte_len = 0, post->reused = 2
> [2012-03-15 04:45:07.352510] E
> [rdma.c:3423:rdma_handle_failed_send_completion] 0-rdma: connection between
> client and server not working. check by running 'ibv_srq_pingpong'. also
> make sure subnet manager is running (eg: 'opensm'), or check if rdma port
> is valid (or active) by running 'ibv_devinfo'. contact Gluster Support Team
> if the problem persists.
> [2012-03-15 04:45:07.352535] E
> [rdma.c:3415:rdma_handle_failed_send_completion] 0-rpc-transport/rdma: send
> work request on `mlx4_0' returned error wc.status = 12, wc.vendor_err =
> 129, post->buf = 0x2b0a000, wc.byte_len = 0, post->reused = 5
> [2012-03-15 04:45:07.352545] E
> [rdma.c:3423:rdma_handle_failed_send_completion] 0-rdma: connection between
> client and server not working. check by running 'ibv_srq_pingpong'. also
> make sure subnet manager is running (eg: 'opensm'), or check if rdma port
> is valid (or active) by running 'ibv_devinfo'. contact Gluster Support Team
> if the problem persists.
> [2012-03-15 04:45:07.352900] E [rpc-clnt.c:341:saved_frames_unwind]
> (-->/opt/glusterfs/3.2.5/lib64/libgfrpc.so.0(rpc_clnt_notify+0x78)
> [0x7fda3f424568]
> (-->/opt/glusterfs/3.2.5/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7d)
> [0x7fda3f423cfd]
> (-->/opt/glusterfs/3.2.5/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)
> [0x7fda3f423c5e]))) 0-HPC_data-client-4: forced unwinding frame
> type(GlusterFS 3.1) op(LOOKUP(27)) called at 2012-03-15 04:45:03.336837
> [2012-03-15 04:45:07.352942] E
> [client3_1-fops.c:2228:client3_1_lookup_cbk] 0-glusterfs: remote operation
> failed: Transport endpoint is not connected
> [2012-03-15 04:45:07.352956] I [client.c:1883:client_rpc_notify]
> 0-HPC_data-client-4: disconnected
> [2012-03-15 04:45:07.353301] E [rpc-clnt.c:341:saved_frames_unwind]
> (-->/opt/glusterfs/3.2.5/lib64/libgfrpc.so.0(rpc_clnt_notify+0x78)
> [0x7fda3f424568]
> (-->/opt/glusterfs/3.2.5/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7d)
> [0x7fda3f423cfd]
> (-->/opt/glusterfs/3.2.5/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)
> [0x7fda3f423c5e]))) 0-HPC_data-client-5: forced unwinding frame
> type(GlusterFS 3.1) op(LOOKUP(27)) called at 2012-03-15 04:45:03.336880
> [2012-03-15 04:45:07.353317] E
> [client3_1-fops.c:2228:client3_1_lookup_cbk] 0-glusterfs: remote operation
> failed: Transport endpoint is not connected
> [2012-03-15 04:45:07.353326] I [dht-layout.c:581:dht_layout_normalize]
> 0-HPC_data-dht: found anomalies in /. holes=1 overlaps=0
> [2012-03-15 04:45:07.353335] I [dht-selfheal.c:569:dht_selfheal_directory]
> 0-HPC_data-dht: 2 subvolumes down -- not fixing
> [2012-03-15 04:45:07.353365] I [client.c:1883:client_rpc_notify]
> 0-HPC_data-client-5: disconnected
> [2012-03-15 04:45:17.703676] I
> [client-handshake.c:1090:select_server_supported_programs]
> 0-HPC_data-client-4: Using Program GlusterFS 3.2.5, Num (1298437), Version
> (310)
> [2012-03-15 04:45:17.703857] I
> [client-handshake.c:913:client_setvolume_cbk] 0-HPC_data-client-4:
> Connected to 192.168.100.165:24009, attached to remote volume '/data'.
> [2012-03-15 04:45:17.706408] I
> [client-handshake.c:1090:select_server_supported_programs]
> 0-HPC_data-client-5: Using Program GlusterFS 3.2.5, Num (1298437), Version
> (310)
> [2012-03-15 04:45:17.706566] I
> [client-handshake.c:913:client_setvolume_cbk] 0-HPC_data-client-5:
> Connected to 192.168.100.166:24009, attached to remote volume '/data'.
> [2012-03-15 06:28:09.624927] I [dht-layout.c:581:dht_layout_normalize]
> 0-HPC_data-dht: found anomalies in /database/mongo/hipass_fixed/journal.
> holes=1 overlaps=0
> [2012-03-15 06:28:09.704031] I [dht-layout.c:581:dht_layout_normalize]
> 0-HPC_data-dht: found anomalies in /database/mongo/hipass_fixed/_tmp.
> holes=1 overlaps=0
>
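
A note on the "wc.status = 12" entries above: to my knowledge, in the
libibverbs enum ibv_wc_status the value 12 is IBV_WC_RETRY_EXC_ERR, meaning
the transport retry counter was exceeded, i.e. the sender never received an
acknowledgement from the remote side. The Python sketch below counts the
status codes found in a log excerpt and maps them to their names. The
function name is made up for illustration, and the table is a partial copy
of the enum (values 0 to 13) written from memory, so please check it against
<infiniband/verbs.h> on your system.

```python
import re
from collections import Counter

# Partial copy of `enum ibv_wc_status` from libibverbs (assumed values;
# verify against <infiniband/verbs.h> on the machine in question).
IBV_WC_STATUS = {
    0: "IBV_WC_SUCCESS",
    1: "IBV_WC_LOC_LEN_ERR",
    2: "IBV_WC_LOC_QP_OP_ERR",
    3: "IBV_WC_LOC_EEC_OP_ERR",
    4: "IBV_WC_LOC_PROT_ERR",
    5: "IBV_WC_WR_FLUSH_ERR",
    6: "IBV_WC_MW_BIND_ERR",
    7: "IBV_WC_BAD_RESP_ERR",
    8: "IBV_WC_LOC_ACCESS_ERR",
    9: "IBV_WC_REM_INV_REQ_ERR",
    10: "IBV_WC_REM_ACCESS_ERR",
    11: "IBV_WC_REM_OP_ERR",
    12: "IBV_WC_RETRY_EXC_ERR",
    13: "IBV_WC_RNR_RETRY_EXC_ERR",
}

WC_STATUS_RE = re.compile(r"wc\.status\s*=\s*(\d+)")


def summarize_wc_status(log_text):
    """Return {status name: occurrence count} for every wc.status in log_text."""
    counts = Counter(int(code) for code in WC_STATUS_RE.findall(log_text))
    return {
        IBV_WC_STATUS.get(code, f"unknown ({code})"): n
        for code, n in counts.items()
    }
```

Run on the excerpt above, this would report only IBV_WC_RETRY_EXC_ERR, which
is at least consistent with a link or fabric problem under load rather than
a fault in the volume layout itself.
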
> We checked the InfiniBand infrastructure and it is still working, hence we
> suppose that the problem lies somewhere else.
>
> Thanks a lot for your help,
> Alessio
>
>
> On 15/03/2012, at 12:32, Mohit Anchlia wrote:
>
> Is this a new setup, or did it work before? How are the CPU, memory, etc.?
> Also, what do you see on the gluster nodes?
>
> On Wed, Mar 14, 2012 at 7:33 PM, Alessio Checcucci <
> alessio.checcucci at gmail.com> wrote:
>
>> Dear All,
>> we are facing a problem in our computer room. We have 6 servers that act
>> as bricks for GlusterFS; the servers are configured in the following way:
>>
>> OS: Centos 6.2 x86_64
>> Kernel: 2.6.32-220.4.2.el6.x86_64
>>
>> Gluster RPM packages:
>> glusterfs-core-3.2.5-2.el6.x86_64
>> glusterfs-rdma-3.2.5-2.el6.x86_64
>> glusterfs-geo-replication-3.2.5-2.el6.x86_64
>> glusterfs-fuse-3.2.5-2.el6.x86_64
>>
>> Each one contributes an XFS filesystem to the global volume; the
>> transport mechanism is RDMA:
>>
>> gluster volume create HPC_data transport rdma pleiades01:/data
>> pleiades02:/data pleiades03:/data pleiades04:/data pleiades05:/data
>> pleiades06:/data
>>
>> Each server mounts, using the fuse driver, the volume on a dedicated
>> mount point according to the following fstab:
>>
>> pleiades01:/HPC_data        /HPCdata                glusterfs
>> defaults,_netdev 0 0
>>
>> We are running mongodb on top of the Gluster volume for performance
>> testing and speed is definitely high. Unfortunately, when we run a large
>> mongoimport job, the GlusterFS volume hangs completely shortly after the
>> job starts and becomes inaccessible from any node. The following error is
>> logged after some time in /var/log/messages:
>>
>> Mar  8 08:16:03 pleiades03 kernel: INFO: task mongod:5508 blocked for
>> more than 120 seconds.
>> Mar  8 08:16:03 pleiades03 kernel: "echo 0 >
>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> Mar  8 08:16:03 pleiades03 kernel: mongod        D 0000000000000007     0
>>  5508      1 0x00000000
>> Mar  8 08:16:03 pleiades03 kernel: ffff881709b95de8 0000000000000086
>> 0000000000000000 0000000000000008
>> Mar  8 08:16:03 pleiades03 kernel: ffff881709b95d68 ffffffff81090a7f
>> ffff8816b6974cc0 0000000000000000
>> Mar  8 08:16:03 pleiades03 kernel: ffff8817fdd81af8 ffff881709b95fd8
>> 000000000000f4e8 ffff8817fdd81af8
>> Mar  8 08:16:03 pleiades03 kernel: Call Trace:
>> Mar  8 08:16:03 pleiades03 kernel: [<ffffffff81090a7f>] ?
>> wake_up_bit+0x2f/0x40
>> Mar  8 08:16:03 pleiades03 kernel: [<ffffffff81090d7e>] ?
>> prepare_to_wait+0x4e/0x80
>> Mar  8 08:16:03 pleiades03 kernel: [<ffffffffa112c6b5>]
>> fuse_set_nowrite+0xa5/0xe0 [fuse]
>> Mar  8 08:16:03 pleiades03 kernel: [<ffffffff81090a90>] ?
>> autoremove_wake_function+0x0/0x40
>> Mar  8 08:16:03 pleiades03 kernel: [<ffffffffa112fd48>]
>> fuse_fsync_common+0xa8/0x180 [fuse]
>> Mar  8 08:16:03 pleiades03 kernel: [<ffffffffa112fe30>]
>> fuse_fsync+0x10/0x20 [fuse]
>> Mar  8 08:16:03 pleiades03 kernel: [<ffffffff811a52d1>]
>> vfs_fsync_range+0xa1/0xe0
>> Mar  8 08:16:03 pleiades03 kernel: [<ffffffff811a537d>]
>> vfs_fsync+0x1d/0x20
>> Mar  8 08:16:03 pleiades03 kernel: [<ffffffff81144421>]
>> sys_msync+0x151/0x1e0
>> Mar  8 08:16:03 pleiades03 kernel: [<ffffffff8100b0f2>]
>> system_call_fastpath+0x16/0x1b
>>
>> Any attempt to access the volume from any node is fruitless until the
>> mongodb process is killed; sessions accessing the /HPCdata path freeze
>> on every node.
>> Even then, a complete stop (force) and start of the volume is needed to
>> bring it back into operation.
>> The situation can be reproduced at will.
>> Can anybody help us? Could we collect more information to help diagnose
>> the problem?
>>
>> Thanks a lot
>> Alessio