There is nothing wrong with your setup. This is a known issue (at least to me).
The problem lies in how GlusterD collects and collates the information on the open inodes of a volume, which isn't very efficient right now. The collection and collation process involves several small memory allocations (at least 2, but quite possibly more) for each inode open on the bricks. This doesn't scale well when there are lots of files, and it is both CPU and memory intensive.
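To make the cost concrete, here is a minimal C sketch of such a collation loop. The names (inode_entry_t, collate_inode) are made up for illustration and are not the real GlusterD symbols; the point is just the two-small-allocations-per-inode pattern:

#include <stdlib.h>
#include <string.h>

typedef struct inode_entry {
        char               *gfid;   /* identifier copied per inode */
        struct inode_entry *next;
} inode_entry_t;

/* Called once for every open inode reported by every brick. */
static inode_entry_t *
collate_inode (inode_entry_t *head, const char *gfid)
{
        inode_entry_t *e = malloc (sizeof (*e));  /* small allocation #1 */
        if (!e)
                return head;
        e->gfid = strdup (gfid);                  /* small allocation #2 */
        e->next = head;
        return e;
}

Each call is cheap on its own, but a few hundred thousand of them, plus the bookkeeping around them, add up to the load you're seeing.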
In your case, with a 3-way replica volume, you'd have at least 3x as many open inodes as files, i.e. roughly 150,000 (3 x 48,425). That means GlusterD has to perform at least ~300k small memory allocations, which takes a lot of wall-clock time, CPU time, and memory. The process will eventually complete, provided enough memory is available. But since the gluster CLI waits only 2 minutes for a reply, you never get to see the output, as you've experienced. GlusterD, however, continues and finishes the requested operation.
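The "Exiting with: 110" in your log fits this picture: 110 is ETIMEDOUT on Linux. Very roughly, and with hypothetical names rather than the actual cli code, the client side behaves like this:

#include <errno.h>
#include <poll.h>
#include <stdio.h>
#include <stdlib.h>

#define CLI_REPLY_TIMEOUT_MS (2 * 60 * 1000)  /* the 2-minute CLI wait */

static void
wait_for_glusterd_reply (int sockfd)
{
        struct pollfd pfd = { .fd = sockfd, .events = POLLIN };

        if (poll (&pfd, 1, CLI_REPLY_TIMEOUT_MS) == 0) {
                /* No reply within 2 minutes: the CLI gives up and
                 * exits, but glusterd carries on regardless. */
                fprintf (stderr, "Exiting with: %d\n", ETIMEDOUT);
                exit (ETIMEDOUT);  /* 110 on Linux */
        }
        /* ... otherwise read and print the reply ... */
}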
Also, other CLI commands will fail until the existing operation finishes. GlusterD acquires a transaction lock when it begins an operation and releases it only once the operation is complete. Since GlusterD continues with the operation even after the CLI times out, newer commands fail because they cannot obtain the lock.
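That is what the glusterd_lock/gd_sync_task_begin errors in your log are about. In sketch form, with made-up names rather than the real internals, the pattern is a simple trylock:

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t txn_lock = PTHREAD_MUTEX_INITIALIZER;

/* Taken at the start of every volume operation. */
static int
begin_txn (void)
{
        if (pthread_mutex_trylock (&txn_lock) != 0) {
                /* Surfaces as "Unable to acquire lock" and
                 * "Another transaction is in progress". */
                fprintf (stderr, "Another transaction is in progress.\n");
                return -1;
        }
        return 0;
}

/* Released only when the operation actually finishes, which may be
 * long after the CLI that started it has timed out. */
static void
end_txn (void)
{
        pthread_mutex_unlock (&txn_lock);
}

So once your inode status call is running, every other command will bounce off the lock until the collation finishes.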
~kaushal
On Wed, Feb 11, 2015 at 4:40 AM, Rumen Telbizov <telbizov@xxxxxxxxx> wrote:
Hello everyone,
I am new to GlusterFS and I am in the process of evaluating it as a possible alternative to some other options. While playing with it I came across this problem. Please point me in the right direction if there's something I might be doing wrong.

When I run "volume status myvolume inode", the glusterd process hits 100% CPU utilization and no further commands work. If I restart the glusterd process the problem is "resolved" until I run the same command again. Here's some more debug:

# time gluster volume status
# time gluster volume status myvolume inode
real 2m0.095s
...
[2015-02-10 22:49:38.662545] E [name.c:147:client_fill_address_family] 0-glusterfs: transport.address-family not specified. Could not guess default value from (remote-host:(null) or transport.unix.connect-path:(null)) options
[2015-02-10 22:49:41.663081] W [dict.c:1055:data_to_str] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(+0x4e24) [0x7fb21d6d2e24] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x4e) [0x7fb21d6d990e] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(client_fill_address_family+0x202) [0x7fb21d6d95f2]))) 0-dict: data is NULL
[2015-02-10 22:49:41.663101] W [dict.c:1055:data_to_str] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(+0x4e24) [0x7fb21d6d2e24] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x4e) [0x7fb21d6d990e] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(client_fill_address_family+0x20d) [0x7fb21d6d95fd]))) 0-dict: data is NULL
[2015-02-10 22:49:41.663107] E [name.c:147:client_fill_address_family] 0-glusterfs: transport.address-family not specified. Could not guess default value from (remote-host:(null) or transport.unix.connect-path:(null)) options
[2015-02-10 22:49:44.663576] W [dict.c:1055:data_to_str] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(+0x4e24) [0x7fb21d6d2e24] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x4e) [0x7fb21d6d990e] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(client_fill_address_family+0x202) [0x7fb21d6d95f2]))) 0-dict: data is NULL
[2015-02-10 22:49:44.663595] W [dict.c:1055:data_to_str] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(+0x4e24) [0x7fb21d6d2e24] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x4e) [0x7fb21d6d990e] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(client_fill_address_family+0x20d) [0x7fb21d6d95fd]))) 0-dict: data is NULL
[2015-02-10 22:49:44.663601] E [name.c:147:client_fill_address_family] 0-glusterfs: transport.address-family not specified. Could not guess default value from (remote-host:(null) or transport.unix.connect-path:(null)) options
[2015-02-10 22:49:47.664111] W [dict.c:1055:data_to_str] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(+0x4e24) [0x7fb21d6d2e24] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x4e) [0x7fb21d6d990e] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(client_fill_address_family+0x202) [0x7fb21d6d95f2]))) 0-dict: data is NULL
[2015-02-10 22:49:47.664131] W [dict.c:1055:data_to_str] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(+0x4e24) [0x7fb21d6d2e24] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x4e) [0x7fb21d6d990e] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(client_fill_address_family+0x20d) [0x7fb21d6d95fd]))) 0-dict: data is NULL
[2015-02-10 22:49:47.664137] E [name.c:147:client_fill_address_family] 0-glusterfs: transport.address-family not specified. Could not guess default value from (remote-host:(null) or transport.unix.connect-path:(null)) options
[2015-02-10 22:49:47.728428] I [input.c:36:cli_batch] 0-: Exiting with: 110
Another transaction is in progress. Please try again after sometime.
real 0m10.223s
[2015-02-10 22:50:29.937290] E [glusterd-utils.c:153:glusterd_lock] 0-management: Unable to get lock for uuid: c7d1e1ea-c5a5-4bcf-802c-aa04dd2e55ba, lock held by: c7d1e1ea-c5a5-4bcf-802c-aa04dd2e55ba
[2015-02-10 22:50:29.937316] E [glusterd-syncop.c:1221:gd_sync_task_begin] 0-management: Unable to acquire lock

The volume contains the extracted Linux kernel source, so lots of small files (48425). Here's the configuration:
# gluster volume status
Status of volume: myvolume
Gluster process Port Online Pid
------------------------------------------------------------------------------
Brick 10.12.10.7:/var/lib/glusterfs_disks/disk01/brick 49152 Y 3321
Brick 10.12.10.8:/var/lib/glusterfs_disks/disk01/brick 49152 Y 3380
Brick 10.12.10.9:/var/lib/glusterfs_disks/disk01/brick 49152 Y 3359
Brick 10.12.10.7:/var/lib/glusterfs_disks/disk02/brick 49154 Y 18687
Brick 10.12.10.8:/var/lib/glusterfs_disks/disk02/brick 49156 Y 32699
Brick 10.12.10.9:/var/lib/glusterfs_disks/disk02/brick 49154 Y 17932
Self-heal Daemon on localhost N/A Y 25005
Self-heal Daemon on 10.12.10.9 N/A Y 17952
Self-heal Daemon on 10.12.10.8 N/A Y 32724
Task Status of Volume myvolume
------------------------------------------------------------------------------
Task : Rebalance
ID : eec4f2c1-85f5-400d-ac42-6da63ec7434f
Status : completed
# gluster volume info
Volume Name: myvolume
Type: Distributed-Replicate
Volume ID: e513a56f-049f-4c8e-bc75-4fb789e06c37
Status: Started
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: 10.12.10.7:/var/lib/glusterfs_disks/disk01/brick
Brick2: 10.12.10.8:/var/lib/glusterfs_disks/disk01/brick
Brick3: 10.12.10.9:/var/lib/glusterfs_disks/disk01/brick
Brick4: 10.12.10.7:/var/lib/glusterfs_disks/disk02/brick
Brick5: 10.12.10.8:/var/lib/glusterfs_disks/disk02/brick
Brick6: 10.12.10.9:/var/lib/glusterfs_disks/disk02/brick
Options Reconfigured:
nfs.disable: on
network.ping-timeout: 10

I run:
# glusterd -V
glusterfs 3.5.3 built on Nov 17 2014 15:48:52
Repository revision: git://git.gluster.com/glusterfs.git

Thank you for your time.

Regards,
--
Rumen Telbizov
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users