Re: glusterd 100% cpu upon volume status inode

There is nothing wrong with your setup. This is a known issue (at least to me).

The problem lies with how GlusterD collects and collates the information on the open inodes of a volume, which isn't efficient at the moment. The collection and collation process performs several small memory allocations (at least 2, but probably more) for each inode open on the bricks. This doesn't scale well when there are lots of files, and it is both CPU and memory intensive.
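
To make the pattern concrete, here is a minimal C sketch of that kind of per-inode collation. All names here are hypothetical; this illustrates the allocation behaviour described above, not the actual glusterd code:

/* Illustrative sketch only -- hypothetical names, not glusterd source.
 * Shows why collating N open inodes with several small allocations
 * per inode becomes CPU- and memory-heavy as N grows. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct inode_entry {
    char *key;      /* e.g. "brick0.inode12345" */
    char *value;    /* serialized inode state */
};

int main(void)
{
    const size_t n_inodes = 150000;         /* ~3 replicas x ~50k files */
    struct inode_entry *entries =
        calloc(n_inodes, sizeof(*entries));
    if (!entries)
        return 1;

    for (size_t i = 0; i < n_inodes; i++) {
        char buf[64];
        snprintf(buf, sizeof(buf), "brick0.inode%zu", i);
        entries[i].key = strdup(buf);       /* small allocation #1 */
        entries[i].value = strdup("state"); /* small allocation #2 */
    }
    printf("built %zu entries (~%zu small allocations)\n",
           n_inodes, 2 * n_inodes);

    for (size_t i = 0; i < n_inodes; i++) {
        free(entries[i].key);
        free(entries[i].value);
    }
    free(entries);
    return 0;
}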

In your case, with a 3-way replica volume, you'd have at least 3x as many open inodes as files: 48425 files x 3 replicas is roughly 145,000 inodes. That means roughly 300,000 small memory allocations (at least two per inode) for GlusterD to perform, which takes a lot of CPU time and memory to complete. The operation will eventually finish, provided enough memory is available. But since the gluster CLI waits only 2 minutes for a reply, you never get to see the output, as you've experienced. GlusterD, however, continues and finishes the requested operation.

Also, other CLI commands will fail until the existing operation finishes. GlusterD acquires a transaction lock when it begins an operation and releases it only once the operation is complete. Since GlusterD keeps running the operation after the CLI times out, newer commands fail because they cannot acquire the lock.
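
To illustrate why later commands fail immediately instead of queuing, here is a minimal C sketch of a single-holder transaction lock keyed by owner UUID. This is a hypothetical model consistent with the glusterd_lock errors in your log below, not the actual glusterd implementation:

/* Illustrative sketch only -- hypothetical names, not glusterd source.
 * A single-holder transaction lock keyed by owner UUID: a second
 * caller fails immediately rather than waiting for the holder. */
#include <stdio.h>

static char lock_owner[37] = "";   /* empty string == unlocked */

/* Returns 0 on success, -1 if another transaction holds the lock. */
static int gd_try_lock(const char *uuid)
{
    if (lock_owner[0] != '\0') {
        fprintf(stderr, "Unable to get lock for uuid: %s, "
                        "lock held by: %s\n", uuid, lock_owner);
        return -1;
    }
    snprintf(lock_owner, sizeof(lock_owner), "%s", uuid);
    return 0;
}

static void gd_unlock(void) { lock_owner[0] = '\0'; }

int main(void)
{
    const char *uuid = "c7d1e1ea-c5a5-4bcf-802c-aa04dd2e55ba";

    gd_try_lock(uuid);  /* "volume status ... inode" takes the lock */
    gd_try_lock(uuid);  /* a later CLI command: fails immediately   */
    gd_unlock();        /* released only when the long op finishes  */
    return 0;
}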

~kaushal

On Wed, Feb 11, 2015 at 4:40 AM, Rumen Telbizov <telbizov@xxxxxxxxx> wrote:
Hello everyone,

I am new to GlusterFS and I am in the process of evaluating it as a possible alternative to some other options. While playing with it I came across this problem. Please let me know if there's something wrong that I might be doing.

When I run volume status myvolume inode, the glusterd process hits 100% CPU utilization and no further commands work. If I restart the glusterd process, the problem is "resolved" until I run the same command again. Here's some more debug output:

# time gluster volume status myvolume inode
real    2m0.095s

...
[2015-02-10 22:49:38.662545] E [name.c:147:client_fill_address_family] 0-glusterfs: transport.address-family not specified. Could not guess default value from (remote-host:(null) or transport.unix.connect-path:(null)) options
[2015-02-10 22:49:41.663081] W [dict.c:1055:data_to_str] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(+0x4e24) [0x7fb21d6d2e24] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x4e) [0x7fb21d6d990e] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(client_fill_address_family+0x202) [0x7fb21d6d95f2]))) 0-dict: data is NULL
[2015-02-10 22:49:41.663101] W [dict.c:1055:data_to_str] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(+0x4e24) [0x7fb21d6d2e24] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x4e) [0x7fb21d6d990e] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(client_fill_address_family+0x20d) [0x7fb21d6d95fd]))) 0-dict: data is NULL
[2015-02-10 22:49:41.663107] E [name.c:147:client_fill_address_family] 0-glusterfs: transport.address-family not specified. Could not guess default value from (remote-host:(null) or transport.unix.connect-path:(null)) options
[2015-02-10 22:49:44.663576] W [dict.c:1055:data_to_str] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(+0x4e24) [0x7fb21d6d2e24] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x4e) [0x7fb21d6d990e] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(client_fill_address_family+0x202) [0x7fb21d6d95f2]))) 0-dict: data is NULL
[2015-02-10 22:49:44.663595] W [dict.c:1055:data_to_str] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(+0x4e24) [0x7fb21d6d2e24] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x4e) [0x7fb21d6d990e] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(client_fill_address_family+0x20d) [0x7fb21d6d95fd]))) 0-dict: data is NULL
[2015-02-10 22:49:44.663601] E [name.c:147:client_fill_address_family] 0-glusterfs: transport.address-family not specified. Could not guess default value from (remote-host:(null) or transport.unix.connect-path:(null)) options
[2015-02-10 22:49:47.664111] W [dict.c:1055:data_to_str] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(+0x4e24) [0x7fb21d6d2e24] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x4e) [0x7fb21d6d990e] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(client_fill_address_family+0x202) [0x7fb21d6d95f2]))) 0-dict: data is NULL
[2015-02-10 22:49:47.664131] W [dict.c:1055:data_to_str] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(+0x4e24) [0x7fb21d6d2e24] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x4e) [0x7fb21d6d990e] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.3/rpc-transport/socket.so(client_fill_address_family+0x20d) [0x7fb21d6d95fd]))) 0-dict: data is NULL
[2015-02-10 22:49:47.664137] E [name.c:147:client_fill_address_family] 0-glusterfs: transport.address-family not specified. Could not guess default value from (remote-host:(null) or transport.unix.connect-path:(null)) options
[2015-02-10 22:49:47.728428] I [input.c:36:cli_batch] 0-: Exiting with: 110



# time gluster volume status
Another transaction is in progress. Please try again after sometime.
real    0m10.223s

[2015-02-10 22:50:29.937290] E [glusterd-utils.c:153:glusterd_lock] 0-management: Unable to get lock for uuid: c7d1e1ea-c5a5-4bcf-802c-aa04dd2e55ba, lock held by: c7d1e1ea-c5a5-4bcf-802c-aa04dd2e55ba
[2015-02-10 22:50:29.937316] E [glusterd-syncop.c:1221:gd_sync_task_begin] 0-management: Unable to acquire lock


The volume contains an extracted Linux kernel source tree - so lots of small files (48425). Here's the configuration:

# gluster volume status
Status of volume: myvolume
Gluster process                     Port    Online  Pid
------------------------------------------------------------------------------
Brick 10.12.10.7:/var/lib/glusterfs_disks/disk01/brick  49152   Y   3321
Brick 10.12.10.8:/var/lib/glusterfs_disks/disk01/brick  49152   Y   3380
Brick 10.12.10.9:/var/lib/glusterfs_disks/disk01/brick  49152   Y   3359
Brick 10.12.10.7:/var/lib/glusterfs_disks/disk02/brick  49154   Y   18687
Brick 10.12.10.8:/var/lib/glusterfs_disks/disk02/brick  49156   Y   32699
Brick 10.12.10.9:/var/lib/glusterfs_disks/disk02/brick  49154   Y   17932
Self-heal Daemon on localhost               N/A Y   25005
Self-heal Daemon on 10.12.10.9              N/A Y   17952
Self-heal Daemon on 10.12.10.8              N/A Y   32724

Task Status of Volume myvolume
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : eec4f2c1-85f5-400d-ac42-6da63ec7434f
Status               : completed



# gluster volume info

Volume Name: myvolume
Type: Distributed-Replicate
Volume ID: e513a56f-049f-4c8e-bc75-4fb789e06c37
Status: Started
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: 10.12.10.7:/var/lib/glusterfs_disks/disk01/brick
Brick2: 10.12.10.8:/var/lib/glusterfs_disks/disk01/brick
Brick3: 10.12.10.9:/var/lib/glusterfs_disks/disk01/brick
Brick4: 10.12.10.7:/var/lib/glusterfs_disks/disk02/brick
Brick5: 10.12.10.8:/var/lib/glusterfs_disks/disk02/brick
Brick6: 10.12.10.9:/var/lib/glusterfs_disks/disk02/brick
Options Reconfigured:
nfs.disable: on
network.ping-timeout: 10


I run:
# glusterd -V
glusterfs 3.5.3 built on Nov 17 2014 15:48:52
Repository revision: git://git.gluster.com/glusterfs.git


Thank you for your time.

Regards,
--

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
