Hi Oleksandr,
On 01/02/16 19:28, Oleksandr Natalenko wrote:
Please take a look at updated test results.
Test: find /mnt/volume -type d
RAM usage after "find" finishes: ~ 10.8G (see "ps" output [1]).
Statedump after "find" finishes: [2].
Then I did drop_caches, and RAM usage dropped to ~4.7G [3].
Statedump after drop_caches: [4].
The statedump seems pretty clean now.
Here is the diff between the statedumps: [5].
And, finally, Valgrind output: [6].
Valgrind doesn't show a major memory leak either.
Definitely no major leaks on exit, but why does the glusterfs process use
almost 5G of RAM after drop_caches?
Could it be memory used by Valgrind itself to track glusterfs' memory
usage?
Could you repeat the test without Valgrind and see if the memory usage
after dropping caches returns to low values?
Examining the statedump shows only the following snippet
with a high "size" value:
===
[mount/fuse.fuse - usage-type gf_fuse_mt_iov_base memusage]
size=4234592647
num_allocs=1
max_size=4294935223
max_num_allocs=3
total_allocs=4186991
===
Another leak?
Grepping for "gf_fuse_mt_iov_base" in the GlusterFS source tree shows the following:
===
$ grep -Rn gf_fuse_mt_iov_base
xlators/mount/fuse/src/fuse-mem-types.h:20: gf_fuse_mt_iov_base,
xlators/mount/fuse/src/fuse-bridge.c:4887: gf_fuse_mt_iov_base);
===
fuse-bridge.c snippet:
===
/* Add extra 128 byte to the first iov so that it can
 * accommodate "ordinary" non-write requests. It's not
 * guaranteed to be big enough, as SETXATTR and namespace
 * operations with very long names may grow behind it,
 * but it's good enough in most cases (and we can handle
 * rest via realloc).
 */
iov_in[0].iov_base = GF_CALLOC (1, msg0_size,
                                gf_fuse_mt_iov_base);
===
Perhaps a free is missing for iov_base?
This is not a real memory leak. It is only incorrect memory accounting.
Note that num_allocs is 1. If you look at libglusterfs/src/mem-pool.c,
you will see this:
    /* TBD: it would be nice to adjust the memory accounting info here,
     * but calling gf_mem_set_acct_info here is wrong because it bumps
     * up counts as though this is a new allocation - which it's not.
     * The consequence of doing nothing here is only that the sizes will be
     * wrong, but at least the counts won't be.
    uint32_t type = 0;
    xlator_t *xl = NULL;
    type = header->type;
    xl = (xlator_t *) header->xlator;
    gf_mem_set_acct_info (xl, &new_ptr, size, type, NULL);
    */
This means that memory reallocs are not correctly accounted, so the
tracked size is incorrect (note that fuse_thread_proc() calls
GF_REALLOC() in some cases).
There are two problems here:
1. The memory is allocated with a given size S1, then reallocated with a
size S2 (S2 > S1), but the realloc is not accounted, so the memory
accounting system still thinks that the allocated size is S1. When the
memory is freed, S2 is subtracted from the total size used. With enough
allocs/reallocs/frees, this value becomes negative.
2. statedump shows the 64-bit 'size' field representing the total memory
used by a given type as an unsigned 32-bit value, losing some information.
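To make the two problems concrete, here is a minimal sketch in C. It is not the GlusterFS code: the struct and function names are hypothetical, and the running total is assumed to be a signed 64-bit counter. It models an alloc of S1, an unaccounted realloc to S2, and a free that subtracts S2, then shows what the total looks like when truncated to 32 bits. Under the same reading, size=4234592647 in the statedump above would be consistent with a total of 4234592647 - 2^32 = -60374649, although the upper 32 bits cannot be recovered from the dump.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-type accounting record; NOT the real GlusterFS struct. */
struct acct {
    int64_t  size;        /* bytes the tracker believes are in use */
    uint32_t num_allocs;  /* live allocations */
};

static void acct_alloc(struct acct *a, int64_t bytes)
{
    a->size += bytes;
    a->num_allocs++;
}

/* Problem 1: the realloc changes the real size but not the tracked size. */
static void acct_realloc_unaccounted(struct acct *a, int64_t new_bytes)
{
    (void)a;
    (void)new_bytes;
}

/* The free path subtracts the real (post-realloc) size. */
static void acct_free(struct acct *a, int64_t bytes)
{
    assert(a->num_allocs > 0);
    a->size -= bytes;
    a->num_allocs--;
}

/* Problem 2: the 64-bit total reported as an unsigned 32-bit value. */
static uint32_t acct_size_as_u32(const struct acct *a)
{
    return (uint32_t)a->size;
}

/* Run `cycles` alloc(S1)/realloc(S2)/free rounds, then leave one live
 * buffer allocated, similar to num_allocs=1 in the statedump. */
static struct acct simulate(int cycles, int64_t s1, int64_t s2)
{
    struct acct a = {0, 0};
    for (int i = 0; i < cycles; i++) {
        acct_alloc(&a, s1);
        acct_realloc_unaccounted(&a, s2);
        acct_free(&a, s2);
    }
    acct_alloc(&a, s1);
    return a;
}
```

Calling simulate(1000, 100, 200) leaves exactly one live allocation, yet the tracked total is negative and its 32-bit view is a value just below 4G. The allocation count stays right while the size drifts, which is the trade-off the TBD comment describes.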
Xavi
[1] https://gist.github.com/f0cf98e8bff0c13ea38f
[2] https://gist.github.com/87baa0a778ba54f0f7f7
[3] https://gist.github.com/7013b493d19c8c5fffae
[4] https://gist.github.com/cc38155b57e68d7e86d5
[5] https://gist.github.com/6a24000c77760a97976a
[6] https://gist.github.com/74bd7a9f734c2fd21c33
On Monday, 1 February 2016, 14:24:22 EET, Soumya Koduri wrote:
On 02/01/2016 01:39 PM, Oleksandr Natalenko wrote:
Wait. It seems to be my bad.
Before unmounting I do drop_caches (2), and the glusterfs process CPU usage
goes to 100% for a while. I did not wait for it to drop back to 0% and
unmounted right away. It seems glusterfs is purging inodes, and that is
why it uses 100% of CPU. I've re-tested, waiting for CPU usage to
return to normal, and got no leaks.
Will verify this once again and report more.
BTW, if that works, how could I limit inode cache for FUSE client? I do
not want it to go beyond 1G, for example, even if I have 48G of RAM on
my server.
It's hard-coded for now. For fuse the lru limit (of the inodes which are
not active) is (32*1024).
One of the ways to address this (which we were discussing earlier) is to
have an option to configure the inode cache limit. If that sounds good, we
can then decide whether it should be global or volume-level, and client,
server, or both.
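As a rough illustration of what a configurable limit would mean, here is a toy model in C. It is only a sketch: the struct, the function names, and the "0 means no limit" convention are assumptions made for this example, not the GlusterFS inode table API, and the per-inode cost passed to limit_for_budget() is something the caller would have to measure.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a bounded cache of inactive inodes (hypothetical names). */
struct inode_cache_model {
    size_t lru_size;   /* inactive inodes currently kept */
    size_t lru_limit;  /* assumed tunable; 0 is treated as "no limit" */
    size_t purged;     /* inodes evicted so far */
};

static void cache_init(struct inode_cache_model *c, size_t lru_limit)
{
    c->lru_size = 0;
    c->lru_limit = lru_limit;
    c->purged = 0;
}

/* An inode becomes inactive: keep it, then evict the oldest entries
 * until the cache is back within the limit. */
static void cache_add_inactive(struct inode_cache_model *c)
{
    c->lru_size++;
    while (c->lru_limit != 0 && c->lru_size > c->lru_limit) {
        c->lru_size--;
        c->purged++;
    }
}

/* Translate a memory budget into an inode count, given an assumed
 * average cost per cached inode. */
static size_t limit_for_budget(size_t budget_bytes, size_t bytes_per_inode)
{
    assert(bytes_per_inode > 0);
    return budget_bytes / bytes_per_inode;
}
```

With a limit of 32*1024, feeding 100000 inactive inodes through cache_add_inactive() leaves 32768 cached and 67232 purged. With a limit of 0 all 100000 stay cached.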
Thanks,
Soumya
01.02.2016 09:54, Soumya Koduri wrote:
On 01/31/2016 03:05 PM, Oleksandr Natalenko wrote:
Unfortunately, this patch doesn't help.
RAM usage on "find" finish is ~9G.
Here is statedump before drop_caches: https://gist.github.com/fc1647de0982ab447e20
[mount/fuse.fuse - usage-type gf_common_mt_inode_ctx memusage]
size=706766688
num_allocs=2454051
And after drop_caches: https://gist.github.com/5eab63bc13f78787ed19
[mount/fuse.fuse - usage-type gf_common_mt_inode_ctx memusage]
size=550996416
num_allocs=1913182
There isn't a significant drop in inode contexts. One of the
reasons could be dentries holding a refcount on the inodes,
which would result in inodes not getting purged even after
fuse_forget.
pool-name=fuse:dentry_t
hot-count=32761
If '32761' is the current active dentry count, it still doesn't seem
to match up to the inode count.
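The numbers in the two gf_common_mt_inode_ctx snippets can be checked with a few lines of C. This is only a worked calculation on the values quoted above; the struct and helper names are made up for the example. Both dumps come out at exactly 288 bytes per allocation, drop_caches released 540869 contexts (about 22%), and the remaining context count is roughly 58 times the dentry hot-count.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the statedump snippets quoted above. */
struct memusage {
    uint64_t size;
    uint64_t num_allocs;
};

static const struct memusage BEFORE_DROP = {706766688ULL, 2454051ULL};
static const struct memusage AFTER_DROP  = {550996416ULL, 1913182ULL};
static const uint64_t DENTRY_HOT_COUNT   = 32761ULL;

/* Average bytes per live allocation of this type. */
static uint64_t bytes_per_alloc(struct memusage m)
{
    assert(m.num_allocs > 0);
    return m.size / m.num_allocs;
}

/* Number of allocations released between two dumps. */
static uint64_t allocs_released(struct memusage before, struct memusage after)
{
    return before.num_allocs - after.num_allocs;
}

/* Share of allocations released, in percent. */
static double percent_released(struct memusage before, struct memusage after)
{
    return 100.0 * (double)allocs_released(before, after)
                 / (double)before.num_allocs;
}
```

The constant 288 bytes per allocation in both dumps suggests that, for this type, the size and count are tracked consistently, so the small drop reflects contexts that really are still allocated.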
Thanks,
Soumya
And here is Valgrind output:
https://gist.github.com/2490aeac448320d98596
On Saturday, 30 January 2016, 22:56:37 EET, Xavier Hernandez wrote:
There's another inode leak caused by an incorrect counting of
lookups on directory reads.
Here's a patch that solves the problem for 3.7:
http://review.gluster.org/13324
Hopefully with this patch the memory leaks should disappear.
Xavi
On 29.01.2016 19:09, Oleksandr Natalenko wrote:
Here is an intermediate summary of the current investigation into memory
leaks in the FUSE client.
I use GlusterFS v3.7.6 release with the following patches:
===
Kaleb S KEITHLEY (1):
      fuse: use-after-free fix in fuse-bridge, revisited
Pranith Kumar K (1):
      mount/fuse: Fix use-after-free crash
Soumya Koduri (3):
      gfapi: Fix inode nlookup counts
      inode: Retire the inodes from the lru list in inode_table_destroy
      upcall: free the xdr* allocations
===
With those patches we got API leaks fixed (I hope, brief tests show that) and
got rid of "kernel notifier loop terminated" message.
Nevertheless, FUSE client still leaks.
I have several test volumes with several million of small files (100K…2M in
average). I do 2 types of FUSE client testing:
1) find /mnt/volume -type d
2) rsync -av -H /mnt/source_volume/* /mnt/target_volume/
And most up-to-date results are shown below:
=== find /mnt/volume -type d ===
Memory consumption: ~4G
Statedump: https://gist.github.com/10cde83c63f1b4f1dd7a
Valgrind: https://gist.github.com/097afb01ebb2c5e9e78d
I guess, fuse-bridge/fuse-resolve. related.
=== rsync -av -H /mnt/source_volume/* /mnt/target_volume/ ===
Memory consumption: ~3.3...4G
Statedump (target volume): https://gist.github.com/31e43110eaa4da663435
Valgrind (target volume): https://gist.github.com/f8e0151a6878cacc9b1a
I guess, DHT-related.
Give me more patches to test :).
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users