Need some advice regarding a glusterd memory leak of up to 120GB


 



Hi all,

I am running gluster-3.4.5 on 2 servers. Each of them has seven 2TB HDDs, which together form a 7 x 2 distributed-replicated volume.
I just noticed that glusterd was consuming about 120GB of memory and it dumped core today. Unfortunately, glusterd was not running with --mem-accounting, so all I have to debug with is the coredump. So I read the mem-pool code and tried to work out which mem_pool is eating the memory. Here is the result:
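For reference, the way I read mem-pool.c, the live footprint of one pool should be roughly (this is my interpretation of the counters, so take it with a grain of salt):

  in-use objects ≈ hot_count + curr_stdalloc
  bytes in use   ≈ (hot_count + curr_stdalloc) * padded_sizeof_type

where padded_sizeof_type is real_sizeof_type plus a small per-object header (28 bytes here, judging by 6116 vs 6088 in the dump below). The script below just walks glusterfsd_ctx->mempool_list and sums that up for every pool with a non-zero count.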

I wrote a gdb script to print out the glusterfsd_ctx->mempool_list:

# gdb script to print every mem_pool with non-zero usage
set $head = &glusterfsd_ctx->mempool_list
set $offset = (unsigned long)(&((struct mem_pool *)0)->global_list)
set $pos = (struct mem_pool *)((unsigned long)($head->next) - $offset)
set $memsum = 0
while (&$pos->global_list != $head)
  if ($pos->hot_count + $pos->curr_stdalloc)
    p *$pos
    # memory consumed by this single mem_pool
    set $thismempoolsize = ($pos->hot_count + $pos->curr_stdalloc) * $pos->padded_sizeof_type
    p $pos->name
    p $thismempoolsize
    set $memsum += $thismempoolsize
  end
  set $pos = (struct mem_pool *)((unsigned long)($pos->global_list.next) - $offset)
end
echo Total mem used\n
p $memsum

Then I got this output:

(gdb) source gdb_show_mempool_list.gdb 
$459 = {list = {next = 0x1625a50, prev = 0x1625a50}, hot_count = 64, cold_count = 0, lock = 1, padded_sizeof_type = 6116, pool = 0x7ff2c9f94010, pool_end = 0x7ff2c9ff3910, real_sizeof_type = 6088, 
  alloc_count = 16919588, pool_misses = 16919096, max_alloc = 64, curr_stdalloc = 16824653, max_stdalloc = 16824655, name = 0x1625ad0 "management:rpcsvc_request_t", global_list = {next = 0x16211f8, 
    prev = 0x1639368}}
$460 = 0x1625ad0 "management:rpcsvc_request_t"
$461 = 102899969172
$462 = {list = {next = 0x7ff2cc0bf374, prev = 0x7ff2cc0bc2b4}, hot_count = 16352, cold_count = 32, lock = 1, padded_sizeof_type = 52, pool = 0x7ff2cc0bc010, pool_end = 0x7ff2cc18c010, 
  real_sizeof_type = 24, alloc_count = 169845909, pool_misses = 168448980, max_alloc = 16384, curr_stdalloc = 168231365, max_stdalloc = 168231560, name = 0x1621210 "glusterfs:data_t", global_list = {
    next = 0x1621158, prev = 0x1625ab8}}
$463 = 0x1621210 "glusterfs:data_t"
$464 = 8748881284
$465 = {list = {next = 0x7ff2cc18e770, prev = 0x7ff2cc18d2fc}, hot_count = 16350, cold_count = 34, lock = 1, padded_sizeof_type = 68, pool = 0x7ff2cc18d010, pool_end = 0x7ff2cc29d010, 
  real_sizeof_type = 40, alloc_count = 152853817, pool_misses = 151477891, max_alloc = 16384, curr_stdalloc = 151406417, max_stdalloc = 151406601, name = 0x1621170 "glusterfs:data_pair_t", 
  global_list = {next = 0x16210b8, prev = 0x16211f8}}
$466 = 0x1621170 "glusterfs:data_pair_t"
$467 = 10296748156
$468 = {list = {next = 0x1621050, prev = 0x1621050}, hot_count = 4096, cold_count = 0, lock = 1, padded_sizeof_type = 140, pool = 0x7ff2cc29e010, pool_end = 0x7ff2cc32a010, real_sizeof_type = 112, 
  alloc_count = 16995288, pool_misses = 16986651, max_alloc = 4096, curr_stdalloc = 16820855, max_stdalloc = 16820882, name = 0x16210d0 "glusterfs:dict_t", global_list = {next = 0x1621018, 
    prev = 0x1621158}}
$469 = 0x16210d0 "glusterfs:dict_t"
$470 = 2355493140
"Total mem used
"$471 = 124301091752

--------------------------------------------------------------------------------------
"management:rpcsvc_request_t" used 100G
"glusterfs:data_t" used 8.7GB
"glusterfs:data_pair_t" used 10GB
"glusterfs:dict_t" use 2.3G
Total: 124GB memory
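
As a sanity check on those numbers (just my own arithmetic on the counters from the dump, using the formula above):

  management:rpcsvc_request_t:  (64 + 16824653) * 6116   = 102,899,969,172 bytes  (~95.8 GiB)
  glusterfs:data_t:             (16352 + 168231365) * 52 = 8,748,881,284 bytes    (~8.1 GiB)
  total reported by the script:                            124,301,091,752 bytes  (~115.8 GiB)

So virtually all of the memory is objects taken from these pools and never returned, and since pool_misses and curr_stdalloc dwarf max_alloc, nearly all of them came from plain malloc after the pre-allocated pools ran dry.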

---------------------------------------------------------------------------------------
I assume this happened because a large number of RPC requests hit glusterd and were never freed.
This started several days ago, and I am still trying to figure out what happened on these servers around that time.
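
Since all I have is the core, my next step is to peek at one of the in-use rpcsvc_request_t slots and see which RPC program/procedure the stuck requests belong to. Roughly like this (an untested sketch: 0x1625a50 is the rpcsvc_request_t pool address from the dump above, the 28-byte offset is the per-object header implied by padded 6116 vs real 6088, and the prognum/progver/procnum field names are from my reading of rpcsvc.h, so please double-check before trusting it):

# look at the first slot of the pre-allocated region; all 64 pool slots are hot,
# so slot 0 should hold a live (never-freed) request
set $pool = (struct mem_pool *) 0x1625a50
set $req = (rpcsvc_request_t *) ((char *) $pool->pool + 28)
p $req->prognum
p $req->progver
p $req->procnum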
Has anyone here run into this issue before? Any advice would be greatly appreciated!

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-devel
