
Re: Debugging leaking memory in Postgresql 13.2/Postgis 3.1


 



On 30.03.2021 20:46, Tom Lane wrote:
> Stephan Knauss <pgsql@xxxxxxxxxxxxxxxxxx> writes:
>> The wiki suggested to dump MemoryContext states for more details, but
>> something strange happens when attaching gdb. It seems that the process
>> is immediately killed and I can no longer dump such details.
> (I think the -v option is the one that matters on Linux, not -d
> as you might guess).  The idea here is that the backends would
> get an actual ENOMEM failure from malloc() before reaching the
> point where the kernel's OOM-kill behavior takes over.  Given
> that, they'd dump memory maps to stderr of their own accord,
> and you could maybe get some insight as to what's leaking.
> This'd also reduce the severity of the problem when it does
> happen.

Hello Tom, the output below looks similar to the OOM output you expected. Can you give a hint how to interpret the results?

I had a backend which had already allocated a large amount of memory, so I gave "gcore -a" a try.

Contrary to the advertised behavior, the process did not continue to run, but at least I got a core file. This is probably because gcore just calls gdb attach, which somehow triggers a SIGKILL of all backends.

At 4.2GB in size, the core file hopefully contains most of the relevant memory structures. Without a running process I still cannot call MemoryContextStats(), but I found a macro which claims to decode the memory structure post mortem:

https://www.cybertec-postgresql.com/en/checking-per-memory-context-memory-consumption/


This gave me the memory structure shown in the gdb session further down.

How should it be interpreted? The sizes appear to be in bytes, since they are computed from pointer differences. But the numbers look rather small, given that I had a backend with roughly 6GB of RSS memory.

I thought it might print the overall size of a context and then indent and print the memory of its children, but the numbers indicate this is not the case: a higher-level context can show a smaller size than its child:

  CachedPlanSource: 67840
   unnamed prepared statement: 261920

So how should I read it, and is there any indication why I have a constantly increasing memory footprint? Does anything here show where the multiple gigabytes are allocated?
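One way to check the "parent smaller than child" observation is to treat each printed number as that context's own blocks only and roll the children up separately. The following is a minimal sketch under that assumption, not part of the original macro: the function names `parse_dump` and `subtree_totals` are made up for illustration, and the sample is a handful of lines copied from the dump in this message.

```python
import re

# A few lines copied from the gdb output; indentation encodes tree depth.
SAMPLE = """\
TopMemoryContext: 109528
 CacheMemoryContext: 1302944
  CachedPlan: 138016
  CachedPlanSource: 67840
   unnamed prepared statement: 261920
  index info: 1824
 ErrorContext: 7968
"""

def parse_dump(text):
    """Parse 'indent + name: bytes' lines into (depth, name, own_bytes) tuples."""
    rows = []
    for line in text.splitlines():
        m = re.match(r"^( *)(\S.*): (\d+)$", line)
        if m:  # skip gdb banners, prompts and other non-matching lines
            rows.append((len(m.group(1)), m.group(2), int(m.group(3))))
    return rows

def subtree_totals(rows):
    """Return (depth, name, own_bytes, total_bytes); total includes descendants."""
    totals = [own for _, _, own in rows]
    # Walk backwards so every child's total is complete before it is
    # added to its parent (the nearest earlier row with a smaller depth).
    for i in range(len(rows) - 1, -1, -1):
        for j in range(i - 1, -1, -1):
            if rows[j][0] < rows[i][0]:
                totals[j] += totals[i]
                break
    return [(d, n, own, tot) for (d, n, own), tot in zip(rows, totals)]

for depth, name, own, total in subtree_totals(parse_dump(SAMPLE)):
    print(f"{' ' * depth}{name}: own={own} total={total}")
```

Run over the full dump, the total reported for TopMemoryContext is the sum of everything the macro walked, which can then be compared directly against the backend's RSS.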



root@0ec98d20bda2:/# gdb /usr/lib/postgresql/13/bin/postgres core.154218 <gdb-context
GNU gdb (Debian 8.2.1-2+b3) 8.2.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/postgresql/13/bin/postgres...Reading symbols from /usr/lib/debug/.build-id/31/ae2853776500091d313e76cf679017e697884b.debug...done.
done.

warning: core file may not match specified executable file.
[New LWP 154218]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: osm gis 172.20.0.3(51894) idle'.
#0  0x00007fc01cfa07b7 in epoll_wait (epfd=4, events=0x55f403584080, maxevents=maxevents@entry=1, timeout=timeout@entry=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
30      ../sysdeps/unix/sysv/linux/epoll_wait.c: No such file or directory.
(gdb) TopMemoryContext: 109528
 dynahash: 7968
 HandleParallelMessages: 7968
 dynahash: 7968
 dynahash: 7968
 dynahash: 7968
 dynahash: 24392
 dynahash: 24352
 RowDescriptionContext: 24352
 MessageContext: 7968
 dynahash: 7968
 dynahash: 32544
 TransactionAbortContext: 32544
 dynahash: 7968
 TopPortalContext: 7968
 dynahash: 16160
 CacheMemoryContext: 1302944
  CachedPlan: 138016
  CachedPlanSource: 67840
   unnamed prepared statement: 261920
  index info: 1824
  index info: 1824
  index info: 3872
  index info: 1824
  index info: 1824
  index info: 3872
  index info: 3872
  index info: 3872
  index info: 1824
  index info: 3872
  relation rules: 32544
  index info: 1824
  index info: 1824
  index info: 1824
  index info: 3872
  relation rules: 24352
  index info: 3872
  index info: 3872
  index info: 1824
  index info: 3872
  index info: 3872
  index info: 3872
  index info: 1824
  index info: 3872
  index info: 1824
  index info: 3872
  relation rules: 32544
  index info: 1824
  index info: 2848
  index info: 1824
  index info: 3872
  index info: 3872
  index info: 3872
  index info: 3872
  index info: 3872
  index info: 3872
  index info: 3872
  index info: 1824
  index info: 3872
  index info: 1824
  index info: 1824
  relation rules: 32544
  index info: 1824
  index info: 2848
  index info: 1824
  index info: 800
  index info: 1824
  index info: 800
  index info: 800
  index info: 2848
  index info: 1824
  index info: 800
  index info: 800
  index info: 800
  index info: 2848
  index info: 1824
  index info: 1824
  index info: 2848
  index info: 1824
  index info: 1824
  index info: 800
  index info: 1824
  index info: 800
  index info: 800
  index info: 800
  index info: 2848
  index info: 2848
  index info: 1824
  index info: 1824
  index info: 800
  index info: 800
  index info: 2848
  index info: 800
  index info: 1824
  index info: 1824
  index info: 800
  index info: 1824
  index info: 1824
  index info: 1824
  index info: 800
  index info: 1824
  index info: 1824
  index info: 1824
  index info: 800
  index info: 2848
  index info: 2848
  index info: 2848
  index info: 800
  index info: 800
  index info: 1824
  index info: 1824
  index info: 1824
  index info: 800
  index info: 1824
  index info: 1824
  index info: 2848
  index info: 1824
  index info: 1824
  index info: 1824
  index info: 1824
  index info: 800
  index info: 1824
  index info: 2848
  index info: 800
  index info: 1824
  index info: 800
  index info: 1824
  index info: 1824
  index info: 800
  index info: 1824
  index info: 1824
  index info: 1824
  index info: 800
  index info: 1824
  index info: 2848
  index info: 1824
  index info: 1824
  index info: 1824
  index info: 1824
  index info: 1824
  index info: 1824
  index info: 1824
 WAL record construction: 49544
 dynahash: 7968
 MdSmgr: 7968
 dynahash: 16160
 dynahash: 103896
 ErrorContext: 7968
(gdb) quit
root@0ec98d20bda2:/# cat gdb-context
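# Sum the sizes of all blocks owned by one AllocSet context
# and print "name: bytes".
define sum_context_blocks
  set $context = $arg0
  set $block = ((AllocSet) $context)->blocks
  set $size = 0
  while ($block)
    # A block spans from its header to endptr.
    set $size = $size + (((AllocBlock) $block)->endptr - ((char *) $block))
    set $block = ((AllocBlock) $block)->next
  end
  printf "%s: %d\n",((MemoryContext)$context)->name, $size
end

# Recursively walk the context tree.
# $arg0 is the indent depth, $arg1 the context to print.
# Variable names embed $arg0 to keep each recursion level's state separate.
define walk_contexts
  set $parent_$arg0 = ($arg1)
  set $indent_$arg0 = ($arg0)
  set $i_$arg0 = $indent_$arg0
  # Print one space per level of depth.
  while ($i_$arg0)
    printf " "
    set $i_$arg0 = $i_$arg0 - 1
  end
  sum_context_blocks $parent_$arg0
  set $child_$arg0 = ((MemoryContext) $parent_$arg0)->firstchild
  set $indent_$arg0 = $indent_$arg0 + 1
  # Visit every child, one indent level deeper.
  while ($child_$arg0)
    walk_contexts $indent_$arg0 $child_$arg0
    set $child_$arg0 = ((MemoryContext) $child_$arg0)->nextchild
  end
end

walk_contexts 0 TopMemoryContext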
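
The long run of repeated `index info` and `dynahash` lines in the dump is easier to judge when grouped by context name. The sketch below is an illustrative helper, not part of the macro above: the name `group_by_name` is hypothetical, and it assumes the output was saved as plain text with one `name: bytes` entry per line.

```python
from collections import defaultdict

# A few lines copied from the gdb output above.
SAMPLE = """\
 dynahash: 7968
 dynahash: 24392
 CacheMemoryContext: 1302944
  index info: 1824
  index info: 3872
  index info: 1824
"""

def group_by_name(dump_text):
    """Return (name, count, total_bytes) tuples, largest total first."""
    groups = defaultdict(lambda: [0, 0])  # name -> [count, bytes]
    for line in dump_text.splitlines():
        name, sep, size = line.strip().rpartition(": ")
        if sep and size.isdigit():  # ignore lines that are not 'name: bytes'
            groups[name][0] += 1
            groups[name][1] += int(size)
    return sorted(
        ((name, count, total) for name, (count, total) in groups.items()),
        key=lambda row: row[2],
        reverse=True,
    )

for name, count, total in group_by_name(SAMPLE):
    print(f"{name}: {count} contexts, {total} bytes")
```

If no single group comes anywhere near the process's RSS, that would suggest the growth lies outside the contexts this macro can see.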







