On Sun, 13 Dec 2015 20:09:04 +0100
Gerhard Wiesinger <lists@xxxxxxxxxxxxx> wrote:

> On 13.12.2015 18:17, Tom Lane wrote:
> > Gerhard Wiesinger <lists@xxxxxxxxxxxxx> writes:
> >>> Mem: 7814M Active, 20G Inact, 2982M Wired, 232M Cache, 1661M Buf, 30M Free
> >>> Swap: 512M Total, 506M Used, 6620K Free, 98% Inuse
> >> OK, but why do we then get: kernel: swap_pager_getswapspace(4): failed?
> > Just judging from the name of the function, I would bet this is a direct
> > result of having only 512M of swap configured. As Bill already pointed
> > out, that's a pretty useless choice on a system with 32G of RAM. As soon
> > as the kernel tries to push out any significant amount of idle processes,
> > it's gonna be out of swap space. The numbers you show above prove that
> > it is almost out of free swap already.
>
> The system wasn't designed by me, I wouldn't do it either that way. Does
> swapoff help?

FreeBSD and Linux (and most modern OSes) are designed to have swap, and
usually more swap than RAM. I have never heard a good reason for not using
swap, and the reasons I _have_ heard have always come from people who were
misinformed about how the OS works. If someone has a _good_ explanation for
why you wouldn't want any swap on a DB server, I'd love to hear it; but
everything I've heard up till now has been speculation based on
misinformation.

IOW: no, you should not turn swap off, you should instead allocate an
appropriate amount of swap space (there's a rough sketch of how to add more
swap further down).

> > Also, while that 20G of "inactive" pages may be candidates for reuse,
> > they probably can't actually be reused without swapping them out ...
> > and there's noplace for that data to go.
>
> There is no log entry in syslog (where postgres logs) when
> swap_pager_getswapspace is logged.
>
> But why do we have 20G of Inactive pages? They are still allocated by
> kernel or user space. As you can see below (top output) NON Postgres
> processes are around 9G in virtual size, resident even lower. The system
> is nearly idle, and the queries typically aren't active after one second
> again. Therefore where does the rest of the 11G of Inactive pages come
> from (if it isn't a Postgres/FreeBSD memory leak)?
> I read that Postgres has its own memory allocator:
> https://www.reddit.com/r/programming/comments/18zija/github_got_30_better_performance_using_tcmalloc/
> Might that be an issue with double allocation/freeing and the "cheese
> hole" topic with memory fragmentation?

If there were a memory leak in either FreeBSD or Postgres of the
seriousness you're describing, and as easy to trigger as you claim, I would
expect the mailing lists and other support forums to be exploding in panic.
Notice that they are not.

Also, I still don't see _ANY_ evidence of a leak. I see evidence that
something is trying to allocate a LOT of RAM that isn't available on your
system; but that's not the same as a leak.

> https://www.opennet.ru/base/dev/fbsdvm.txt.html
>     inactive pages    not actively used by programs which are
>                       dirty and (at some point) need to be written
>                       to their backing store (typically disk).
>                       These pages are still associated with objects and
>                       can be reclaimed if a program references them.
>                       Pages can be moved from the active to the inactive
>                       queue at any time with little adverse effect.
>                       Moving pages to the cache queue has bigger
>                       consequences (note 1)

Correct, but, when under pressure, the system _will_ recycle those pages to
make them available.
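If you want to watch those page queues yourself, FreeBSD exposes the
counters via sysctl. A rough sketch (counter names are from memory and may
differ slightly between releases; the counts are in pages, typically 4 KB
each):

    # page queue sizes, in pages (usually 4 KB each)
    sysctl vm.stats.vm.v_active_count vm.stats.vm.v_inactive_count \
           vm.stats.vm.v_wire_count vm.stats.vm.v_free_count
    # paging activity since boot
    vmstat -s
    # which processes actually hold the resident memory
    # (top's -o flag sorts by the named column; "res" = resident size)
    top -b -o res 20

Watching those numbers while the swap_pager_getswapspace errors fire will
tell you a lot more than a single top snapshot.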
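And to be concrete about "allocate an appropriate amount of swap": adding a
file-backed swap device on FreeBSD is cheap and needs no reboot. A sketch,
assuming a 32G file at /usr/swap0 and md unit 99 (both arbitrary choices on
my part; check mdconfig(8) and swapon(8) before running this on anything
you care about):

    # create and protect the backing file (32 GB here; size to taste)
    dd if=/dev/zero of=/usr/swap0 bs=1m count=32768
    chmod 0600 /usr/swap0
    # attach it as a memory disk and enable it as swap
    mdconfig -a -t vnode -f /usr/swap0 -u 99
    swapon /dev/md99
    # verify
    swapinfo

To make it survive a reboot you'd add a matching md swap line to
/etc/fstab; the FreeBSD Handbook shows the exact syntax.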
Tom might be correct that the system considers those pages inactive because
it could easily push them out to swap, but then can't _actually_ do so
because you haven't allocated enough swap; that doesn't match my
understanding of how the inactive queue is used, though. A question at that
level of detail would be better asked on a FreeBSD forum, as the
differences between VM implementations can be pretty specific and
technical.

[snip]

> Mem: 8020M Active, 19G Inact, 3537M Wired, 299M Cache, 1679M Buf, 38M Free
> Swap: 512M Total, 501M Used, 12M Free, 97% Inuse
>
>   PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
> 77941 pgsql         5  20    0  7921M  7295M usem    7 404:32 10.25% postgres
> 79570 pgsql         1  20    0  7367M  6968M sbwait  6   4:24  0.59% postgres

[snip about 30 identical PG processes]

> 32387 myusername    9  20    0   980M   375M uwait   5  69:03  1.27% node

[snip similar processes]

>   622 myusername    1  20    0   261M  3388K kqread  3  41:01  0.00% nginx

[snip similar processes]

Wait ... this is a combined HTTP/Postgres server? You didn't mention that
earlier, and it's kind of important. What evidence do you have that
Postgres is actually the part of this system running out of memory? I don't
see any such evidence in any of your emails, and (based on experience) I
find it pretty likely that whatever is running under node is doing
something in a horrifically memory-inefficient manner. Since you mention
that you see nothing in the PG logs, that makes it even more likely (to me)
that you're looking entirely in the wrong place.

I'd be willing to bet a steak dinner that if you put the web server on a
different machine than the DB, the memory problems would follow the web
server and not the DB server.

--
Bill Moran

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general