Re: pg_restore out of memory

Miguel Ramos <org.postgresql@xxxxxxxxxxxxxxxxx> · Wed, 13 Jul 2016 22:07:52 +0100

A Ter, 12-07-2016 às 13:08 +0000, Sameer Kumar escreveu:
> On Tue, 12 Jul 2016, 7:25 p.m. Miguel Ramos,
> <org.postgresql@xxxxxxxxxxxxxxxxx> wrote:
> > I found two relevant threads on the mailing-lists.
> > The most recent one sugested that postgresql was being configured
> > to use
> > more memory than what's available.
> > The older one sugested that the system limits on the size of the
> > data or
> > stack segments were lower than required.
> > 
> > So here are some server parameters (relevant or otherwise):
> > 
> >  > max_connections = 100
> >  > shared_buffers = 4GB  -- 25% of RAM
> >  > temp_buffers = 32MB  -- irrelevant?
> >  > work_mem = 64MB
> >  > maintenance_work_mem = was 1G lowered to 256M then 64M
> 
> 
> Why did you lower it? I think increasing it should help better. But
> 1GB seems like fine.

The advise was on that thread and maybe the problem was very different.
The idea I get is that PostgreSQL can always find a way to do its work,
maybe using an out of core algorithm.

If you tell it to use a lot of memory, then it will try to use RAM and
then it really may run out of memory.

So, basically, increasing the memory available is a performance
improvement, if you feel safe that the memory really is available.

But maybe that logic applies only to work_mem...
And it's also work_mem that is difficult to bound, according to the
manual.
I don't really know...

> >  > wal_buffers = -1  -- should mean 1/32 of shared_buffers = 128MB
> 
> Increase this during the restore, may be 512MB

I retain the advise, but now I have posted the log messages to the
list.

> >  > checkpoint_segments = 64  -- WAL segments are 16MB
> >  > effective_cache_size = 8GB  -- irrelevant?
> > 
> > 
> > I suspect that the restore fails when constructing the indices.
> > After
> > the process is aborted, the data appears to be all or most there,
> > but no
> > indices.
> 
> What is logged in database log files? Have you checked that?

This time I collected the logs.
I posted the 10 relevant lines as a reply to another message.
I'll repeat only the ERROR line here:

Jul 13 21:08:06 ema postgres[97889]: [868-1] ERROR:  unexpected message type 0x58 during COPY from stdin

> What are your vm.dirty_ratio and vm.dirty_background_ratio? I think
> reducing them may help. But can not really say what exactly would
> help unless you are able to get the error source in db logs

This is a FreeBSD server.
I'm not really sure what the equivalent would be.
Also, I don't think tunning the VM would help.

This is quite a deterministic abort, 12-13 hours after the beginning of
the restore, and does not change much whether it is done during the
night or during the day with 10 people working intensively.

Thanks,

-- Miguel

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general