Re: pg_dump performance

On Thu, Dec 27, 2007 at 01:14:25PM +0000, Gregory Stark wrote:
> "Jared Mauch" <jared@xxxxxxxxxxxxxxx> writes:
> 
> > 	pg_dump is utilizing about 13% of the cpu and the
> > corresponding postgres backend is at 100% cpu time.
> > (multi-core, multi-cpu, lotsa ram, super-fast disk).
> >...
> > 	pg8.3(beta) with the following variances from default
> >
> > checkpoint_segments = 300        # in logfile segments, min 1, 16MB each
> > effective_cache_size = 512MB    # typically 8KB each
> > wal_buffers = 128MB                # min 4, 8KB each
> > shared_buffers = 128MB            # min 16, at least max_connections*2, 8KB each
> > work_mem = 512MB                 # min 64, size in KB
> 
> Fwiw those are pretty unusual numbers. Normally work_mem is much smaller than
> shared_buffers since you only need one block of memory for shared buffers and
> work_mem is for every query (and every sort within those queries). If you have
> ten queries running two sorts each this setting of work_mem could consume 5GB.

	I'd still have lots of ram left :)

	I'm dealing with normal query results that end up matching 5-10 million
rows on the index (starttime), not counting the filter applied afterwards.  Each
backend rarely makes it over 256MB.
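	For what it's worth, work_mem doesn't have to be raised globally: it can
be overridden per session or per transaction just for the jobs that need the big
sorts.  A sketch (the 64MB global default and the placeholder query are
illustrative assumptions, not settings from this thread):

```sql
-- postgresql.conf keeps a conservative global default, e.g.:
--   work_mem = 64MB        -- illustrative, not the poster's setting

-- Then raise it only for the transaction doing the large sort:
BEGIN;
SET LOCAL work_mem = '512MB';  -- applies only inside this transaction
SELECT ...;                    -- the big starttime-range query goes here
COMMIT;
```

This keeps the worst case (many concurrent queries, each allowed multiple
work_mem-sized sorts) bounded while still letting the heavy queries run fast.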

> Raising shared buffers could improve your pg_dump speed. If all the data is in
> cache it would reduce the time spend moving data between filesystem cache and
> postgres shared buffers.

	I doubt it's all in cache, but I can look into this.  I did not do a
lot of fine-tuning of the numbers, just enough to get past the defaults and
reach an acceptable level of performance.
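	One way to check how much is actually being served from the Postgres
buffer cache is the standard statistics views; a sketch (note that blks_read
counts reads from the OS, which may themselves be satisfied by the filesystem
cache):

```sql
-- Rough buffer-cache hit ratio per database
-- (blks_hit = found in shared_buffers, blks_read = went to the OS):
SELECT datname,
       blks_hit,
       blks_read,
       round(blks_hit::numeric / nullif(blks_hit + blks_read, 0), 3) AS hit_ratio
FROM pg_stat_database;

-- Per-table detail, e.g. for the table being dumped:
SELECT relname, heap_blks_read, heap_blks_hit
FROM pg_statio_user_tables
ORDER BY heap_blks_read DESC;
```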

> What made you raise wal_buffers so high? I don't think it hurts but that's a
> few orders of magnitude higher than what I would expect to help.

	I'm adding chunks of ~1.2m rows every other minute.  Once I increase
my data collection pool, this will go up to around [1]2-3m rows or so.  I
found that raising wal_buffers and checkpoint_segments helped, but I didn't
spend a lot of time tweaking these options.  Do you know of a way to determine
high-watermark numbers for what is actually being used?
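	On the high-watermark question: 8.3 does expose some of this.  A sketch
of two places to look (assuming the statistics collector is on):

```sql
-- New in 8.3: if checkpoints_req dominates checkpoints_timed, checkpoints
-- are being forced by WAL volume and checkpoint_segments may be too low:
SELECT checkpoints_timed, checkpoints_req
FROM pg_stat_bgwriter;

-- In postgresql.conf, logging each checkpoint shows how many buffers it
-- wrote and how long it took:
--   log_checkpoints = on
```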

	- Jared

[1] - I am concerned that with my insert speed being around 100k rows/sec
      and raw pg_dump speed being around 182k rows/sec, I will start getting
      data faster than it can be stored and postprocessed.

-- 
Jared Mauch  | pgp key available via finger from jared@xxxxxxxxxxxxxxx
clue++;      | http://puck.nether.net/~jared/  My statements are only mine.
