Re: Do I have a hardware or a software problem?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Dec 10, 2012 at 2:51 PM, Niels Kristian Schjødt
<nielskristian@xxxxxxxxxxxxx> wrote:

> synchronous_commit = off
>
> The pg_xlog folder has been moved onto the SSD array (md3), and symlinked
> back into the postgres dir.

With synchronous_commit = off, or with large transactions, there is
probably no advantage to moving those to SSD.


> 2)
> When the database is loaded like this, I see a lot of queries talking up to
> 1000 times as long, as they would when the database is not loaded so
> heavily.

What kinds of queries are they?  single-row look-ups, full table scans, etc.


>
> #### Notes and thoughts
> ##############################################################################
>
> As you can see, even though I have moved the pg_xlog folder to the SSD array
> (md3) the by far largest amount of writes still goes to the regular HDD's
> (md2), which puzzles me - what can that be?

Every row you insert or non-HOT update has to do maintenance on all
indexes of that table.  If the rows are not inserted/updated in index
order, this means you every row inserted/updated dirties a randomly
scattered 8KB for each of the indexes.  If you have lots of indexes
per table, that adds up fast.

The fact that there is much more writing than reading tells me that
most of your indexes are in RAM.  The amount of index you are rapidly
reading and dirtying is large enough to fit in RAM, but is not large
enough to fit in shared_buffers + kernel's dirty-buffer comfort level.
 So you are redirtying the same blocks over and over, PG is
desperately dumping them to the kernel (because shared_buffers it too
small to hold them) and the kernel is desperately dumping them to
disk, because vm.dirty_background_ratio is so low.  There is little
opportunity for write-combining, because they don't sit in memory long
enough to accumulate neighbors.

How big are your indexes?

You could really crank up shared_buffers or vm.dirty_background_ratio,
but doing so might cause problems with checkpoints stalling and
latency spikes.  That would probably not be a problem during the
night, but could be during the day.

Rather than moving maintenance to the day and hoping it doesn't
interfere with normal operations, I'd focus on making night-time
maintenance more efficient, for example by dropping indexes (either
just at night, or if some indexes are not useful, just get rid of them
altogether), or cranking up shared_buffers at night, or maybe
partitioning or look into pg_bulkload.

> From stat 3) (the iostat) I notice that the SSD's doesn't seem to be
> something near fully utilized - maybe something else than just pg_xlog could
> be moved her?

I don't know how big each disk is, or how big your various categories
of data are.  Could you move everything to SSD?  Could you move all
your actively updated indexes there?

Or, more fundamentally, it looks like you spent too much on CPUs (86%
idle) and not nearly enough on disks.  Maybe you can fix that for less
money than it will cost you in your optimization time to make the best
of the disks you already have.

Cheers,

Jeff


-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux