Re: understanding postgres issues/bottlenecks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Title: Re: understanding postgres issues/bottlenecks

> Sequential read performance means precisely squat for most database
> loads.  

Depends on the database workload.  Many queries for me may scan 50GB of data for aggregation.
Besides, it is a good test for making sure your RAID card doesn’t suck.  Especially running tests with sequential access CONCURRENT with random access.
A good tuned raid setup will be able to handle a good chunk of sequential access while doing random reads  concurrently.  A bad one will grind to a halt.
The same can be said for concurrent writes and fsyncs with concurrent reads.  Bad cards tend to struggle with this, good ones don’t.

$ sar -b
12:00:01 AM       tps      rtps      wtps   bread/s   bwrtn/s
01:10:01 AM   1913.22   1903.74      9.48 561406.70    326.67
01:20:02 AM   2447.71   2439.97      7.74 930357.08    148.86
01:30:01 AM   1769.77   1740.41     29.35 581015.86   3729.37
01:40:01 AM   1762.05   1659.06    102.99 477730.70  26137.96

And disk utilization did not go past 85% or so during the peak load, usually much less (8 cores, 12.5% would reflect a CPU).

12:00:01 AM       CPU     %user     %nice   %system   %iowait    %steal     %idle
01:10:01 AM       all     47.92      0.00     12.92     10.22      0.00     28.94
01:20:02 AM       all     67.97      0.00     17.93      3.47      0.00     10.63
01:30:01 AM       all     46.67      0.00     10.60      7.43      0.00     35.29
01:40:01 AM       all     59.22      0.03      9.88      5.67      0.00     25.21

The workload regularly bursts to 900MB/sec with concurrent sequential scans.


> The dell stuff is ok....decent RAID 5 performance and mediocre
> raid 10.  Unfortunately switching the disks to jbod and going software
> raid doesn't seem to help much.  The biggest problem with dell
> hardware that I see is that overflowing the raid cache causes the
> whole system to spectacularly grind to a halt, causing random delays.

The Adaptec stuff doesn’t have the issues with cache overflow.  For pure random access stuff the Dell Perc 6 is pretty good, but mix read/write it freaks out and has inconsistent performance.  A PERC 6 does perform better than a 3Ware 9650 for me though.  Those are both on my crap list, with 3Ware 9550 and PERC 5 both much worse.
Both got about 200 iops per drive on random access.


> To the OP, it looks like you are getting about 300 or so tps out of
> sdc (80% read), which is where I'm assuming the data is.  I'm guessing
> most of that is random traffic.  Here's the bad news: while this is on
> the low side for a 6 disk raid 10 7200 rpm, it's probably about what
> your particular hardware can do.  I have some general suggestions for
> you:
> *) upgrade hardware: more/faster disks, etc
> *) disable fsync (dangerous!) can risk data loss, but maybe you have
> redundancy built in a different place.  This will let linux reorganize
> i/o on top of what the hardware is doing.
> *) upgrade to postgres 8.3.  Numerous efficiency advantages, and has
> the synchronous_commit setting, which is 'fsync lite'...most of the
> advantages and a lot less risk.
> *) tune the app
>
> merlin

Agree with all of the above.
The xlogs are on sdb, which is not I/O bound, so I am not sure how much changing fsync will help.
I second upgrading to 8.3 which is generally faster and will reduce random i/o if any sequential scans are kicking out random access data from shared_buffers.

If there is budget for an upgrade, how big is the data set, and how much will it grow?
For  $1200 get two Intel X25-M SSD’s and all random iops issues will be gone (8k iops in raid 1).  Double that if 4 drives in raid 10.  Unfortunately, each pair only stores 80GB.  But for many, that is plenty.

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux