On Fri, 2006-01-27 at 20:23 -0500, Mike Biamonte wrote:

> Does anyone have any experience with extremely large data sets?
> I mean hundreds of millions of rows.

Sure, I think more than a few of us do.  Just today I built a summary
table from a 25GB primary table with ~430 million rows.  This took
about 45 minutes.

> The queries I need to run on my 200 million transactions are
> relatively simple:
>
>   select month, count(distinct(cardnum)), count(*), sum(amount)
>     from transactions group by month;
>
> This query took 18 hours on PG 8.1 on a Dual Xeon, RHEL3 (2.4
> kernel) with RAID-10 (15K drives) and 12 GB RAM.  I was expecting it
> to take about 4 hours - based on some experience with a similar
> dataset on a different machine (RH9, PG 7.3, Dual Xeon, 4 GB RAM,
> RAID-5 10K drives).

Possibly the latter machine has a faster I/O subsystem.  How large is
the table on disk?

> This machine is COMPLETELY devoted to running these relatively
> simple queries one at a time.  (No multi-user support needed!)
> I've been tooling with the various performance settings:
> effective_cache_size at 5 GB, shared_buffers at 2 GB, work_mem and
> sort_mem at 1 GB each.
> (Shared buffers puzzles me a bit - my instinct says to set it as
> high as possible, but everything I read says that "too high" can
> hurt performance.)
>
> Any ideas for performance tweaking in this kind of application would
> be greatly appreciated.  We've got indexes on the fields being
> grouped, and always vacuum analyze after building them.

The vacuum probably makes no difference here.

> It's difficult to just "try" various ideas because each attempt
> takes a full day to test.  Real experience is needed here!

Can you send us an EXPLAIN of the query?  I believe what you're seeing
here is probably:

  Aggregate
  +- Sort
     +- Sequential Scan

or perhaps:

  Aggregate
  +- Index Scan

I have a feeling that the latter will be much faster.  If your table
has been created over time, then it is probably naturally ordered by
date, and therefore also ordered by month.  You might expect a
Sequential Scan to be the fastest, but the Sort step will be a killer.
On the other hand, if your table is badly disordered by date, the
Index Scan could also be very slow.

Anyway, send us the query plan and also perhaps a sample of vmstat
during the query.  (There are a few SQL sketches below my sig: one for
capturing the plan, one for the per-query work_mem bump, and one
showing the kind of summary table I mean.)

For what it's worth, I have:

  effective_cache_size | 700000
  cpu_tuple_cost       | 0.01
  cpu_index_tuple_cost | 0.001
  random_page_cost     | 3
  shared_buffers       | 50000
  temp_buffers         | 1000
  work_mem             | 1048576   <= for this query only

And here's a few lines from vmstat during the query:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff   cache   si   so    bi    bo    in    cs us sy id wa
 2  1     76  43476  94916 7655148    0    0 78800     0  1662   788 68 12  0 20
 1  1     76  45060  91196 7658088    0    0 78028     0  1639   712 71 11  0 19
 2  0     76  44668  87624 7662960    0    0 78924    52  1650   736 69 12  0 19
 2  0     76  45300  83672 7667432    0    0 83536    16  1688   768 71 12  0 18
 1  1     76  45744  80652 7670712    0    0 84052     0  1691   796 70 12  0 17

That's about 80 MB/sec of sequential input, for comparison purposes.

-jwb
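
P.S.  The sketches, as promised.  First, capturing the plan.  This
assumes your table and columns really are named transactions, month,
cardnum, and amount, as in your quoted query.  Plain EXPLAIN only asks
the planner and returns immediately; EXPLAIN ANALYZE actually runs the
query, so only use it if you can spare another multi-hour run.

  -- Planner's choice only; does not execute the query.
  EXPLAIN
  SELECT month, count(DISTINCT cardnum), count(*), sum(amount)
    FROM transactions
   GROUP BY month;

  -- Runs the query and reports actual row counts and timings.
  EXPLAIN ANALYZE
  SELECT month, count(DISTINCT cardnum), count(*), sum(amount)
    FROM transactions
   GROUP BY month;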
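
Second, the "for this query only" work_mem setting above is just a
per-session SET, so the value in postgresql.conf stays modest.  In 8.1
work_mem is measured in kilobytes, so 1048576 is roughly 1 GB - pick
whatever your RAM actually allows.

  -- Raise work_mem for this session only; RESET (or disconnecting)
  -- puts it back to the postgresql.conf value.
  SET work_mem = 1048576;   -- kilobytes, i.e. about 1 GB

  SELECT month, count(DISTINCT cardnum), count(*), sum(amount)
    FROM transactions
   GROUP BY month;

  RESET work_mem;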
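
Finally, the summary table I mentioned is nothing fancier than
aggregating once into a small table and querying that from then on.
The names below are made up for illustration, not your schema or my
exact statement:

  -- One expensive pass over the big table, then cheap queries after.
  CREATE TABLE transactions_monthly AS
  SELECT month,
         count(DISTINCT cardnum) AS cards,
         count(*)                AS txns,
         sum(amount)             AS total_amount
    FROM transactions
   GROUP BY month;

  -- Let the planner know about the new table.
  ANALYZE transactions_monthly;

You would have to rebuild (or incrementally update) it as new
transactions arrive, of course.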