This is not a performance bug -- my query runs in a reasonable amount of time, but I would like to see if I can make this calculation any faster in my setup.

I have a table:

    volume_id         serial primary key
    switchport_id     integer not null
    in_octets         bigint not null
    out_octets        bigint not null
    insert_timestamp  timestamp default now()

with indexes on volume_id, switchport_id, and insert_timestamp. It is partitioned into about 3000 tables by switchport_id (an FK to a lookup table), and each partition currently has about 30,000 rows (a row is inserted into each table every 5 minutes).

My select queries filter on switchport_id and timestamp. Constraint exclusion uses the switchport_id to pick the right partition, and insert_timestamp has an index on it (on each partition). Any time the volume tables are queried, it is to calculate the deltas between each row's in_octets and out_octets and the previous row's (ordered by timestamp). Deltas are used because the external system the data is retrieved from will sometimes roll the counters over.
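To make the rollover handling concrete, here is a minimal sketch (plain Python outside the database, with hypothetical sample values) of the per-row delta logic the function below implements: when a counter is smaller than the previous sample, the counter is assumed to have rolled over and the raw current value is taken as the delta.

```python
def octet_delta(prev, cur):
    """Delta between two cumulative counter samples.

    Mirrors the SQL CASE expression: when the current value is below the
    previous one, the counter wrapped, so the current value itself is used
    as the delta; otherwise it is a plain difference.
    """
    if cur < prev:
        return cur
    return cur - prev

# Hypothetical 5-minute samples of a cumulative octet counter;
# the 600 -> 40 step is a rollover:
samples = [100, 250, 600, 40]
deltas = [octet_delta(a, b) for a, b in zip(samples, samples[1:])]
print(deltas)  # [150, 350, 40]
```

Like the SQL, this understates the true delta across a wrap (the part counted before the rollover is lost), but it keeps the value sane instead of negative.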
I have a function to do this calculation:

    create or replace function traffic.get_delta_table(
        p_switchport_id integer,
        p_start_date date,
        p_end_date date)
    returns table(
        volume_id integer,
        insert_timestamp timestamp,
        out_delta bigint,
        out_rate bigint,
        out_rate_order bigint,
        in_delta bigint,
        in_rate bigint,
        in_rate_order bigint)
    as $$
    declare
    begin
        -- we need to force pgsql to make a new plan for each query so it can
        -- use constraint exclusion on switchport_id to determine the
        -- partition table to scan
        return query execute
        'select t.volume_id,
                t.insert_timestamp,
                t.out_delta,
                t.out_delta * 8 / t.time_difference as out_rate,
                row_number() over (order by t.out_delta * 8 / t.time_difference) as out_rate_order,
                t.in_delta,
                t.in_delta * 8 / t.time_difference as in_rate,
                row_number() over (order by t.in_delta * 8 / t.time_difference) as in_rate_order
         from (select n.volume_id,
                      n.insert_timestamp,
                      extract(epoch from (n.insert_timestamp
                          - lag(n.insert_timestamp, 1, n.insert_timestamp)
                            over (order by n.insert_timestamp)))::integer as time_difference,
                      case when n.out_octets < lag(out_octets, 1, n.out_octets) over (order by n.insert_timestamp)
                           then n.out_octets
                           else n.out_octets - lag(out_octets, 1, n.out_octets) over (order by n.insert_timestamp)
                      end as out_delta,
                      case when n.in_octets < lag(in_octets, 1, n.in_octets) over (order by n.insert_timestamp)
                           then n.in_octets
                           else n.in_octets - lag(in_octets, 1, n.in_octets) over (order by n.insert_timestamp)
                      end as in_delta
               from volume as n
               where n.insert_timestamp between $1 and $2
                 and n.switchport_id = ' || p_switchport_id || '
                 and in_octets != 0
                 and out_octets != 0) as t
         where time_difference is not null
           and time_difference != 0'
        using p_start_date, p_end_date;
    end;
    $$ language plpgsql;

The plan for the query inside the function:

    WindowAgg  (cost=2269.62..2445.35 rows=6390 width=32) (actual time=7526.526..7531.855 rows=6622 loops=1)
      ->  Sort  (cost=2269.62..2285.60 rows=6390 width=32) (actual time=7526.497..7527.924 rows=6622 loops=1)
            Sort Key: (((t.in_delta * 8) / t.time_difference))
            Sort Method: external sort  Disk: 432kB
            ->  WindowAgg  (cost=1753.90..1865.72 rows=6390 width=32) (actual time=2613.593..2618.727 rows=6622 loops=1)
                  ->  Sort  (cost=1753.90..1769.87 rows=6390 width=32) (actual time=2613.566..2614.550 rows=6622 loops=1)
                        Sort Key: (((t.out_delta * 8) / t.time_difference))
                        Sort Method: quicksort  Memory: 710kB
                        ->  Subquery Scan on t  (cost=978.89..1350.00 rows=6390 width=32) (actual time=2582.254..2606.708 rows=6622 loops=1)
                              Filter: ((t.time_difference IS NOT NULL) AND (t.time_difference <> 0))
                              ->  WindowAgg  (cost=978.89..1269.32 rows=6454 width=28) (actual time=2582.243..2596.546 rows=6623 loops=1)
                                    ->  Sort  (cost=978.89..995.03 rows=6454 width=28) (actual time=2582.120..2583.172 rows=6623 loops=1)
                                          Sort Key: n.insert_timestamp
                                          Sort Method: quicksort  Memory: 710kB
                                          ->  Result  (cost=8.87..570.49 rows=6454 width=28) (actual time=1036.720..2576.755 rows=6623 loops=1)
                                                ->  Append  (cost=8.87..570.49 rows=6454 width=28) (actual time=1036.718..2574.719 rows=6623 loops=1)
                                                      ->  Bitmap Heap Scan on volume n  (cost=8.87..12.90 rows=1 width=28) (actual time=0.055..0.055 rows=0 loops=1)
                                                            Recheck Cond: ((switchport_id = 114) AND (insert_timestamp >= '2011-02-01 00:00:00'::timestamp without time zone) AND (insert_timestamp <= '2011-03-02 00:00:00'::timestamp without time zone))
                                                            Filter: ((in_octets <> 0) AND (out_octets <> 0))
                                                            ->  BitmapAnd  (cost=8.87..8.87 rows=1 width=0) (actual time=0.053..0.053 rows=0 loops=1)
                                                                  ->  Bitmap Index Scan on volume_parent_switchport_id_idx  (cost=0.00..4.30 rows=7 width=0) (actual time=0.045..0.045 rows=0 loops=1)
                                                                        Index Cond: (switchport_id = 114)
                                                                  ->  Bitmap Index Scan on volume_parent_insert_timestamp_idx  (cost=0.00..4.32 rows=7 width=0) (never executed)
                                                                        Index Cond: ((insert_timestamp >= '2011-02-01 00:00:00'::timestamp without time zone) AND (insert_timestamp <= '2011-03-02 00:00:00'::timestamp without time zone))
                                                      ->  Bitmap Heap Scan on volume_114 n  (cost=142.40..557.59 rows=6453 width=28) (actual time=1036.662..2573.116 rows=6623 loops=1)
                                                            Recheck Cond: ((insert_timestamp >= '2011-02-01 00:00:00'::timestamp without time zone) AND (insert_timestamp <= '2011-03-02 00:00:00'::timestamp without time zone))
                                                            Filter: ((in_octets <> 0) AND (out_octets <> 0) AND (switchport_id = 114))
                                                            ->  Bitmap Index Scan on volume_114_insert_timestamp_idx  (cost=0.00..140.78 rows=6453 width=0) (actual time=981.034..981.034 rows=6623 loops=1)
                                                                  Index Cond: ((insert_timestamp >= '2011-02-01 00:00:00'::timestamp without time zone) AND (insert_timestamp <= '2011-03-02 00:00:00'::timestamp without time zone))
    Total runtime: 7567.261 ms

This ends up taking anywhere from 300 ms to 7000 ms (usually around 300-400 ms) and returns about 8000 rows.

To get the 95th percentile, it's a couple of simple selects after putting the result set from the above function into a temporary table:

    create temporary table deltas on commit drop as
        select * from get_delta_table(p_switchport_id, p_start_date, p_end_date);

    select round(count(volume_id) * 0.95) into v_95th_row from deltas;
    select in_rate into v_record.in_95th from deltas where in_rate_order = v_95th_row;
    select out_rate into v_record.out_95th from deltas where out_rate_order = v_95th_row;
    select sum(in_delta), sum(out_delta) into v_record.in_total, v_record.out_total from deltas;

Unfortunately, using a temporary table means that I cannot run this query on the read-only slave, but I can't see a way around using one. The master has 3000 inserts running against it every 5 minutes -- this used to be every 1 minute, but the space and time for calculating 5x the current number of rows was not worth it.
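The percentile selection above amounts to: rank the rates ascending (the in_rate_order/out_rate_order columns), then pick the row whose rank is round(0.95 * n). A small sketch of that arithmetic outside the database (plain Python, hypothetical rate values; note Python's round() breaks .5 ties differently from PostgreSQL's, though that only matters when 0.95 * n lands exactly on .5):

```python
def percentile_95(rates):
    """95th percentile computed the same way as the selects above:
    sort ascending, then take the value at rank round(0.95 * n)."""
    ordered = sorted(rates)           # plays the role of in_rate_order/out_rate_order
    row = round(len(ordered) * 0.95)  # v_95th_row; ranks are 1-based
    return ordered[row - 1]

# With 20 hypothetical rate samples, rank round(19.0) = 19 is returned:
rates = list(range(10, 210, 10))      # 10, 20, ..., 200
print(percentile_95(rates))           # 190
```

With ~8000 rows, round(8000 * 0.95) picks rank 7600, i.e. the rate that 95% of the samples fall at or below.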
My server has 1 GB of memory, runs Red Hat, and the only daemon is Postgres:

    effective_cache_size = 768MB
    shared_buffers = 256MB
    work_mem = 2MB      (I changed it when explain analyze was showing 1.5MB used for an on-disk sort)
    max_locks_per_transaction = 3000  (I changed it when I started getting the error that suggests raising it)

Any ideas on speeding this up?

Thanks,
Landreville

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance