Re: High CPU usage after partitioning

Merlin Moncure <mmoncure@xxxxxxxxx> · Tue, 22 Jan 2013 08:04:39 -0600

On Tue, Jan 22, 2013 at 7:34 AM, rudi <rudolone@xxxxxxxxx> wrote:
> Every query has been carefully optimized, child tables are indexed. The
> table(s) has a UNIQUE index on ("date_taken", "device_id") and "date_taken"
> is the partitioning column (one partition per year).
> There are few well known access path to this table: INSERTs (40-50.000 each
> day), SELECTs on a specific device_id AND on a specific day.
>
> BUT, I discovered an access path used by a process every few secs. to get
> the last log for a given device, and this query became really slow after
> partitioning:
>
> Result  (cost=341156.04..341182.90 rows=4 width=86) (actual
> time=1132.326..1132.329 rows=1 loops=1)
>   InitPlan 1 (returns $0)
>     ->  Aggregate  (cost=341156.03..341156.04 rows=1 width=8) (actual
> time=1132.295..1132.296 rows=1 loops=1)
>           ->  Append  (cost=0.00..341112.60 rows=17371 width=8) (actual
> time=45.600..1110.057 rows=19016 loops=1)
>                 ->  Seq Scan on sb_logs  (cost=0.00..0.00 rows=1 width=8)
> (actual time=0.000..0.000 rows=0 loops=1)
>                       Filter: (device_id = 901)
>                 ->  Index Scan using
> sb_logs_2012_on_date_taken_and_device_id on sb_logs_2012 sb_logs
> (cost=0.00..319430.51 rows=16003 width=8) (actual time=45.599..1060.143
> rows=17817 loops=1)
>                       Index Cond: (device_id = 901)
>                 ->  Index Scan using
> sb_logs_2013_on_date_taken_and_device_id on sb_logs_2013 sb_logs
> (cost=0.00..21663.39 rows=1363 width=8) (actual time=0.022..47.661 rows=1199
> loops=1)
>                       Index Cond: (device_id = 901)
>                 ->  Bitmap Heap Scan on sb_logs_2014 sb_logs
> (cost=10.25..18.71 rows=4 width=8) (actual time=0.011..0.011 rows=0 loops=1)
>                       Recheck Cond: (device_id = 901)
>                       ->  Bitmap Index Scan on
> sb_logs_2014_on_date_taken_and_device_id  (cost=0.00..10.25 rows=4 width=0)
> (actual time=0.008..0.008 rows=0 loops=1)
>                             Index Cond: (device_id = 901)
>   ->  Append  (cost=0.00..26.86 rows=4 width=86) (actual
> time=1132.325..1132.328 rows=1 loops=1)
>         ->  Seq Scan on sb_logs  (cost=0.00..0.00 rows=1 width=90) (actual
> time=0.002..0.002 rows=0 loops=1)
>               Filter: ((device_id = 901) AND (date_taken = $0))
>         ->  Index Scan using sb_logs_2012_on_date_taken_and_device_id on
> sb_logs_2012 sb_logs  (cost=0.00..10.20 rows=1 width=90) (actual
> time=1132.314..1132.314 rows=0 loops=1)
>               Index Cond: ((date_taken = $0) AND (device_id = 901))
>         ->  Index Scan using sb_logs_2013_on_date_taken_and_device_id on
> sb_logs_2013 sb_logs  (cost=0.00..8.39 rows=1 width=91) (actual
> time=0.007..0.008 rows=1 loops=1)
>               Index Cond: ((date_taken = $0) AND (device_id = 901))
>         ->  Index Scan using sb_logs_2014_on_date_taken_and_device_id on
> sb_logs_2014 sb_logs  (cost=0.00..8.27 rows=1 width=72) (actual
> time=0.002..0.002 rows=0 loops=1)
>               Index Cond: ((date_taken = $0) AND (device_id = 901))
> Total runtime: 1132.436 ms
>
> I must find a better way to get that information, but I wonder if there
> could be a better plan. The same query over a table with the same structure
> but not partitioned gives far better plan:
>
> Index Scan using index_iv_logs_on_date_taken_and_device_id on iv_logs
> (cost=12.35..21.88 rows=1 width=157) (actual time=0.065..0.066 rows=1
> loops=1)
>   Index Cond: ((date_taken = $1) AND (device_id = 1475))
>   InitPlan 2 (returns $1)
>     ->  Result  (cost=12.34..12.35 rows=1 width=0) (actual time=0.059..0.059
> rows=1 loops=1)
>           InitPlan 1 (returns $0)
>             ->  Limit  (cost=0.00..12.34 rows=1 width=8) (actual
> time=0.055..0.056 rows=1 loops=1)
>                   ->  Index Scan Backward using
> index_iv_logs_on_date_taken_and_device_id on iv_logs  (cost=0.00..261052.53
> rows=21154 width=8) (actual time=0.055..0.055 rows=1 loops=1)
>                         Index Cond: ((date_taken IS NOT NULL) AND (device_id
> = 1475))
> Total runtime: 0.110 ms

let's see the query -- it's probably written in such a way so as to
not be able to be optimized through CE.

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance