Hi! I'm just wondering, I've got a table that is partitioned into monthly tables: media_downloads -> media_downloads_YYYYMM I\- id (primary key) \- created_on (timestamp criteria for the monthly table split) There are constraints upon the created_on column, all needed insert instead rules are defined too. One additional hardship is that id are not monotone against created_on, id1 < id2 does not imply created_on1 <= created_on2 :( The table contains basically almost 100M rows, and the number is growing. (the table will be about a 12GB pg_dump.) All relevant indexes (primary key id, index on created_on) are defined too. The good thing is, queries like all rows in the last 7 days work reasonable fast, the optimizer just checks the 1-2 last month tables. Using postgres 8.1.4-0ubuntu1, I've got to implement the following queries in a reasonable fast way: -- sequential reading of rows SELECT * FROM media_downloads WHERE id > 1000000 ORDER BY id LIMIT 100; Against the same monolithic table with about 16.5M rows, I'm getting a cost of 20.6 pages. (Index scan) Against the partitioned tables, I'm getting a cost of 5406822 pages. Now I understand, that without any additional conditions, postgresql needs to do the query for all subtables first, but explain against the subtables show costs of 4-5 pages. events=# explain select * from media_downloads where id >90000000 order by id limit 100; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=5406822.39..5406822.64 rows=100 width=1764) -> Sort (cost=5406822.39..5413639.50 rows=2726843 width=1764) Sort Key: public.media_downloads.id -> Result (cost=0.00..115960.71 rows=2726843 width=1764) -> Append (cost=0.00..115960.71 rows=2726843 width=1764) -> Seq Scan on media_downloads (cost=0.00..10.50 rows=13 width=1764) Filter: (id > 90000000) -> Index Scan using media_downloads_200510_pkey on media_downloads_200510 media_downloads (cost=0.00..3.75 rows=14 width=243) Index Cond: (id > 90000000) -> Index Scan using media_downloads_200511_pkey on media_downloads_200511 media_downloads (cost=0.00..72.19 rows=172 width=239) Index Cond: (id > 90000000) -> Index Scan using media_downloads_200512_pkey on media_downloads_200512 media_downloads (cost=0.00..603.64 rows=172 width=240) Index Cond: (id > 90000000) -> Index Scan using media_downloads_200601_pkey on media_downloads_200601 media_downloads (cost=0.00..19.33 rows=232 width=239) Index Cond: (id > 90000000) -> Index Scan using media_downloads_200602_pkey on media_downloads_200602 media_downloads (cost=0.00..56.82 rows=316 width=240) Index Cond: (id > 90000000) -> Index Scan using media_downloads_200603_pkey on media_downloads_200603 media_downloads (cost=0.00..18.88 rows=270 width=243) Index Cond: (id > 90000000) -> Index Scan using media_downloads_200604_pkey on media_downloads_200604 media_downloads (cost=0.00..1194.16 rows=939 width=298) Index Cond: (id > 90000000) -> Index Scan using media_downloads_200605_pkey on media_downloads_200605 media_downloads (cost=0.00..79.28 rows=672 width=326) Index Cond: (id > 90000000) -> Index Scan using media_downloads_200606_pkey on media_downloads_200606 media_downloads (cost=0.00..75.26 rows=1190 width=314) Index Cond: (id > 90000000) -> Index Scan using media_downloads_200607_pkey on media_downloads_200607 media_downloads (cost=0.00..55.29 rows=1238 width=319) Index Cond: (id > 90000000) -> Index Scan using media_downloads_200608_pkey on media_downloads_200608 media_downloads (cost=0.00..73.95 rows=1305 width=319) Index Cond: (id > 90000000) -> Index Scan using media_downloads_200609_pkey on media_downloads_200609 media_downloads (cost=0.00..144.10 rows=1575 width=324) Index Cond: (id > 90000000) -> Index Scan using media_downloads_200610_pkey on media_downloads_200610 media_downloads (cost=0.00..113532.57 rows=2718709 width=337) Index Cond: (id > 90000000) -> Seq Scan on media_downloads_200611 media_downloads (cost=0.00..10.50 rows=13 width=1764) Filter: (id > 90000000) -> Seq Scan on media_downloads_200612 media_downloads (cost=0.00..10.50 rows=13 width=1764) Filter: (id > 90000000) (37 rows) events=# explain select * from media_downloads_200610 where id >90000000 order by id limit 100; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..4.18 rows=100 width=337) -> Index Scan using media_downloads_200610_pkey on media_downloads_200610 (cost=0.00..113582.70 rows=2719904 width=337) Index Cond: (id > 90000000) (3 rows) Interestingly, if one reformulates the query like that: SELECT * FROM media_downloads WHERE id > 90000000 AND id < 90001000 ORDER BY id LIMIT 100; results in a reasonable cost of 161.5 pages. Now the above query is basically acceptable, as one iterate all rows this way, but now I need to know max(id) to know when to stop my loop: events=# explain select max(id) from media_downloads; QUERY PLAN -------------------------------------------------------------------------------------------------------------- Aggregate (cost=3676914.56..3676914.58 rows=1 width=4) -> Append (cost=0.00..3444211.85 rows=93081085 width=4) -> Seq Scan on media_downloads (cost=0.00..10.40 rows=40 width=4) -> Seq Scan on media_downloads_200510 media_downloads (cost=0.00..5615.84 rows=139884 width=4) -> Seq Scan on media_downloads_200511 media_downloads (cost=0.00..67446.56 rows=1724356 width=4) -> Seq Scan on media_downloads_200512 media_downloads (cost=0.00..66727.02 rows=1718302 width=4) -> Seq Scan on media_downloads_200601 media_downloads (cost=0.00..88799.91 rows=2321991 width=4) -> Seq Scan on media_downloads_200602 media_downloads (cost=0.00..121525.71 rows=3159571 width=4) -> Seq Scan on media_downloads_200603 media_downloads (cost=0.00..104205.40 rows=2701240 width=4) -> Seq Scan on media_downloads_200604 media_downloads (cost=0.00..342511.42 rows=9391242 width=4) -> Seq Scan on media_downloads_200605 media_downloads (cost=0.00..245167.39 rows=6724039 width=4) -> Seq Scan on media_downloads_200606 media_downloads (cost=0.00..430186.99 rows=11901499 width=4) -> Seq Scan on media_downloads_200607 media_downloads (cost=0.00..451313.72 rows=12380172 width=4) -> Seq Scan on media_downloads_200608 media_downloads (cost=0.00..474743.72 rows=13048372 width=4) -> Seq Scan on media_downloads_200609 media_downloads (cost=0.00..619711.52 rows=15754452 width=4) -> Seq Scan on media_downloads_200610 media_downloads (cost=0.00..426225.45 rows=12115845 width=4) -> Seq Scan on media_downloads_200611 media_downloads (cost=0.00..10.40 rows=40 width=4) -> Seq Scan on media_downloads_200612 media_downloads (cost=0.00..10.40 rows=40 width=4) (18 rows) events=# explain select max(id) from media_downloads_200610; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------- Result (cost=0.04..0.05 rows=1 width=0) InitPlan -> Limit (cost=0.00..0.04 rows=1 width=4) -> Index Scan Backward using media_downloads_200610_pkey on media_downloads_200610 (cost=0.00..475660.29 rows=12115845 width=4) Filter: (id IS NOT NULL) (5 rows) For me as a human, it's obvious, that max(media_downloads) == max(media_downloads_200612..media_downloads_200510). Any ideas how to make the optimizer handle partitioned tables more sensible? Andreas
Attachment:
signature.asc
Description: Dies ist ein digital signierter Nachrichtenteil