On 09/18/2013 11:34 AM, Shaun Thomas wrote:
On 09/18/2013 07:45 AM, "Gabriel E. Sánchez Martínez" wrote:
Now to the questions. Would constraint exclusion work in a tree
fashion to prune tables from a query plan?
Constraint exclusion works by checking the constraints themselves. So
if your constraints are based on date, they will always work as
expected. Assuming they're generated, you could easily check down to
the day level by >= desired day, and < next day.
With that in mind, if you queried a specific day, the query would
remove every partition but the relevant day (or range of days).
I understand that, but I want to know whether all the child day tables
would be checked for exclusion, or if instead entire groups of tables
can be excluded by checking the year and month first, and then checking
the days only for the queried year-month. For 10 years of data, that is
a difference between checking 10 years + 12 months + 30 days = 52
partitions vs. checking all 3652 partitions.
Example: Querying for data of 2012-01-01.
Would a query plan prune tables in a tree fashion, as follows?
#
Check partitions 2000, 2001, 2002, ... , 2013. 14 checks. All years
but 2012 are excluded. Approximately 4748 partitions excluded with only
14 constraint exclusion checks.
Check partitions 2012-01, 2012-02, ... , 2012-12. 12 checks. All
months but 2012-01 are excluded. 325 partitions excluded with only 12
constraint exclusion checks.
Check partitions 2012-01-01, 2012-01-02, ... , 2012-01-31. 31 checks.
All days but 2012-01-01 are excluded. 30 partitions excluded with 31
constraint exclusion checks.
Sequential scan on partition 2012-01-01.
#
Or would it have to check every child partition, as follows?
#
Check partition 2000-01-01, 2000-01-02, ... , 2013-09-18. 5010 checks.
All days but 2012-01-01 are excluded. 5009 partitions excluded with
5010 constraint exclusion checks.
Sequential scan on partition 2012-01-01.
#
The former would be much more efficient.
One can see that with a decade of data stored, there would be
approximately 3652 partitions. The documentation discourages having
this many partitions.
And you shouldn't. In our partitions, we try to keep it around one
partition per month. I've recently written a library that eschews
triggers in favor of a nightly movement job to redistribute data, and
it can handle partition granularity on a sliding scale. The default is
one month, but my tests show it works properly down to 1 day. Still,
the lowest I'd use it is probably 1 week. That would still drastically
reduce your amount of estimated partitions.
However you decide to do it, don't optimize prematurely. Monthly
partitions are enough for us and our 130GB table with 250M rows, and
that's for a single year.
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general