Re: nested partitioning

On 09/18/2013 11:34 AM, Shaun Thomas wrote:
On 09/18/2013 07:45 AM, "Gabriel E. Sánchez Martínez" wrote:

Now to the questions.  Would constraint exclusion work in a tree
fashion to prune tables from a query plan?

Constraint exclusion works by checking the constraints themselves. So if your constraints are based on date, they will always work as expected. Assuming the constraints are generated, you could easily check down to the day level with >= desired day and < next day.
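As a rough illustration of the ">= desired day and < next day" bounds, here is a minimal Python sketch that builds the text of such a CHECK clause. The column name reading_date and the helper name are made up for the example; nothing in this thread says how the real constraints are generated.

```python
from datetime import date, timedelta


def day_check_constraint(column, day):
    """Build a half-open range CHECK clause covering exactly one day."""
    next_day = day + timedelta(days=1)
    return (
        f"CHECK ({column} >= DATE '{day.isoformat()}' "
        f"AND {column} < DATE '{next_day.isoformat()}')"
    )


# 'reading_date' is a hypothetical column name used only for illustration.
print(day_check_constraint("reading_date", date(2012, 1, 1)))
```

Using a half-open range means adjacent day partitions never overlap, and month or year boundaries need no special handling because the upper bound is simply the next calendar day.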

With that in mind, if you queried a specific day, the query would remove every partition but the relevant day (or range of days).

I understand that, but I want to know whether all the child day tables would be checked for exclusion, or if instead entire groups of tables can be excluded by checking the year and month first, and then checking the days only for the queried year-month. For 10 years of data, that is a difference between checking 10 years + 12 months + 30 days = 52 partitions vs. checking all 3652 partitions.

Example: Querying for data of 2012-01-01.

Would a query plan prune tables in a tree fashion, as follows?

1. Check partitions 2000, 2001, 2002, ... , 2013. 14 checks. All years but 2012 are excluded. 4644 day partitions excluded with only 14 constraint exclusion checks.

2. Check partitions 2012-01, 2012-02, ... , 2012-12. 12 checks. All months but 2012-01 are excluded. 335 day partitions excluded with only 12 constraint exclusion checks.

3. Check partitions 2012-01-01, 2012-01-02, ... , 2012-01-31. 31 checks. All days but 2012-01-01 are excluded. 30 partitions excluded with 31 constraint exclusion checks.

4. Sequential scan on partition 2012-01-01.

Or would it have to check every child partition, as follows?

1. Check partitions 2000-01-01, 2000-01-02, ... , 2013-09-18. 5010 checks. All days but 2012-01-01 are excluded. 5009 partitions excluded with 5010 constraint exclusion checks.

2. Sequential scan on partition 2012-01-01.

The former would be much more efficient.
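To make the comparison concrete, here is a small Python sketch that only counts the checks in the two scenarios above, for day partitions from 2000-01-01 through 2013-09-18 and a query for 2012-01-01. It is plain arithmetic over calendar dates and assumes nothing about what the planner actually does; the function names are invented for the example.

```python
from datetime import date, timedelta


def day_partitions(first, last):
    """Return every day from first to last inclusive (one leaf partition per day)."""
    return [first + timedelta(days=i) for i in range((last - first).days + 1)]


def flat_checks(days):
    """Flat scenario: every leaf partition is tested against its own constraint."""
    return len(days)


def tree_checks(days, target):
    """Tree scenario: test all years, then the months of the matching year,
    then the days of the matching month."""
    years = {d.year for d in days}
    months = {d.month for d in days if d.year == target.year}
    month_days = [
        d for d in days if (d.year, d.month) == (target.year, target.month)
    ]
    return len(years) + len(months) + len(month_days)


days = day_partitions(date(2000, 1, 1), date(2013, 9, 18))
target = date(2012, 1, 1)
print("flat checks:", flat_checks(days))          # 5010
print("tree checks:", tree_checks(days, target))  # 14 + 12 + 31 = 57
```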


One can see that with a decade of data stored, there would be
approximately 3652 partitions. The documentation discourages having
this many partitions.

And you shouldn't. In our partitions, we try to keep it around one partition per month. I've recently written a library that eschews triggers in favor of a nightly movement job to redistribute data, and it can handle partition granularity on a sliding scale. The default is one month, but my tests show it works properly down to 1 day. Still, the lowest I'd use it is probably 1 week. That would still drastically reduce your estimated number of partitions.

However you decide to do it, don't optimize prematurely. Monthly partitions are enough for us and our 130GB table with 250M rows, and that's for a single year.
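For a sense of scale, this Python sketch counts how many partitions a decade of data would need at daily, weekly, and monthly granularity. The date range 2001-01-01 through 2010-12-31 and the rule that weeks are 7-day buckets counted from the first day are assumptions made for the example, not details of the library mentioned above.

```python
import math
from datetime import date


def partition_count(first, last, granularity):
    """Count partitions needed to cover first..last inclusive."""
    total_days = (last - first).days + 1
    if granularity == "day":
        return total_days
    if granularity == "week":
        # Assumption: 7-day buckets starting at `first`; the last may be partial.
        return math.ceil(total_days / 7)
    if granularity == "month":
        return (last.year - first.year) * 12 + (last.month - first.month) + 1
    raise ValueError(f"unknown granularity: {granularity}")


first, last = date(2001, 1, 1), date(2010, 12, 31)
for granularity in ("day", "week", "month"):
    print(granularity, partition_count(first, last, granularity))
# day 3652
# week 522
# month 120
```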





--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



