Re: Transparent table partitioning in future version of PG?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 5/7/09 1:54 AM, "Simon Riggs" <simon@xxxxxxxxxxxxxxx> wrote:

> 
> 
> On Thu, 2009-05-07 at 10:56 +0800, Craig Ringer wrote:
>> Tom Lane wrote:
>> 
>> I also expect that in the future there will be demand for striping data
>> across multiple partitions in different tablespaces to exploit
>> in-parallel scanning (when/if supported) for better I/O utilization in
>> multiple-disk-array situations. For example, partitioning on
>> "MOD(id,10)" across 10 separate volumes, and firing off 10 concurrent
>> scans, one per partition, to satisfy a query.
> 
> That's a good summary. It has already been documented and discussed, but
> saying it again and again is the best way to get this across.
> 
> You've highlighted that partitioning is a feature with many underlying
> requirements: infrequent access to data (frequently historical),

Actually, infrequent access is not a requirement.  It is a common
requirement however.

Take for instance, a very large set of data that contains an integer column
'type_id' that has about 200 distinct values.  The data is accessed with a
strict 'type_id = X' requirement 99.9% of the time.  If this was one large
table, then scans of all sorts become much more expensive than if it is
partitioned on 'type_id'.  Furthermore, partitioning on type_id removes the
requirement to even index on this value.  Statistics on each partition may
vary significantly, and the plannner can thus adapt to changes in the data
per value of type_id naturally.

The raw need is not "infrequent access" but highly partitioned access.  It
doesn't matter if your date-partitioned data is accessed evenly across all
dates or skewed to the most frequent -- it matters that you are almost
always accessing by small date ranges.

> striping for parallelism and getting around RDBMS flaws (if any). We
> must be careful to implement each requirement in full, yet separately,
> so we don't end up with 60% functionality in each case by delivering an
> average or least common denominator solution.
> 
> --
>  Simon Riggs           www.2ndQuadrant.com
>  PostgreSQL Training, Services and Support
> 
> 
> --
> Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
> 


-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux