Re: Query performance problems with partitioned tables

On 5/4/07, Scott Marlowe <smarlowe@xxxxxxxxxxxxxxxxx> wrote:
On Thu, 2007-05-03 at 21:37, Merlin Moncure wrote:
> On 5/3/07, Fei Liu <fei.liu@xxxxxxxxxxxxxxx> wrote:
> > Hello, Andreas, I too am having exactly the same issue as you.
> > Comparing my partitioned and plain table performance, I've found
> > that the plain tables perform about 25% faster than the partitioned
> > ones.  Using 'explain select ...', I can see that the constraints
> > are being used, so fewer rows are examined in the partitioned
> > tables.  But the partitioned tables are still 25% slower; what a
> > letdown.
>
> That's a little harsh.  The main use of partitioning is not to make
> queries faster but to make maintenance easier.  When constraint
> exclusion works well for a particular query you can get a small
> boost, but many queries will perform much worse.  So you are
> sacrificing query flexibility for easier maintenance, and you have
> to be really careful how you use it.
>
> The best case for partitioning is when you can logically divide up
> your data so that you really only have to deal with one sliver of it
> at a time...for joins and such.  If the OP could force constraint
> exclusion (maybe by hashing the timestamp down to a period and using
> that in the where clause), his query would be fine.  The problem is
> that it's not always easy to do that.
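
For illustration, a minimal sketch of that period trick using 8.x-style
inheritance partitioning (the table and column names here are invented,
not from the OP's schema):

    -- Parent table carries a period column derived from the timestamp,
    -- e.g. period 200705 for May 2007.
    CREATE TABLE measurement (
        ts      timestamp NOT NULL,
        period  int       NOT NULL,
        payload text
    );

    -- One child per period, each with a CHECK constraint the planner
    -- can use for constraint exclusion.
    CREATE TABLE measurement_200704 (CHECK (period = 200704))
        INHERITS (measurement);
    CREATE TABLE measurement_200705 (CHECK (period = 200705))
        INHERITS (measurement);

    -- Restricting on the period constant lets the planner skip the
    -- other partitions entirely.
    SET constraint_exclusion = on;
    SELECT * FROM measurement
     WHERE period = 200705
       AND ts >= '2007-05-01' AND ts < '2007-05-04';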

Agree++

I've been testing partitioning for a zip code lookup thing that was
posted here earlier, and I partitioned a 10,000,000 row set into about
400 partitions.  I found that selecting a range of areas defined by
x/y coordinates was faster without any indexes: the same selection
against one big table with one big (x,y) index typically took 3 to 10
seconds, while the same select against the partitions with no indexes
took 0.2 to 0.5 seconds.
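
Roughly like this, as a sketch (the names and grid cell size are made
up; my actual schema isn't shown here):

    -- Parent table holding the 10M points.
    CREATE TABLE zips (x int NOT NULL, y int NOT NULL, zip text);

    -- One child per grid cell; a 20x20 grid over the coordinate space
    -- gives about 400 partitions.
    CREATE TABLE zips_0_0
        (CHECK (x >= 0 AND x < 500 AND y >= 0 AND y < 500))
        INHERITS (zips);
    CREATE TABLE zips_0_1
        (CHECK (x >= 0 AND x < 500 AND y >= 500 AND y < 1000))
        INHERITS (zips);
    -- ... and so on for the remaining cells.

    -- With constraint_exclusion = on, a bounding box query only scans
    -- the handful of cells it overlaps, so seq scans of a few small
    -- partitions beat one big index scan.
    SELECT * FROM zips
     WHERE x BETWEEN 100 AND 400 AND y BETWEEN 600 AND 900;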

I was thinking about that problem... one approach I was playing with
was to normalize the 10mm table down to zipcode (chopping off the +4)
and then doing bounding box ops on the zipcode table (using
earthdistance/gist) and also on the detail table using traditional
tactics or gist.  I think this would give reasonable performance
without partitioning (10mm records doesn't scare me anymore!).  If the
records are frequently updated you may want to partition anyway,
though, due to (pre-HOT) vacuum issues.
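
Something like this, assuming the cube and earthdistance contrib
modules are installed (the table layout is just a guess):

    -- Zipcode table normalized down to the 5-digit code.
    CREATE TABLE zipcode (
        zip char(5) PRIMARY KEY,
        lat float8 NOT NULL,
        lon float8 NOT NULL
    );

    -- GiST index over the earthdistance representation of the point.
    CREATE INDEX zipcode_earth_idx ON zipcode
        USING gist (ll_to_earth(lat, lon));

    -- Zipcodes within ~10km of a point: the indexed earth_box test
    -- does the bounding box, then earth_distance trims the corners.
    SELECT zip
      FROM zipcode
     WHERE earth_box(ll_to_earth(41.0, -73.0), 10000)
           @> ll_to_earth(lat, lon)
       AND earth_distance(ll_to_earth(41.0, -73.0),
                          ll_to_earth(lat, lon)) < 10000;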

merlin

