Search Postgresql Archives

Partitioned tables as a poor mans columnar index?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I just realized that my replies to my previous question on sparse
arrays went off list due to the way this list server  is set up
(sigh).  It has occurred to me that for my problem, one possible
solution is columnar indexes and that, in a way, partitioned tables in
Postgres might give me somewhat the same capability.

The basic problem I have is that I have some tables that are
potentially very long (100,000's to millions of rows) and very skinny,
essentially just a FK to a parent table and about 12 bits, maybe less,
of actual data.  Now if I use a traditional table an int for FK is as
large or even larger than the data itself which is why I had been
looking at a sparse array (in which case I don't even need 12 bits for
the data, since part of the information is positional in nature).
However, building the routines to manipulate the arrays could be
painful and it's been suggested that their performance will be
non-optimal.

One alternative might be to use partitioned tables. If I partition the
tables in such a way that the table name can be associated directly to
the FK then I no longer need to actually store the FK in the table,
and I end up with maybe a total of 12 bits of data in each row.
Normally, I'd shy away from such a scheme since it strikes me as
potentially problematic:

1) you've got to join through DB metadata back to the actual data in question;

2) some process has to have create table auths if any of this is to be automated

My bias against 1) might just be because I come from an Oracle
background and it seems that in Postgres this is not as cumbersome or
as frowned on as it might be elsewhere?  The second issue seems
problematic no matter how I look at it.  However, I might be willing
to live with it for this particular problem, particulary if I can talk
myself into believing that I'm building a proxy for columnar indexes
:-)....

So I guess two sets of questions:

1) In general how comfortable are members of the Postgres community in
using table names for partitioned tables where the table name itself
conveys some indirect join relationship?  It seems there is no extra
performance penalty in joining back to the system tables to do a look
up with Postgres so this is perhaps more of a best practices question
(or even a religious issue for relational purists) than anything
else...

2) If you're comfortable with the idea, how far would you go in
embracing it?  In particular, would you add code to Postgres to hide
the fact that you are joining via table name? Would you go as far as
to do it at the system level or would you stick to just wrapping it in
some functions (in which case the create table privilege is still
needed)?

-- 
Peter Hunsberger

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux