On Aug 7, 2007, at 6:13 PM, Mark Makarowsky wrote:
Can you provide more detail on what you mean by your
two suggestions below:
Yeah, I've used "vertical partitioning" very
successfully in the past, though I've never done it
for just a single field. I'll typically leave the few
most common fields in the "main" table and pull
everything else into a second table.
Vertical partitioning is where you split up your table on disk by
columns, i.e on the vertical lines. He quoted it because Postgres
doesn't actually support it transparently but you can always fake it
by splitting up your table. For example, given the following table
wherein column bar gets updated a lot but the others don't:
create table foo (
id int not null,
bar int,
baz int,
primary key (id)
);
You could split it up like so:
create table foo_a (
id int,
baz int,
primary key (id)
);
create table foo_b (
foo_id int,
bar int,
foreign key foo_a_id (foo_id) references foo_a (id)
);
The reason you'd ever want to do this is that when Postgres goes to
update a row what it actually does is inserts a new row with the new
value(s) that you changed and marks the old one as deleted. So, if
you have a wide table and frequently update only certain columns,
you'll take a performance hit as you're having to re-write a lot of
static values.
I should mention that if you can handle splitting the
update into multiple transactions, that will help a
lot since it means you won't be doubling the size of
the table.
As I mentioned above, when you do an update you're actually inserting
a new row and deleting the old one. That deleted row is still
considered part of the table (for reasons of concurrency, read up on
the concurrency chapter in the manual for the details) and once it is
no longer visible by any live transactions can be re-used by future
inserts. So, if you update one column on every row of a one million
row table all at once, you have to allocate and write out one million
new rows. But, if you do the update a quarter million at a time, the
last three updates would be able to re-use many of the rows deleted
in earlier updates.
Erik Jones
Software Developer | Emma®
erik@xxxxxxxxxx
800.595.4401 or 615.292.5888
615.292.0777 (fax)
Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com
---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@xxxxxxxxxxxxxx so that your
message can get through to the mailing list cleanly