Re: Update table performance

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Aug 7, 2007, at 6:13 PM, Mark Makarowsky wrote:

Can you provide more detail on what you mean by your
two suggestions below:

Yeah, I've used "vertical partitioning" very
successfully in the past, though I've never done it
for just a single field. I'll typically leave the few
most common fields in the "main" table and pull
everything else into a second table.

Vertical partitioning is where you split up your table on disk by columns, i.e on the vertical lines. He quoted it because Postgres doesn't actually support it transparently but you can always fake it by splitting up your table. For example, given the following table wherein column bar gets updated a lot but the others don't:

create table foo (
id	int 	not null,
bar	int,
baz 	int,

primary key (id)
);

You could split it up like so:

create table foo_a (
id 	int,
baz	int,

primary key (id)
);

create table foo_b (
foo_id	int,
bar		int,

foreign key foo_a_id (foo_id) references foo_a (id)
);

The reason you'd ever want to do this is that when Postgres goes to update a row what it actually does is inserts a new row with the new value(s) that you changed and marks the old one as deleted. So, if you have a wide table and frequently update only certain columns, you'll take a performance hit as you're having to re-write a lot of static values.


I should mention that if you can handle splitting the
update into multiple transactions, that will help a
lot since it means you won't be doubling the size of
the table.

As I mentioned above, when you do an update you're actually inserting a new row and deleting the old one. That deleted row is still considered part of the table (for reasons of concurrency, read up on the concurrency chapter in the manual for the details) and once it is no longer visible by any live transactions can be re-used by future inserts. So, if you update one column on every row of a one million row table all at once, you have to allocate and write out one million new rows. But, if you do the update a quarter million at a time, the last three updates would be able to re-use many of the rows deleted in earlier updates.

Erik Jones

Software Developer | Emma®
erik@xxxxxxxxxx
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
      subscribe-nomail command to majordomo@xxxxxxxxxxxxxx so that your
      message can get through to the mailing list cleanly


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux