Re: Mass updates on a large table

On 8/10/07, Mark Steben <msteben@xxxxxxxxxxxxxxx> wrote:
>
> Good afternoon,
>
> I am attempting an update on two new date field columns on a 17 million row
> table.  Every row gets updated.
>
> The update statement is a simple one:
>
>   UPDATE EMAILRCPTS SET ID = ID
>
>  And the update of the new date fields themselves occurs as the result of a
> before trigger.
>
> The update took 3 days, 10 hours to complete on the testing box.

That's quite a while for only 17 million rows.  Are these rows
particularly wide?
Is it possible to do it in batches, with a vacuum between each batch?
That would keep the bloat down.
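
Something like this, for example (just a sketch, and I'm assuming ID is
an integer key you can range over, which may not match your schema):

  UPDATE emailrcpts SET id = id WHERE id BETWEEN 1 AND 1000000;
  VACUUM emailrcpts;
  UPDATE emailrcpts SET id = id WHERE id BETWEEN 1000001 AND 2000000;
  VACUUM emailrcpts;
  -- ...and so on through all 17 million rows

The VACUUM between batches lets each new batch reuse the space freed by
the previous batch's dead row versions instead of growing the table.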

You don't mention your vacuuming strategy.  That might affect performance here.

Also, are there any FKs to / from this table?

> To minimize the impact of checkpoints.   The SHARED_BUFFERS parameter has
> been bumped up to 140000 on a 20meg RAM box.

I assume you meant a 20GB box.

Under 7.4, larger shared_buffers may not be a good thing.  That's a
very large setting for 7.4's buffer manager to handle.
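
If you're stuck on 7.4 for the moment, something more modest is
probably safer.  A rough sketch only; these are ballpark numbers, not
values tuned for your hardware:

  # postgresql.conf (7.4)
  shared_buffers = 20000        # ~160MB at 8kB per buffer; 7.4's buffer
                                # management copes poorly with huge settings
  checkpoint_segments = 30      # spread out checkpoint I/O during the bulk update

(shared_buffers needs a restart to take effect.)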

> There are about 9 indexes on this table although none of them reference the
> date fields so since there are no inserts I don't think they would have an
> impact on the update  (I've been wrong before though)

It doesn't matter.  PostgreSQL's implementation of MVCC means that
every update creates a new row version, so every index on the table
gets a new entry for every row updated, whether or not the indexed
columns changed.
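
You can watch this happen on a single row (assuming ID works as a
lookup key here, which is my assumption, not something you've said):

  SELECT ctid FROM emailrcpts WHERE id = 12345;  -- physical location, e.g. (0,1)
  UPDATE emailrcpts SET id = id WHERE id = 12345;
  SELECT ctid FROM emailrcpts WHERE id = 12345;  -- a new location; the old
                                                 -- version is now a dead tuple

Each of those dead tuples also leaves a dead entry behind in every one
of your 9 indexes, which is likely where a lot of your 3+ days is going.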

> Would an update statement referencing the date fields work faster than a
> trigger?

Possibly.  It would at least avoid the overhead of firing the trigger
function once for every row.
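
If the trigger just stamps the two new date columns, the direct
statement would look something like this (I'm guessing at column names;
date_created and date_updated are made up, so substitute your real ones):

  UPDATE emailrcpts
     SET date_created = now(),
         date_updated = now();

Combine that with the batching approach above and you keep the bloat
under control as well.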

>  Do you have any other suggestions to speed this up?
> We are at Postgres 7.4.5.

Upgrade to a modern version.  7.4 is getting old fast, and 7.4.5 has a
LOT of bugs that have been fixed in later releases; the branch is up to
7.4.17, so you're missing a lot of fixes even within the branch you're
on.  Upgrading to 8.2.4 would definitely help.

> We simply cannot afford this table to be down for 3+ days during a
> production update.  The production box is a 32meg RAM box.

I would question the business process that requires an entire
17-million-row table to be updated.

Posting your schema and your triggers would help as well.
