Re: optimizer behavior in the case of highly updated tables

Mark Rostron wrote:

- It was necessary to take a site outage and perform a “vacuum full analyze” on the table

- Following this, the query plan reverted to the more efficient btree lookup

Clearly, the garbage buildup resulting from transaction activity on the table is the villain here.

- Is it possible to calculate expected space usage given row count and average row size?

- At what point might the ratio of “expected”/“actual” space usage indicate the need to perform “full vacuum”, or similar maintenance?


I think you're right to focus on this part, because with your usage pattern of constantly deleting all the old data, you have to get the resulting dead space under control before the query planner can do the right thing here.

Start by reading http://wiki.postgresql.org/wiki/VACUUM_FULL

What you probably should have done in order to return to sane query plans was to run CLUSTER instead of VACUUM FULL. The site would have been down for less time, and you actually made some future problems a bit worse, since VACUUM FULL bloats your indexes in the process.
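As a minimal sketch of that (the table and index names here are placeholders, not anything from your report):

-- CLUSTER rewrites the table in index order and rebuilds all of its
-- indexes from scratch, which VACUUM FULL does not.
-- "orders" and "orders_pkey" are hypothetical names.
CLUSTER orders USING orders_pkey;
-- Refresh planner statistics afterwards, since the rewrite doesn't.
ANALYZE orders;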

As for measuring what's going wrong here, what you want to do is run this around once a week during a slow period and save the output for analysis:

VACUUM VERBOSE;

This will give you a bunch of statistics about space usage, including a summary at the end that will tell you whether you're already running into a serious problem: running out of room in the free space map. Combine that with a look at pg_stat_user_tables and you should have a decent initial statistical picture of what's going on with your data.
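To get a feel for the numbers in between those weekly runs, something like this against the statistics views works; the 20% dead-row cutoff is just an arbitrary starting point for illustration, not a tuned recommendation:

-- Tables whose dead rows exceed 20% of live rows, largest first.
-- pg_relation_size reports actual on-disk size, which you can
-- compare against row count * average row width for your
-- "expected" vs. "actual" ratio.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       pg_size_pretty(pg_relation_size(relid)) AS table_size,
       last_vacuum,
       last_autovacuum
  FROM pg_stat_user_tables
 WHERE n_dead_tup > 0.2 * n_live_tup
 ORDER BY n_dead_tup DESC;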

There are two large-scale things you may need to consider if volume on your site expands in the future:

1) Deleting the historical data in smaller chunks and doing an immediate VACUUM afterwards, perhaps as often as daily. This keeps the amount of dead space VACUUM has to clean up as small as possible, at the expense of higher site overhead; see the first sketch after this list.

2) Start partitioning your tables. This allows you to just DROP old partitions rather than deleting rows from a single master table, which can make this whole class of problem go away; see the second sketch after this list.
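A minimal sketch of (1), assuming a hypothetical "events" table with a "created_at" timestamp column:

-- Trim the oldest history, then reclaim the dead space right away
-- so it stays small and gets reused by new rows.
DELETE FROM events
 WHERE created_at < now() - interval '90 days';
VACUUM ANALYZE events;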
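And a sketch of (2) using constraint-based inheritance partitioning, again with hypothetical names:

-- One child table per month under an empty "events" parent.
CREATE TABLE events_2010_08 (
    CHECK (created_at >= DATE '2010-08-01'
       AND created_at <  DATE '2010-09-01')
) INHERITS (events);

-- Retiring a month of history is then a cheap catalog operation
-- that leaves no dead rows behind for VACUUM to clean up:
DROP TABLE events_2010_08;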

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@xxxxxxxxxxxxxxx   www.2ndQuadrant.us



