Search Postgresql Archives

Re: Vacuuming strategy

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi:

On Wed, Apr 30, 2014 at 1:59 AM, Elanchezhiyan Elango
<elanelango@xxxxxxxxx> wrote:
> I need help on deciding my vacuuming strategy. I need to know if I ever need
> to do 'vacuum full' for my tables.
>
> Tables1: Following is the query patterns on 4 high traffic table in my
> database:
> 1. Every 5 minutes about 50000 rows in the table are updated. And for a
> given clock hour the same 50000 rows are updated again and again.
> 2. When a new hour begins, another 50000 rows get inserted and they get
> updated every 5 minutes.
> 3. Every night a days worth of old stats are deleted. So this would be 24 *
> 50000 = 1.2M records that get deleted every night.
.......

Not a vacuuming strategy, but have you considered partitioning?

I mean. Your pattern seems to be insert rows, update during the hour,
delete daily. If you have a clear timestamp key, your problem seems to
be well suited for it, specially given you seem to have an hourly
process ( insert in step 2 ), and a daily process ( deletion in step 3
). I can see two potential uses for partitions here.

One is a classic, lets say a partition per day. In step 3 you drop the
old partition and create the one for the next day, this is fast, and
autovacuum will maintain the current table tidy enough. You could even
complement it with full vacuuming the newly inserted partition.
Something like, if you want to keep K days shortly after the  start of
day D ( not at exact midnight to avoid races ) you drop the partition
for day D-K, create the one for D+1 ( the one for D was created
beforehand, on D-1 to avoid races ) set it for adequate autovacuuming
and, if you want, do a vacuum full and reindex for day D-1 ( which is
now constant ). If you need to keep hundreds of days you could go for
weekly or monthly partition, if you keep few you could consider
hourly.

Another one could be to make, say, daily partitions, but no rules /
triggers to redirect inserts, just the constraint and inheritance
rules to get proper selects, let the hourly insert ( and subsequent
updates ) go to the main table.
Then, on the hourly job, you can move all the now constant past-hour
rows to the appropiate partition ( which just an "insert into part
delete from main where adequate_range returning *" ). The partitions
will not need reindex or vacuuming as they will become append-only,
and the main table will have between 1 and two hour of records, and
would probably be adequately managed with autovacuum, which given the
table will be small should be fast.
Old partitions could be dropped when unneeded as above.

    Francisco Olarte.


-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux