On 17/12/2010, at 9:20 PM, Pierre C wrote:

>> fc=# explain analyse select collection, period, tariff, sum(bytesSent),
>>        sum(bytesReceived), sum(packets), max(sample),
>>        (starttime / 3600) * 3600 as startchunk
>>      from sample_20101001
>>      where starttime between 1287493200 and 1290171599
>>        and collection=128 and ip = '10.9.125.207'
>>      group by startchunk, tariff, collection, period;
>
> If CLUSTER locks bother you, and you don't do UPDATEs, you might consider doing something like this:
>
> - accumulate the rows in a "recent" table
> - every hour, INSERT INTO archive SELECT * FROM recent ORDER BY (your cluster fields)
> - DELETE FROM recent the rows you just inserted
> - VACUUM recent
>
> The clustering in your archive table will not be perfect, but at least all rows from one source in one hour will be stored close together. Clustering doesn't need to be perfect anyway; if you get 100x better locality, that's already good!

That's a really decent idea, and it slots in perfectly with how the application already works! We have existing DBAO code that handles monthly tables; it'll happily put data into a "recent" table. In fact, we can probably tolerate having a "today" table.

Thanks!

--Royce

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
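[For reference, a minimal sketch of the hourly batch Pierre describes. Table names (`recent`, `archive`) are from his outline; the cluster key `(ip, starttime)` and the cutoff predicate are illustrative assumptions, not part of the original thread.]

```sql
-- Hourly batch: move the last complete hour from "recent" into
-- "archive", pre-sorted on the cluster key so related rows land
-- close together on disk. Run inside one transaction so a failure
-- leaves "recent" intact; now() is stable within a transaction, so
-- the INSERT and DELETE predicates select the same rows.
BEGIN;

INSERT INTO archive
SELECT * FROM recent
WHERE starttime < extract(epoch FROM date_trunc('hour', now()))
ORDER BY ip, starttime;   -- "your cluster fields"

DELETE FROM recent
WHERE starttime < extract(epoch FROM date_trunc('hour', now()));

COMMIT;

-- Reclaim the dead tuples afterwards; VACUUM cannot run inside a
-- transaction block.
VACUUM recent;
```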