Re: best db schema for time series data?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



vindex+lists-pgsql-performance@xxxxxxxxxxx (Louis-David Mitterrand)
writes:
> I have to collect lots of prices from web sites and keep track of their
> changes. What is the best option?
>
> 1) one 'price' row per price change:
>
> 	create table price (
> 		id_price primary key,
> 		id_product integer references product,
> 		price integer
> 	);
>
> 2) a single 'price' row containing all the changes:
>
> 	create table price (
> 		id_price primary key,
> 		id_product integer references product,
> 		price integer[] -- prices are 'pushed' on this array as they change
> 	);
>
> Which is bound to give the best performance, knowing I will often need
> to access the latest and next-to-latest prices?

I'd definitely bias towards #1, but with a bit of a change...

create table product (
  id_product serial primary key
);

create table price (
   id_product integer references product,
   as_at timestamptz default now(),
   primary key (id_product, as_at),
   price integer
);

The query to get the last 5 prices for a product should be
splendidly efficient:

   select price, as_at from price
    where id_product = 17
    order by as_at desc limit 5;

(That'll use the PK index perfectly nicely.)

If you needed higher performance, for "latest price," then I'd add a
secondary table, and use triggers to copy latest price into place:

  create table latest_prices (
     id_product integer primary key references product,
     price integer
  );

create or replace function capture_latest_price () returns trigger as $$
declare
begin
	delete from latest_prices where id_product = NEW.id_product;
	insert into latest_prices (id_product,price) values
	   (NEW.id_product, NEW.price);
	return NEW;
end
$$ language plpgsql;

create trigger price_capture after insert on price execute procedure capture_latest_price();

This captures *just* the latest price for each product.  (There's a bit
of race condition - if there are two concurrent price updates, one will
fail, which wouldn't happen without this trigger in place.)
--
"... Turns   out that JPG  was in  fact using his  brain... and   I am
inclined to encourage him  to continue the practice  even if  it isn't
exactly what I  would have done myself."   -- Alan Bawden  (way out of
context)

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux