designing time dimension for star schema

Mark Wong <markwkm@xxxxxxxxx> · Mon, 10 Feb 2014 08:45:28 -0800

Hello everybody,

I was wondering if anyone had any experiences they can share when
designing the time dimension for a star schema and the like.  I'm
curious about how well it would work to use a timestamp for the
attribute key, as opposed to a surrogate key, and populating the time
dimension with triggers on insert to the fact tables.  This is
something that would have data streaming in (as oppose to bulk
loading) and I think we want time granularity to the minute.

A simplified example:

-- Time dimension
CREATE TABLE time (
    datetime TIMESTAMP WITH TIME ZONE NOT NULL,
    day_of_week SMALLINT NOT NULL
);
CREATE UNIQUE INDEX ON time (datetime);

-- Fact
CREATE TABLE fact(
    datetime TIMESTAMP WITH TIME ZONE NOT NULL,
    FOREIGN KEY (datetime) REFERENCES time(datetime)
);

-- Function to populate the time dimension
CREATE OR REPLACE FUNCTION decompose_timestamp() RETURNS TRIGGER AS $$
BEGIN
    NEW.datetime = date_trunc('minutes', NEW.datetime);
    INSERT INTO time (datetime, day_of_week)
    VALUES (NEW.datetime, date_part('dow', NEW.datetime));
    RETURN NEW;
EXCEPTION
    WHEN unique_violation THEN
        -- Do nothing if the timestamp already exists in the dimension table.
        RETURN new;
END; $$
LANGUAGE 'plpgsql';

CREATE TRIGGER populate_time BEFORE INSERT
ON fact FOR EACH ROW
EXECUTE PROCEDURE decompose_timestamp();

Regards,
Mark

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general