On Mon, Feb 10, 2014 at 9:20 AM, CS DBA <cs_dba@xxxxxxxxxxxxxxxxxxx> wrote:

> I've done a lot of DSS architecture. A couple of thoughts:
>
> - In most cases the ETL process figures out the time IDs as part of the
>   preparation and then does bulk loads into the fact tables.
>
> I would be very concerned about the performance of a trigger that
> fired for every row on the fact table.
>
> You mention you want to do data streaming instead of bulk loads;
> can you elaborate?

We have processes inserting data from log files as they are written.

> - When querying a star schema, one of the performance features is the
>   fact that all joins to the dimension tables are performed via a numeric
>   key, such as:
>
>       select * from fact, time_dim, geo_dim
>       where fact.time_id = time_dim.time_id...
>
> In the case of this being a timestamp, I suspect the performance would
> take a hit; depending on the size of your fact table and the
> scope/volume of your DSS queries, this could easily be a show stopper,
> based on the assumption that the database can do a numeric binary search
> much faster than a timestamp search.

I guess I was hoping the extra 4 bytes of a timestamp (8 bytes, the same
as a bigint) compared to a plain 4-byte integer wouldn't be too
significant, but I didn't consider that Postgres might do a binary search
faster on an integer type than on a timestamp, even with 1 billion rows.
Maybe that's wishful thinking. Maybe a regular integer at 4 bytes would
be good enough. I would estimate a query would touch on the order of
1 million rows at a time.

Regards,
Mark
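
For reference, the on-disk widths being discussed can be checked directly
with PostgreSQL's pg_column_size; a minimal sketch (the literal values
here are arbitrary):

    select pg_column_size('2014-02-10 09:20:00'::timestamp) as timestamp_bytes,
           pg_column_size(1::bigint)                        as bigint_bytes,
           pg_column_size(1::integer)                       as integer_bytes;

    -- timestamp_bytes | bigint_bytes | integer_bytes
    -- ----------------+--------------+---------------
    --               8 |            8 |             4

So a timestamp is the same width as a bigint; the 4-byte saving only
appears against a plain integer key.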
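
And for concreteness, a minimal sketch of the surrogate-key pattern
described in the quoted advice; table and column names are hypothetical,
loosely following the quoted query:

    -- hypothetical time dimension keyed by a 4-byte surrogate integer
    create table time_dim (
        time_id  integer primary key,
        ts       timestamp not null unique
    );

    -- fact rows carry only the narrow surrogate key, not the timestamp
    create table fact (
        time_id  integer not null references time_dim (time_id),
        value    numeric
    );

    -- the join resolves through the integer key; the timestamp itself
    -- is only touched in the (much smaller) dimension table
    select f.value, d.ts
    from fact f
    join time_dim d on d.time_id = f.time_id
    where d.ts >= '2014-01-01' and d.ts < '2014-02-01';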