I have a table with over 100K rows of GIS data, including a raster column and an insertdatetime timestamp column. The table is continually loaded with data while processes on the back side query it and populate other tables depending on characteristics of the data. Today, a row is read, processed, and then deleted; vacuums occur frequently and are quite time consuming.

I came up with a scheme of partitioning the table into 7 child tables, one for each day of the week, with the day derived from the ‘insertdatetime’ value. So there is the master, ‘incoming_grid’, and 7 children: ‘incoming_grid_sun’, ‘incoming_grid_mon’, ‘incoming_grid_tue’, and so on. A job kicked off each night after midnight would skip the current and previous days and truncate the tables for the remaining 5 days, speeding the whole process up since there would be no deletes or frequent vacuums. I even figured on creating a separate tablespace for the child tables. A rough sketch of the nightly job is below, followed by my DDLs.
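This is just a sketch, not something I've actually written yet – it assumes the job fires shortly after midnight (so the current and the previous day are the two partitions to keep) and a server new enough to run DO blocks:

-- Sketch of the nightly cleanup: truncate every child except the
-- partitions for the current day and the previous day.
DO $$
DECLARE
    d     text;
    keep1 text := to_char(now(), 'dy');                     -- current day
    keep2 text := to_char(now() - interval '1 day', 'dy');  -- previous day
BEGIN
    FOR d IN SELECT unnest(ARRAY['sun','mon','tue','wed','thu','fri','sat']) LOOP
        IF d NOT IN (keep1, keep2) THEN
            EXECUTE 'TRUNCATE TABLE incoming_grid_' || d;
        END IF;
    END LOOP;
END
$$;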
Here are the DDLs:

-- Master table; the loaders insert into this and the back-end processes read from it.
CREATE TABLE incoming_grid
(
  rid integer NOT NULL DEFAULT nextval('incoming_grid_rid_seq'::regclass),  -- record id
  rast raster,
  model character varying(80) NOT NULL,
  parameter character varying(80) NOT NULL,
  forecast bigint NOT NULL,
  level character varying(128) NOT NULL,
  insertdatetime timestamp without time zone DEFAULT (now())::timestamp without time zone,
  rundatetime timestamp without time zone NOT NULL,
  CONSTRAINT incoming_grid_pkey PRIMARY KEY (forecast, parameter, level, model, rundatetime)
);

CREATE INDEX "incoming_grid_Index1"
  ON incoming_grid
  USING btree (parameter, forecast, level, rundatetime);

-- One child per day of week; the CHECK constraint is what I expect
-- constraint exclusion to use when pruning the other children.
CREATE TABLE incoming_grid_sun
(
  CHECK (to_char(insertdatetime, 'dy') = 'sun')
) INHERITS (incoming_grid);

CREATE INDEX "incoming_grid_sun_Index1"
  ON incoming_grid_sun
  USING btree (parameter, forecast, level, rundatetime);

…  -- incoming_grid_mon through incoming_grid_fri are defined the same way

CREATE TABLE incoming_grid_sat
(
  CHECK (to_char(insertdatetime, 'dy') = 'sat')
) INHERITS (incoming_grid);

CREATE INDEX "incoming_grid_sat_Index1"
  ON incoming_grid_sat
  USING btree (parameter, forecast, level, rundatetime);

-- Routing function: send each incoming row to the child for its
-- insertdatetime day; RETURN NULL keeps the row out of the master.
CREATE OR REPLACE FUNCTION incoming_grid_load()
  RETURNS trigger AS
$BODY$
DECLARE
    schema TEXT = 'children';   -- not used yet
    tablename TEXT;
    query TEXT;
BEGIN
    tablename = 'incoming_grid_' || to_char(NEW.insertdatetime, 'dy');
    IF TG_OP = 'INSERT' THEN
        query = 'INSERT INTO ' || tablename || ' SELECT $1.*;';
        EXECUTE query USING NEW;
    END IF;
    RETURN NULL;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE STRICT
  COST 100;

CREATE TRIGGER incoming_grid_load_trigger
  BEFORE INSERT OR UPDATE
  ON incoming_grid
  FOR EACH ROW
  EXECUTE PROCEDURE incoming_grid_load();

So far, so good. I loaded the master with 100 rows and they all went to their correct child tables. My problem now is that SELECTs scan all of the child tables – the CHECK constraints don't factor into the plan at all.
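To illustrate, this is the kind of check I've been running (the WHERE clause and the explicit constraint_exclusion setting are only an example of how I'm testing, not one of the real back-end queries):

SET constraint_exclusion = on;

EXPLAIN
SELECT count(*)
FROM   incoming_grid
WHERE  to_char(insertdatetime, 'dy') = 'tue';
-- the plan lists every child table (plus the master), not just incoming_grid_tue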
I tried the following to create another index on the master, but it didn't work either – all partitions are still scanned:

CREATE FUNCTION custom_to_char(timestamp) RETURNS text AS $$
    SELECT to_char($1, 'dy');
$$ LANGUAGE sql IMMUTABLE;

CREATE INDEX "incoming_grid_Index2"
  ON incoming_grid
  USING btree (custom_to_char(insertdatetime));

Any ideas on how I can resolve this?

Thanks,
Steve Erickson