Search Postgresql Archives

Issues with patitionning and triggers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I have data warehousing DB 2 fairly big tables : one contains about 200 
million rows and the other one contains about 4 billion rows.  Some queries 
are now taking way too long to run (> 13 hours).  I need to get these queries 
to run in an hour or so.  The slowdown was gradual, but I eventually hit a 
wall, when the planner stopped using indexes.

All of this was done on PostgreSQL 9.2.0 64-bit compiled from the official 
source.  Significant changes in postgresql.conf :

shared_buffers = 8GB
work_mem = 8GB
maintenance_work_mem = 8GB
max_stack_depth = 2MB

Here is the information about the big tables :

SELECT
   nspname || '.' || relname AS relation,
   pg_stat_get_live_tuples(C.oid) AS live_tuples,
   pg_size_pretty(pg_relation_size(C.oid, 'main')) AS relation_size,
   pg_size_pretty(pg_indexes_size(C.oid)) AS indexes_size,
   pg_size_pretty(pg_total_relation_size(C.oid)) AS total_size
FROM pg_class C LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE
   nspname NOT IN ('pg_catalog', 'information_schema') AND
   C.relkind <> 'i' AND
   nspname !~ '^pg_toast' AND
   relname IN ('obs', 'forecast');

    relation     | live_tuples | relation_size | indexes_size | total_size
-----------------+-------------+---------------+--------------+------------
 public.obs      |   193235914 | 18 GB         | 15 GB        | 33 GB
 public.forecast |  3914247064 | 425 GB        | 148 GB       | 573 GB

>From what I read, the planner probably stopped using indexes since they are so 
big compared to the system RAM (64 GB).  I therefore tried partitioning the 
obs table.  The forecast table is the biggest issue, but I wanted to be able 
to try this out in a single day before tackling the real monster.

Here is the structure of the parent table :

                 Table "public.observation"
      Column      |            Type             | Modifiers 
------------------+-----------------------------+-----------
 station          | integer                     | not null
 method           | integer                     | not null
 startdate        | timestamp without time zone | not null
 duration         | interval                    | not null
 value            | real                        | not null
 dataset          | integer                     | not null
 modificationdate | timestamp without time zone | not null
Check constraints:
    "observation_check1" CHECK ((startdate + duration) < now())
    "observation_duration_check1" CHECK (duration > '00:00:00'::interval)
Foreign-key constraints:
    "observation_dataset_fkey1" FOREIGN KEY (dataset) REFERENCES dataset(id) 
MATCH FULL
    "observation_method_fkey1" FOREIGN KEY (method) REFERENCES method(id) 
MATCH FULL
    "observation_station_fkey1" FOREIGN KEY (station) REFERENCES station(id) 
MATCH FULL
Triggers:
    trigger_insert_00 BEFORE INSERT ON observation FOR EACH ROW EXECUTE 
PROCEDURE observation_insert_trigger()

INSERT ... RETURNING does not work with partitioned tables, since the trigger 
function on the parent that dispatches new rows to the children tables, must 
return NULL.  If the trigger function on the parent ends with "RETURN NEW", 
INSERT ... RETURNING works, but new rows are duplicated; they are inserted 
both in the parent and child tables.

Is there a way to make INSERT ... RETURNING work without duplicating the rows?

The other issue I'm encountering is that I also have very simple BEFORE UPDATE 
and BEFORE INSERT triggers that set the modification date on every single row 
:

CREATE FUNCTION set_modificationDate() RETURNS TRIGGER AS $$
BEGIN
   NEW.modificationDate := now();
   RETURN NEW;
END;
$$ LANGUAGE 'plpgsql';

The modification date must be updated if any row is modified in any way.  I 
first tried to define the triggers on the parent table.  This worked, but I 
realized that if a queries targets explicitly a child table, it could modify a 
row without the date being updated.  I therefore dropped the triggers on the 
parent table and defined them for every child.  To my great surprise, the 
insert below failed with a message saying that NULLs are not allowed in the 
modificationdate column.

INSERT INTO observation
(dataset, station, method, startdate, duration, value)
VALUES (9, 2, 128, '2014-01-01 00:00:00', '24 hours', 42);

Why isn't the BEFORE INSERT trigger on the child table being executed?

Cheers,
Samuel Gilbert


-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux