Hello Everyone,
I am writing to ask help about a use case I have to set up: if anyone can provide me with any suggestions, I would be really grateful.
This is my use case:
I have a table, with a primary key composed of two columns, each one, of course, with a not null constraint.
Each night a bunch of data will be loaded from an external ETL, which is out of my control.
Since I know there might be data quality issues, such as duplicated records or invalid records with null values on not-null columns, I would like to save such records, and only such records, in an auxiliary table.
Currently, I tried two solutions:
1) Trigger on insert: but I could not have the insert into the auxiliary table working, because the constraints violation triggers a rollback
2) Rule on insert, with a "DO INSTEAD" condition, which works, but, in case of invalid records, copies all the row of the transaction in the auxiliary table and not only the invalid ones.
This is what I did:
Let's assume that my table is actually name "mytable", with pk composed of (col1,col2).
The auxiliary table is mytable_failures, which has the same columns of mytable, no constraints and an additional column "fail_reason".
These are the rules:
CREATE or REPLACE RULE insert_mytable_nulls AS
ON INSERT TO mytable
where (length(trim(NEW.col1))=0 OR
length(trim(NEW.col2))=0 OR
NEW.col1 IS NULL OR
NEW.col2 IS NULL)
DO INSTEAD
insert into mytable_failures values(NEW.*,'col1 and col2 cannot be null');
CREATE or REPLACE RULE insert_mytable_pkey AS
ON INSERT TO mytable
where (select true from mytable where col1=NEW.col1 and col2=NEW.col2)
DO INSTEAD
insert into mytable_failures values(NEW.*,'Primary Key violation');
CREATE or REPLACE RULE insert_mytable_nulls AS
ON INSERT TO mytable
where (length(trim(NEW.col1))=0 OR
length(trim(NEW.col2))=0 OR
NEW.col1 IS NULL OR
NEW.col2 IS NULL)
DO INSTEAD
insert into mytable_failures values(NEW.*,'col1 and col2 cannot be null');
CREATE or REPLACE RULE insert_mytable_pkey AS
ON INSERT TO mytable
where (select true from mytable where col1=NEW.col1 and col2=NEW.col2)
DO INSTEAD
insert into mytable_failures values(NEW.*,'Primary Key violation');
If I execute the following transaction batch:
INSERT INTO mytable(
col1, col2, col3, col3, last_update)
VALUES ('UK', 'FB00004', 'en', now());
INSERT INTO mytable(
col1, col2, col3, col3, last_update)
VALUES ('UK', 'FB00005', 'en', now());
INSERT INTO mytable(
col1, col2, col3, col3, last_update)
VALUES ('UK', 'FB00004', 'en', now());
col1, col2, col3, col3, last_update)
VALUES ('UK', 'FB00004', 'en', now());
INSERT INTO mytable(
col1, col2, col3, col3, last_update)
VALUES ('UK', 'FB00005', 'en', now());
INSERT INTO mytable(
col1, col2, col3, col3, last_update)
VALUES ('UK', 'FB00004', 'en', now());
1) the first two rows are correctly inserted in mytable and the third discarded
2) incorrectly, all the three rows in the mytable_failures.
Thank you for all the help you can provide
Mimo