Hi all
As I understand it, a general rule of thumb is that you should never
create a physical column if the data could be derived from existing
columns. A possible reason for breaking this rule is for performance
reasons.
I have a situation where I am considering breaking the rule, but I am
not experienced enough in SQL to know if my reason is valid. I would
appreciate it if someone could glance at my 'before' and 'after'
scenarios and see if, from a 'gut-feel' point of view, I should proceed.
I have a VIEW constructed as follows -
CREATE VIEW view_name AS
[select statement 1]
UNION ALL
[select statement 2]
etc.
This is one of the select statements. I will give the 'after' scenario
first -
SELECT
'arec' AS tran_type, a.row_id AS tran_row_id,
a.tran_number AS tran_number, a.cust_row_id AS cust_row_id,
CASE
WHEN a.tran_type = 'ar_rec' THEN y.tran_date
WHEN a.tran_type = 'cb_rec' THEN w.tran_date
END AS tran_date,
CASE
WHEN a.tran_type = 'ar_rec' THEN y.text
WHEN a.tran_type = 'cb_rec' THEN w.text
END AS text,
0 - a.arec_cust AS amount_cust,
0 - a.arec_local AS amount_local
FROM ar_rec_subtran a
LEFT JOIN ar_tran_rec_det z ON z.row_id = a.tran_det_row_id
LEFT JOIN ar_tran_rec y ON y.row_id = z.tran_row_id
LEFT JOIN cb_tran_rec_det x ON x.row_id = a.tran_det_row_id
LEFT JOIN cb_tran_rec w ON w.row_id = x.tran_row_id
WHERE
CASE
WHEN a.tran_type = 'ar_rec' THEN y.posted
WHEN a.tran_type = 'cb_rec' THEN w.posted
END = '1'
The two columns a.arec_cust and a.arec_local *can* be derived from other
columns, and in fact that is how it is working at the moment, so here is
the 'before' scenario -
SELECT
'arec' AS tran_type, a.row_id AS tran_row_id,
a.tran_number AS tran_number, a.cust_row_id AS cust_row_id,
CASE
WHEN a.tran_type = 'ar_rec' THEN y.tran_date
WHEN a.tran_type = 'cb_rec' THEN w.tran_date
END AS tran_date,
CASE
WHEN a.tran_type = 'ar_rec' THEN y.text
WHEN a.tran_type = 'cb_rec' THEN w.text
END AS text,
ROUND(0 - (ROUND(a.arec_amount / CASE
WHEN a.tran_type = 'ar_rec' THEN y.tran_exch_rate
WHEN a.tran_type = 'cb_rec' THEN w.tran_exch_rate
END * a.cust_exch_rate, u.scale)), 2) AS amount_cust,
ROUND(0 - (ROUND(a.arec_amount / CASE
WHEN a.tran_type = 'ar_rec' THEN y.tran_exch_rate
WHEN a.tran_type = 'cb_rec' THEN w.tran_exch_rate
END, s.scale)), 2) AS amount_local
FROM ar_rec_subtran a
LEFT JOIN ar_tran_rec_det z ON z.row_id = a.tran_det_row_id
LEFT JOIN ar_tran_rec y ON y.row_id = z.tran_row_id
LEFT JOIN cb_tran_rec_det x ON x.row_id = a.tran_det_row_id
LEFT JOIN cb_tran_rec w ON w.row_id = x.tran_row_id
LEFT JOIN ar_customers v ON v.row_id = a.cust_row_id
LEFT JOIN adm_currencies u ON u.row_id = v.currency_id
LEFT JOIN adm_params t ON t.row_id = 1
LEFT JOIN adm_currencies s ON s.row_id = t.local_curr_id
WHERE
CASE
WHEN a.tran_type = 'ar_rec' THEN y.posted
WHEN a.tran_type = 'cb_rec' THEN w.posted
END = '1'
As you can see, complexity has increased and there are four additional
JOINs.
I am expecting the VIEW to be used extensively for query purposes, and
my gut-feel says that the second one is likely to lead to performance
problems in a system with a lot of data and a lot of users.
I am not looking for an answer - I know that I should create dummy data
and run some timing tests. I was just wondering if someone more
experienced would wince when they look at the second SELECT, or if they
would shrug and think that it looks fine.
Any input will be appreciated.
Frank Millman