> I have not had chance to fully go through all of below. Some
questions/suggestions:
>
> 1) Thanks for the formatted queries. If I could make a suggestion,
when aliasing could you include AS. It would make finding what l.*
refers to easier for those of us with old eyes:)
>
Yes, of course, sorry :-)
> 2) t4e_contractor_id is in the shiny_adm.contractor_access table?
> If not where?
Yes, sorry again, it is there
>
> 3) What is the schema for shiny_adm.contractor_access?
> In particular what indexes are on it?
>
shiny_adm.contractor_access looks like this:
CREATE TABLE shiny_adm.contractor_access
(
machine_key text,
t4e_contractor_id text,
active integer DEFAULT 1,
id serial NOT NULL,
CONSTRAINT contractor_access_pkey PRIMARY KEY (id),
CONSTRAINT contractor_unique UNIQUE (machine_key, t4e_contractor_id),
CONSTRAINT co_check_t4e_co_email CHECK
(utils.verify_email(t4e_contractor_id))
)
CREATE INDEX idx_contractor
ON shiny_adm.contractor_access
USING btree
(t4e_contractor_id COLLATE pg_catalog."default");
CREATE INDEX idx_contractor_mk
ON shiny_adm.contractor_access
USING btree
(machine_key COLLATE pg_catalog."default");
I tried to format the below a little better with AS and some more
consistent indents.
I also, in the first query, changed the where clause to filter on
machine_key in table contractor _access. Just to illustrate the problem
better.
Both queries filter on the same table which is joined the same way. But
in the second example the where clause is not pushed to the subquery l
Thanks a lot for looking into it
Nicklas
Query 1:
EXPLAIN ANALYZE
SELECT
l.machine_key,
o.object_name,
o.sub_object_name,
o.object_user_id,
o.sub_object_user_id,
o.start_date AS object_start_date,
s.species_group_name,
p.product_group_name,
l.m3_sub AS volume_m3sub,
l.number_of_logs,
mi.basemachine_manufacturer,
mi.basemachine_model
FROM
(
SELECT
hl.contractor_id,
hl.machine_key,
hl.operator_key,
hl.object_key,
hl.sub_object_key,
date(hl.harvest_date) AS harvest_date,
hl.species_group_key,
hl.product_key,
sum(hl.m3_sub) AS m3_sub,
count(*) AS number_of_logs
FROM
version_union_tables_r02.harvester_logs hl
GROUP BY
hl.machine_key, hl.contractor_id, hl.operator_key,
hl.object_key, hl.sub_object_key, (date(hl.harvest_date)),
hl.species_group_key, hl.product_key
) AS l
LEFT JOIN version_union_tables_r02.machine_info AS mi ON
l.machine_key::text = mi.machine_key::text
LEFT JOIN version_union_tables_r02.objects AS o ON
l.machine_key::text = o.machine_key::text AND l.object_key =
o.object_key AND l.sub_object_key = o.sub_object_key
LEFT JOIN version_union_tables_r02.products AS p ON
l.machine_key::text = p.machine_key::text AND l.product_key = p.product_key
LEFT JOIN version_union_tables_r02.species AS s ON
l.machine_key::text = s.machine_key::text AND l.species_group_key =
s.species_group_key
join shiny_adm.contractor_access AS ci on l.machine_key=ci.machine_key
where ci.machine_key = '887655635442600';
Resulting in this query plan:
Nested Loop (cost=61865.25..65302.20 rows=22624 width=122) (actual
time=27.801..29.225 rows=250 loops=1)
-> Seq Scan on contractor_access ci (cost=0.00..1.01 rows=1
width=19) (actual time=0.005..0.006 rows=1 loops=1)
Filter: (machine_key = '887655635442600'::text)
-> Hash Left Join (cost=61865.25..65074.95 rows=22624 width=122)
(actual time=27.794..29.070 rows=250 loops=1)
Hash Cond: (((hl.machine_key)::text = (s.machine_key)::text)
AND (hl.species_group_key = s.species_group_key))
-> Hash Left Join (cost=61854.55..64263.92 rows=22624
width=120) (actual time=27.771..28.851 rows=250 loops=1)
Hash Cond: (((hl.machine_key)::text =
(p.machine_key)::text) AND (hl.product_key = p.product_key))
-> Hash Left Join (cost=61815.97..63145.14 rows=22624
width=118) (actual time=27.736..28.628 rows=250 loops=1)
Hash Cond: (((hl.machine_key)::text =
(o.machine_key)::text) AND (hl.object_key = o.object_key) AND
(hl.sub_object_key = o.sub_object_key))
-> Hash Left Join (cost=61799.78..62619.90
rows=22624 width=65) (actual time=27.709..28.416 rows=250 loops=1)
Hash Cond: ((hl.machine_key)::text =
(mi.machine_key)::text)
-> HashAggregate (cost=61796.99..62079.79
rows=22624 width=69) (actual time=27.677..28.217 rows=250 loops=1)
Group Key: hl.machine_key,
hl.contractor_id, hl.operator_key, hl.object_key, hl.sub_object_key,
date(hl.harvest_date), hl.species_group_key, hl.product_key
-> Bitmap Heap Scan on harvester_logs
hl (cost=570.14..61224.14 rows=22914 width=61) (actual
time=1.040..12.977 rows=24151 loops=1)
Recheck Cond:
((machine_key)::text = '887655635442600'::text)
Heap Blocks: exact=538
-> Bitmap Index Scan on
version_union_tables_r02_harvester_logs_machine_key (cost=0.00..564.41
rows=22914 width=0) (actual time=0.996..0.997 rows=24151 loops=1)
Index Cond:
((machine_key)::text = '887655635442600'::text)
-> Hash (cost=2.77..2.77 rows=1 width=38)
(actual time=0.018..0.018 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory
Usage: 9kB
-> Seq Scan on machine_info mi
(cost=0.00..2.77 rows=1 width=38) (actual time=0.013..0.014 rows=1 loops=1)
Filter: ((machine_key)::text =
'887655635442600'::text)
Rows Removed by Filter: 61
-> Hash (cost=16.12..16.12 rows=4 width=84)
(actual time=0.020..0.020 rows=3 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Bitmap Heap Scan on objects o
(cost=4.31..16.12 rows=4 width=84) (actual time=0.015..0.016 rows=3 loops=1)
Recheck Cond: ((machine_key)::text =
'887655635442600'::text)
Heap Blocks: exact=1
-> Bitmap Index Scan on
version_union_tables_r02_objects_machine_key (cost=0.00..4.31 rows=4
width=0) (actual time=0.011..0.012 rows=3 loops=1)
Index Cond: ((machine_key)::text
= '887655635442600'::text)
-> Hash (cost=38.19..38.19 rows=26 width=35) (actual
time=0.030..0.031 rows=26 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> Bitmap Heap Scan on products p
(cost=4.48..38.19 rows=26 width=35) (actual time=0.010..0.019 rows=26
loops=1)
Recheck Cond: ((machine_key)::text =
'887655635442600'::text)
Heap Blocks: exact=1
-> Bitmap Index Scan on
version_union_tables_r02_products_machine_key (cost=0.00..4.47 rows=26
width=0) (actual time=0.006..0.006 rows=26 loops=1)
Index Cond: ((machine_key)::text =
'887655635442600'::text)
-> Hash (cost=10.52..10.52 rows=12 width=37) (actual
time=0.018..0.018 rows=12 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Bitmap Heap Scan on species s (cost=4.37..10.52
rows=12 width=37) (actual time=0.008..0.012 rows=12 loops=1)
Recheck Cond: ((machine_key)::text =
'887655635442600'::text)
Heap Blocks: exact=1
-> Bitmap Index Scan on
version_union_tables_r02_species_machine_key (cost=0.00..4.36 rows=12
width=0) (actual time=0.004..0.005 rows=12 loops=1)
Index Cond: ((machine_key)::text =
'887655635442600'::text)
Planning time: 0.376 ms
Execution time: 29.435 ms
Next query, the slow one that calculates the whole dataset:
EXPLAIN ANALYZE
SELECT
l.machine_key,
o.object_name,
o.sub_object_name,
o.object_user_id,
o.sub_object_user_id,
o.start_date AS object_start_date,
s.species_group_name,
p.product_group_name,
l.m3_sub AS volume_m3sub,
l.number_of_logs,
mi.basemachine_manufacturer,
mi.basemachine_model
FROM shiny_adm.contractor_access ci join
(
SELECT
hl.contractor_id,
hl.machine_key,
hl.operator_key,
hl.object_key,
hl.sub_object_key,
date(hl.harvest_date) AS harvest_date,
hl.species_group_key,
hl.product_key,
sum(hl.m3_sub) AS m3_sub,
count(*) AS number_of_logs
FROM
version_union_tables_r02.harvester_logs AS hl
GROUP BY
hl.machine_key, hl.contractor_id, hl.operator_key,
hl.object_key, hl.sub_object_key, (date(hl.harvest_date)),
hl.species_group_key, hl.product_key
) AS l on l.machine_key=ci.machine_key
LEFT JOIN version_union_tables_r02.machine_info AS mi ON
l.machine_key::text = mi.machine_key::text
LEFT JOIN version_union_tables_r02.objects AS o ON
l.machine_key::text = o.machine_key::text AND l.object_key =
o.object_key AND l.sub_object_key = o.sub_object_key
LEFT JOIN version_union_tables_r02.products AS p ON
l.machine_key::text = p.machine_key::text AND l.product_key = p.product_key
LEFT JOIN version_union_tables_r02.species AS s ON
l.machine_key::text = s.machine_key::text AND l.species_group_key =
s.species_group_key
WHERE t4e_contractor_id = 'nicklas.aven@xxxxxxxxxxxxx';
results in this query plan:
Hash Left Join (cost=1780026.09..2023556.15 rows=4044 width=122)
(actual time=16336.200..16366.486 rows=250 loops=1)
Hash Cond: (((hl.machine_key)::text = (o.machine_key)::text) AND
(hl.object_key = o.object_key) AND (hl.sub_object_key = o.sub_object_key))
-> Merge Left Join (cost=1779946.65..2023340.22 rows=4044 width=69)
(actual time=16334.747..16364.834 rows=250 loops=1)
Merge Cond: ((hl.machine_key)::text = (s.machine_key)::text)
Join Filter: (hl.species_group_key = s.species_group_key)
Rows Removed by Join Filter: 2750
-> Merge Left Join (cost=1779915.71..2023136.40 rows=4044
width=67) (actual time=16334.145..16362.241 rows=250 loops=1)
Merge Cond: ((hl.machine_key)::text = (p.machine_key)::text)
Join Filter: (hl.product_key = p.product_key)
Rows Removed by Join Filter: 6250
-> Merge Left Join (cost=1779788.20..2022471.20
rows=4044 width=65) (actual time=16332.364..16356.313 rows=250 loops=1)
Merge Cond: ((hl.machine_key)::text =
(mi.machine_key)::text)
-> Merge Join (cost=1779783.74..2022437.81
rows=4044 width=48) (actual time=16332.238..16355.855 rows=250 loops=1)
Merge Cond: ((hl.machine_key)::text =
ci.machine_key)
-> GroupAggregate
(cost=1779782.72..2012287.44 rows=808712 width=69) (actual
time=15552.813..16354.893 rows=2683 loops=1)
Group Key: hl.machine_key,
hl.contractor_id, hl.operator_key, hl.object_key, hl.sub_object_key,
(date(hl.harvest_date)), hl.species_group_key, hl.product_key
-> Sort (cost=1779782.72..1800000.52
rows=8087121 width=61) (actual time=15552.795..15959.066 rows=942552
loops=1)
Sort Key: hl.machine_key,
hl.contractor_id, hl.operator_key, hl.object_key, hl.sub_object_key,
(date(hl.harvest_date)), hl.species_group_key, hl.product_key
Sort Method: external merge Disk:
543456kB
-> Seq Scan on harvester_logs
hl (cost=0.00..243781.01 rows=8087121 width=61) (actual
time=0.008..3221.502 rows=8084464 loops=1)
-> Sort (cost=1.02..1.03 rows=1 width=19)
(actual time=0.018..0.019 rows=1 loops=1)
Sort Key: ci.machine_key
Sort Method: quicksort Memory: 25kB
-> Seq Scan on contractor_access ci
(cost=0.00..1.01 rows=1 width=19) (actual time=0.008..0.009 rows=1 loops=1)
Filter: (t4e_contractor_id =
'nicklas.aven@xxxxxxxxxxxxx'::text)
-> Sort (cost=4.47..4.62 rows=62 width=38)
(actual time=0.112..0.197 rows=266 loops=1)
Sort Key: mi.machine_key
Sort Method: quicksort Memory: 30kB
-> Seq Scan on machine_info mi
(cost=0.00..2.62 rows=62 width=38) (actual time=0.008..0.032 rows=62
loops=1)
-> Sort (cost=127.50..131.23 rows=1491 width=35)
(actual time=1.353..3.404 rows=7204 loops=1)
Sort Key: p.machine_key
Sort Method: quicksort Memory: 175kB
-> Seq Scan on products p (cost=0.00..48.91
rows=1491 width=35) (actual time=0.005..0.556 rows=1491 loops=1)
-> Sort (cost=30.94..32.09 rows=460 width=37) (actual
time=0.436..1.363 rows=3259 loops=1)
Sort Key: s.machine_key
Sort Method: quicksort Memory: 65kB
-> Seq Scan on species s (cost=0.00..10.60 rows=460
width=37) (actual time=0.004..0.161 rows=460 loops=1)
-> Hash (cost=49.25..49.25 rows=1725 width=84) (actual
time=1.444..1.444 rows=1690 loops=1)
Buckets: 2048 Batches: 1 Memory Usage: 190kB
-> Seq Scan on objects o (cost=0.00..49.25 rows=1725
width=84) (actual time=0.004..0.656 rows=1725 loops=1)
Planning time: 0.653 ms
Execution time: 16428.966 ms