PFC,
thank you.
OK so vmstat says you are IO-bound, this seems logical if the same plan
has widely varying timings...
Let's look at the usual suspects :
- how many dead rows in your tables ? are your tables data, or bloat ?
(check vacuum verbose, etc)
set search_path to firma2,public;
vacuum verbose dok; vacuum verbose rid
INFO: vacuuming "firma2.dok"
INFO: index "dok_pkey" now contains 1235086 row versions in 9454 pages
DETAIL: 100 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.16s/0.38u sec elapsed 0.77 sec.
INFO: index "dok_dokumnr_idx" now contains 1235086 row versions in 9454
pages
DETAIL: 100 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.14s/0.40u sec elapsed 0.78 sec.
INFO: index "dok_klient_idx" now contains 1235086 row versions in 18147
pages
DETAIL: 887 index row versions were removed.
3265 index pages have been deleted, 3033 are currently reusable.
CPU 0.36s/0.46u sec elapsed 31.87 sec.
INFO: index "dok_krdokumnr_idx" now contains 1235086 row versions in 11387
pages
DETAIL: 119436 index row versions were removed.
1716 index pages have been deleted, 1582 are currently reusable.
CPU 0.47s/0.55u sec elapsed 63.38 sec.
INFO: index "dok_kuupaev_idx" now contains 1235101 row versions in 10766
pages
DETAIL: 119436 index row versions were removed.
659 index pages have been deleted, 625 are currently reusable.
CPU 0.62s/0.53u sec elapsed 40.20 sec.
INFO: index "dok_tasudok_idx" now contains 1235104 row versions in 31348
pages
DETAIL: 119436 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 1.18s/1.08u sec elapsed 118.97 sec.
INFO: index "dok_tasudok_unique_idx" now contains 99 row versions in 97
pages
DETAIL: 98 index row versions were removed.
80 index pages have been deleted, 80 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.48 sec.
INFO: index "dok_tasumata_idx" now contains 1235116 row versions in 11663
pages
DETAIL: 119436 index row versions were removed.
5340 index pages have been deleted, 5131 are currently reusable.
CPU 0.43s/0.56u sec elapsed 53.96 sec.
INFO: index "dok_tellimus_idx" now contains 1235122 row versions in 11442
pages
DETAIL: 119436 index row versions were removed.
1704 index pages have been deleted, 1569 are currently reusable.
CPU 0.45s/0.59u sec elapsed 76.91 sec.
INFO: index "dok_yksus_pattern_idx" now contains 1235143 row versions in
5549 pages
DETAIL: 119436 index row versions were removed.
529 index pages have been deleted, 129 are currently reusable.
CPU 0.19s/0.46u sec elapsed 2.72 sec.
INFO: index "dok_doktyyp" now contains 1235143 row versions in 3899 pages
DETAIL: 119436 index row versions were removed.
188 index pages have been deleted, 13 are currently reusable.
CPU 0.14s/0.44u sec elapsed 1.40 sec.
INFO: index "dok_sihtyksus_pattern_idx" now contains 1235143 row versions
in 5353 pages
DETAIL: 119436 index row versions were removed.
286 index pages have been deleted, 5 are currently reusable.
CPU 0.13s/0.45u sec elapsed 3.01 sec.
INFO: "dok": removed 119436 row versions in 13707 pages
DETAIL: CPU 0.80s/0.37u sec elapsed 14.15 sec.
INFO: "dok": found 119436 removable, 1235085 nonremovable row versions in
171641 pages
DETAIL: 2 dead row versions cannot be removed yet.
There were 1834279 unused item pointers.
0 pages are entirely empty.
CPU 6.56s/6.88u sec elapsed 450.54 sec.
INFO: vacuuming "pg_toast.pg_toast_40595"
INFO: index "pg_toast_40595_index" now contains 0 row versions in 1 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_40595": found 0 removable, 0 nonremovable row versions in 0
pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: vacuuming "firma2.rid"
INFO: index "rid_pkey" now contains 3275197 row versions in 13959 pages
DETAIL: 38331 index row versions were removed.
262 index pages have been deleted, 262 are currently reusable.
CPU 0.42s/1.05u sec elapsed 58.56 sec.
INFO: index "rid_dokumnr_idx" now contains 3275200 row versions in 14125
pages
DETAIL: 38331 index row versions were removed.
572 index pages have been deleted, 571 are currently reusable.
CPU 0.49s/1.14u sec elapsed 71.57 sec.
INFO: index "rid_inpdokumnr_idx" now contains 3275200 row versions in 15103
pages
DETAIL: 38331 index row versions were removed.
579 index pages have been deleted, 579 are currently reusable.
CPU 0.66s/1.03u sec elapsed 68.38 sec.
INFO: index "rid_toode_idx" now contains 3275224 row versions in 31094
pages
DETAIL: 38331 index row versions were removed.
2290 index pages have been deleted, 2290 are currently reusable.
CPU 1.39s/1.58u sec elapsed 333.82 sec.
INFO: index "rid_rtellimus_idx" now contains 3275230 row versions in 7390
pages
DETAIL: 18591 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.18s/0.66u sec elapsed 1.78 sec.
INFO: index "rid_toode_pattern_idx" now contains 3275230 row versions in
16310 pages
DETAIL: 17800 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.44s/1.04u sec elapsed 6.55 sec.
INFO: "rid": removed 38331 row versions in 3090 pages
DETAIL: CPU 0.20s/0.10u sec elapsed 5.49 sec.
INFO: "rid": found 38331 removable, 3275189 nonremovable row versions in
165282 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 1878923 unused item pointers.
0 pages are entirely empty.
CPU 5.06s/7.27u sec elapsed 607.59 sec.
Query returned successfully with no result in 1058319 ms.
- what's the size of the dataset relative to the RAM ?
Db size is 7417 MB
relevant table sizes in desc by size order:
1 40595 dok 2345 MB
2 1214 pg_shdepend 2259 MB
3 40926 rid 2057 MB
6 1232 pg_shdepend_depender_index 795 MB
7 1233 pg_shdepend_reference_index 438 MB
8 44286 dok_tasudok_idx 245 MB
9 44299 rid_toode_idx 243 MB
10 44283 dok_klient_idx 142 MB
11 19103791 rid_toode_pattern_idx 127 MB
14 44298 rid_inpdokumnr_idx 118 MB
15 44297 rid_dokumnr_idx 110 MB
16 43573 rid_pkey 109 MB
18 44288 dok_tasumata_idx 91 MB
19 44289 dok_tellimus_idx 89 MB
20 44284 dok_krdokumnr_idx 89 MB
21 44285 dok_kuupaev_idx 84 MB
23 43479 dok_pkey 74 MB
24 44282 dok_dokumnr_idx 74 MB
25 19076304 rid_rtellimus_idx 58 MB
26 18663923 dok_yksus_pattern_idx 43 MB
27 18801591 dok_sihtyksus_pattern_idx 42 MB
29 18774881 dok_doktyyp 30 MB
46 40967 toode 13 MB
server is HP Proliant DL320 G3
http://h18000.www1.hp.com/products/quickspecs/12169_ca/12169_ca.HTML
CPU is 2.93Ghz Celeron 256kb cache.
Server has 2 GB RAM.
It has SATA RAID 0,1 integrated controller (1.5Gbps) and SAMSUNG HD160JJ
mirrored disks.
Now let's look more closely at the query :
explain analyze
SELECT sum(1)
FROM dok JOIN rid USING (dokumnr)
JOIN toode USING (toode)
LEFT JOIN artliik using(grupp,liik)
WHERE rid.toode='X05' AND dok.kuupaev>='2008-09-01'
I presume doing the query without artliik changes nothing to the runtime,
yes ?
Yes. After removing artkliik from join I got response times 65 and 50
seconds, so this does not make difference.
Your problem here is that, no matter what, postgres will have to examine
- all rows where dok.kuupaev>='2008-09-01',
- and all rows where rid.toode = 'X05'.
If you use dok.kuupaev>='2007-09-01' (note : 2007) it will probably have
to scan many, many more rows.
Probably yes, since then it reads one year more sales data.
If you perform this query often you could CLUSTER rid on (toode) and dok
on (kuupaev), but this can screw other queries.
Some reports are by sales date (dok.kuupaev) and customers.
CLUSTER rid on (toode) slows them down. Also autovacuum cannot do
clustering.
What is the meaning of the columns ?
This is typical sales data:
-- Receipt headers:
DOK ( dokumnr INT SERIAL PRIMARY KEY,
kuupaev DATE --- sales date
)
-- Receipt details
RID ( dokumnr INT,
toode CHAR(20), -- item code
CONSTRAINT rid_dokumnr_fkey FOREIGN KEY (dokumnr) REFERENCES dok
(dokumnr),
CONSTRAINT rid_toode_fkey FOREIGN KEY (toode)
REFERENCES firma2.toode (toode)
)
-- Products
TOODE (
toode CHAR(20) PRIMARY KEY
)
To make this type of query faster I would tend to think about :
- denormalization (ie adding a column in one of your tables and a
multicolumn index)
For this query it is possible to duplicate kuupaev column to rid table.
However most of the this seems to go to scanning rid table, so I suspect
that this will help.
- materialized views
- materialized summary tables (ie. summary of sales for last month, for
instance)
There are about 1000 items and reports are different.
Andrus.
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance