On Sat, Mar 21, 2015 at 6:01 AM, Roland Dunn <roland.dunn@xxxxxxxxx> wrote:
Hi,
Wonder if anyone can help.
Have a lookup table whose primary key is the native uuid type
(filled with version-4 uuids), around 50m rows in size.
Have a separate table, table A, of similar size (around 50m rows).
The primary key in table A is the standard integer (nextval) type.
Table A also has a uuid column (native Postgres uuid type), with a
UNIQUE constraint backed by a btree (uuid) index.
Currently running the following set of queries regularly:
1. Pull around 10,000 rows from the lookup table.
2. Use the uuids from (1) to query table A (a rough sketch of this query follows below).
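For concreteness, a minimal sketch of what query (2) looks like (table and column names here are placeholders, not the real schema):

    SELECT a.*
    FROM table_a AS a
    WHERE a.uuid IN (
        -- the ~10,000 uuids pulled from the lookup table in step (1);
        -- "some_filter" is a placeholder for whatever selects those rows
        SELECT l.uuid
        FROM lookup_table AS l
        WHERE l.some_filter
    );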
Query (2) above is running slowly: typically around 40-50 seconds to
pull 8,000-10,000 rows, which is pretty slow. Table A has various
other columns (4 text fields and a couple of JSON fields), so each row
in table A is fairly "fat" (if that's the correct expression).
Hi Roland,
It's very likely that the query is IO-bound.
A typical single SATA drive can do around 100 random IOPS.
To fetch 10,000 randomly spread rows, the HDD therefore needs roughly 10,000 / 100 = ~100 seconds, which is in the same ballpark as your actual timings.
I suggest enabling track_io_timing in postgresql.conf, and then using EXPLAIN (ANALYZE, COSTS, BUFFERS, TIMING) instead of plain EXPLAIN ANALYZE. That will show you how much time is spent on IO operations.
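A rough sketch of what that looks like (same placeholder table names as in the query above):

    -- In postgresql.conf (then reload the server configuration):
    --     track_io_timing = on
    -- The "I/O Timings" lines in the plan output then show time spent waiting on reads.
    EXPLAIN (ANALYZE, COSTS, BUFFERS, TIMING)
    SELECT a.*
    FROM table_a AS a
    WHERE a.uuid IN (SELECT l.uuid FROM lookup_table AS l WHERE l.some_filter);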
If your load is indeed IO-bound, I can suggest three possible ways to make things better:
1) Use a good server-grade SSD instead of an HDD.
2) Increase the server's memory so the database fits comfortably in RAM (a quick check for this is sketched below).
3) Use RAID 10 with a good RAID controller and 6-12 SAS drives.
The database cannot retrieve rows any faster than the underlying file system can fetch the data from disk.
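If you want a quick sanity check on suggestion (2) first, something like the following shows the database size and the shared_buffers cache-hit ratio (a sketch only; it does not account for the OS page cache):

    -- How big is the database?
    SELECT pg_size_pretty(pg_database_size(current_database())) AS db_size;

    -- How often do reads miss PostgreSQL's buffer cache? (counters are cumulative)
    SELECT datname,
           blks_hit,
           blks_read,
           round(100.0 * blks_hit / nullif(blks_hit + blks_read, 0), 2) AS cache_hit_pct
    FROM pg_stat_database
    WHERE datname = current_database();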
--
Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/