We have two related tables with event types and events. We query for a join
between these two tables and experience that, when there is an
to-be-expected very small result set, this query performs particularly
poor. Understanding in this matter would be appreciated.
SELECT * from events_event_types WHERE id IN (71,999);
id | name | severity
----+------------------------+----------
71 | Xenteo Payment handled | 20
(1 row)
Following original query returns zero rows (as to be expected on what I
showed above) and takes (relatively) a lot of time doing so:
SELECT * FROM events_events LEFT OUTER JOIN events_event_types ON
eventType_id=events_event_types.id WHERE severity=70 AND (eventType_id IN
(71)) ORDER BY datetime DESC LIMIT 50;
id | carparkid | cleared | datetime | identity | generatedbystationid |
eventtype_id | relatedstationid | processingstatus | id | name | severity
----+-----------+---------+---
-------+----------+----------------------+--------------+------------------+------------------+----+------+----------
(0 rows)
Time: 397.564 ms
Following query is much alike the original query, but I changed the "WHERE
severity". It returns the number of rows are requested in LIMIT and takes
only little time doing so:
SELECT * FROM events_events LEFT OUTER JOIN events_event_types ON
eventType_id=events_event_types.id WHERE severity=20 AND (eventType_id IN
(71)) ORDER BY datetime DESC limit 50;
...
(50 rows)
Time: 1.604 ms
The latter much to prove that this is a problem related to small result
sets.
Following query is much alike the original query, although I've added a
dummy value (non-existent in event types table; "999") to the WHERE IN
clause. It returns the same zero rows and takes only little time doing so:
SELECT * FROM events_events LEFT OUTER JOIN events_event_types ON
eventType_id=events_event_types.id WHERE severity=70 AND (eventType_id IN
(71, 999)) ORDER BY datetime DESC LIMIT 50;
id | carparkid | cleared | datetime | identity | generatedbystationid |
eventtype_id | relatedstationid | processingstatus | id | name | severity
----+-----------+---------+----------+----------+----------------------+--------------+------------------+------------------+----+------+----------
(0 rows)
Time: 1.340 ms
Now I have at least two possibilities:
- Implementing the dummy value as shown above in my source code to improve
query performance (dirty but effective)
- Further investigating what is going on, which at this point is something
I need help with
Thanks for your assistance in this matter!
Following are a number of details to describe the environment that this is
seen in.
SELECT version();
PostgreSQL 8.3.7 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.2.4
(Ubuntu 4.2.4-1ubuntu3)
Postgres was installed as Debian package in Ubuntu 8.04 LTS.
SELECT count(*) FROM events_events;
7619991
SELECT count(*) FROM events_events WHERE eventtype_id=71;
50348
SELECT count(*) FROM events_event_types;
82
\d events_event_types
Table "public.events_event_types"
Column | Type | Modifiers
----------+------------------------+-----------------------------------------------------------------
id | bigint | not null default nextval
('events_event_types_id_seq'::regclass)
name | character varying(255) | not null
severity | bigint | not null
Indexes:
"events_event_types_pkey" PRIMARY KEY, btree (id)
"events_event_types_name_key" UNIQUE, btree (name)
"events_event_types_severity_ind" btree (severity)
"test_events_eventtypes_id_severity_ind" btree (id, severity)
"test_events_eventtypes_severity_id_ind" btree (severity, id)
\d events_events
Table "public.events_events"
Column | Type |
Modifiers
----------------------+--------------------------+------------------------------------------------------------
id | bigint | not null default nextval
('events_events_id_seq'::regclass)
carparkid | bigint |
cleared | boolean | not null
datetime | timestamp with time zone |
identity | character varying(255) |
generatedbystationid | bigint |
eventtype_id | bigint | not null
relatedstationid | bigint |
processingstatus | character varying(255) | not null
Indexes:
"events_events_pkey" PRIMARY KEY, btree (id)
"events_events_cleared_ind" btree (cleared)
"events_events_datetime_eventtype_id_ind" btree (datetime,
eventtype_id)
"events_events_datetime_ind" btree (datetime)
"events_events_eventtype_id_datetime_ind" btree (eventtype_id,
datetime)
"events_events_eventtype_id_ind" btree (eventtype_id)
"events_events_identity_ind" btree (identity)
"events_events_not_cleared_ind" btree (cleared) WHERE NOT cleared
"events_events_processingstatus_new" btree (processingstatus) WHERE
processingstatus::text = 'NEW'::text
"test2_events_events_eventtype_id_severity_ind" btree (datetime,
eventtype_id, cleared)
"test3_events_events_eventtype_id_severity_ind" btree (cleared,
datetime, eventtype_id)
"test4_events_events_eventtype_id_severity_ind" btree (datetime,
cleared, eventtype_id)
"test5_events_events_eventtype_id_severity_ind" btree (datetime,
cleared)
"test_events_events_eventtype_id_severity_ind" btree (eventtype_id,
cleared)
Foreign-key constraints:
"fk88fe3effa0559276" FOREIGN KEY (eventtype_id) REFERENCES
events_event_types(id)
Can someone explain this behaviour?
Thanks in advance!
Best regards,
(0 rows)
Time: 397.564 ms
Following query is much alike the original query, but I changed the "WHERE
severity". It returns the number of rows are requested in LIMIT and takes
only little time doing so:
SELECT * FROM events_events LEFT OUTER JOIN events_event_types ON
eventType_id=events_event_types.id WHERE severity=20 AND (eventType_id IN
(71)) ORDER BY datetime DESC limit 50;
...
(50 rows)
Time: 1.604 ms
The latter much to prove that this is a problem related to small result
sets.
Following query is much alike the original query, although I've added a
dummy value (non-existent in event types table; "999") to the WHERE IN
clause. It returns the same zero rows and takes only little time doing so:
SELECT * FROM events_events LEFT OUTER JOIN events_event_types ON
eventType_id=events_event_types.id WHERE severity=70 AND (eventType_id IN
(71, 999)) ORDER BY datetime DESC LIMIT 50;
id | carparkid | cleared | datetime | identity | generatedbystationid |
eventtype_id | relatedstationid | processingstatus | id | name | severity
----+-----------+---------+----------+----------+----------------------+--------------+------------------+------------------+----+------+----------
(0 rows)
Time: 1.340 ms
Now I have at least two possibilities:
- Implementing the dummy value as shown above in my source code to improve
query performance (dirty but effective)
- Further investigating what is going on, which at this point is something
I need help with
Thanks for your assistance in this matter!
Following are a number of details to describe the environment that this is
seen in.
SELECT version();
PostgreSQL 8.3.7 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.2.4
(Ubuntu 4.2.4-1ubuntu3)
Postgres was installed as Debian package in Ubuntu 8.04 LTS.
SELECT count(*) FROM events_events;
7619991
SELECT count(*) FROM events_events WHERE eventtype_id=71;
50348
SELECT count(*) FROM events_event_types;
82
\d events_event_types
Table "public.events_event_types"
Column | Type | Modifiers
----------+------------------------+-----------------------------------------------------------------
id | bigint | not null default nextval
('events_event_types_id_seq'::regclass)
name | character varying(255) | not null
severity | bigint | not null
Indexes:
"events_event_types_pkey" PRIMARY KEY, btree (id)
"events_event_types_name_key" UNIQUE, btree (name)
"events_event_types_severity_ind" btree (severity)
"test_events_eventtypes_id_severity_ind" btree (id, severity)
"test_events_eventtypes_severity_id_ind" btree (severity, id)
\d events_events
Table "public.events_events"
Column | Type |
Modifiers
----------------------+--------------------------+------------------------------------------------------------
id | bigint | not null default nextval
('events_events_id_seq'::regclass)
carparkid | bigint |
cleared | boolean | not null
datetime | timestamp with time zone |
identity | character varying(255) |
generatedbystationid | bigint |
eventtype_id | bigint | not null
relatedstationid | bigint |
processingstatus | character varying(255) | not null
Indexes:
"events_events_pkey" PRIMARY KEY, btree (id)
"events_events_cleared_ind" btree (cleared)
"events_events_datetime_eventtype_id_ind" btree (datetime,
eventtype_id)
"events_events_datetime_ind" btree (datetime)
"events_events_eventtype_id_datetime_ind" btree (eventtype_id,
datetime)
"events_events_eventtype_id_ind" btree (eventtype_id)
"events_events_identity_ind" btree (identity)
"events_events_not_cleared_ind" btree (cleared) WHERE NOT cleared
"events_events_processingstatus_new" btree (processingstatus) WHERE
processingstatus::text = 'NEW'::text
"test2_events_events_eventtype_id_severity_ind" btree (datetime,
eventtype_id, cleared)
"test3_events_events_eventtype_id_severity_ind" btree (cleared,
datetime, eventtype_id)
"test4_events_events_eventtype_id_severity_ind" btree (datetime,
cleared, eventtype_id)
"test5_events_events_eventtype_id_severity_ind" btree (datetime,
cleared)
"test_events_events_eventtype_id_severity_ind" btree (eventtype_id,
cleared)
Foreign-key constraints:
"fk88fe3effa0559276" FOREIGN KEY (eventtype_id) REFERENCES
events_event_types(id)
Can someone explain this behaviour?
Thanks in advance!
Best regards,
--
Squins | IT, Honestly
Oranjestraat 23
2983 HL Ridderkerk
The Netherlands
Phone: +31 (0)180 414520
Mobile: +31 (0)6 30413841
www.squins.com
http://twitter.com/keesvandieren
Chamber of commerce Rotterdam: 22048547