I'm wondering (given what I have read about PG and IN), if there is a better way to execute this query. I have (essentially) two tables in a read-only DB, ip_info and events, where many events can map to one ip_info. I'd like to get all the ip_info.info columns where a specific event occurred. This is about what I have come up with (with indexes on all the columns): CREATE TABLE ip_info ( ip IP4, info VARCHAR ); CREATE TABLE events ( ip IP4, event_name VARCHAR, event_type VARCHAR); SELECT ip, info FROM ip_info WHERE ip IN (SELECT ip FROM events WHERE event_name = 'somename'); This works fine when there are few events named 'somename', however this subquery can return a fairly large set of rows (>1 million) and this query takes quite long to execute. Is there a better way to write this query? What would be the optimal plan for this query, given a roughly even distribution of event_name? My current plan looks something like (as I have partitioned the events table by ip): Hash Join Hash Cond ("outer".ip = "inner".ip) -> Seq Scan on ip_info -> Hash -> HashAggregate -> Append -> Index Scan using "events_ip_01_event_name_idx" on events_ip_01 events Index Cond ... -> Index Scan using "events_ip_02_event_name_idx" on events_ip_02 events Index Cond ... Is this the optimal plan for this query? BTW, ip_info has about 5M rows, and the collective events tables have about 50M rows. Also, slightly off-topic ... are there any optimizations that I can use to tell PG that this is a read-only database? PG 8.1.3 on RHEL4.3 x86_64 ... thinking about upgrading to 8.2 when I get the time. Thanks!