Greetings! I managed to get an even simpler example: select * from eventlog.record_main where luid in ( select luid from eventlog_partition._20050706__raw_record order by luid limit 1 ) ; If I use <=> rather than <in>, postgresql uses index scanning. As soon as I use <in> (ie, a set to compare against), I get sequential scanning, event though the set size is only a single element. Regards! Ed On Fri, 29 Jul 2005, Edmund Dengler wrote: > Greetings! > > I am using <inherits> to partition several tables. When I perform a query > on another table, and then try to join against an inherited table set, the > optimizer does not use any indexes to perform the join. > > This is 8.0.1. Is this fixed in 8.0.2 or 8.0.3? > > The query: > > explain > select * > from ( > select * from eventlog.record_classification as record_classification > where > time_written >= '2005-07-06 00:00:00+00'::timestamptz > and time_written < '2005-07-06 00:00:00+00'::timestamptz > order by time_written, luid > offset 0 > limit 500 > ) as classification > join eventlog.record_main as main using (luid, time_written) > ; > > The explanation: > > QUERY PLAN > ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Hash Join (cost=51.15..20191003.89 rows=208027 width=178) > Hash Cond: (("outer".luid = "inner".luid) AND ("outer".time_written = "inner".time_written)) > -> Append (cost=0.00..14641488.64 rows=554738383 width=96) > -> Seq Scan on record_main main (cost=0.00..0.00 rows=1 width=96) > -> Seq Scan on _20050723__record_main main (cost=0.00..94078.62 rows=3564462 width=96) > -> Seq Scan on _20050724__record_main main (cost=0.00..110075.12 rows=4170512 width=96) > -> Seq Scan on _20050725__record_main main (cost=0.00..122836.02 rows=4654002 width=96) > -> Seq Scan on _20050726__record_main main (cost=0.00..142347.71 rows=5393271 width=96) > -> Seq Scan on _20050727__record_main main (cost=0.00..130858.80 rows=4957980 width=96) > .... > (and so on, currently 123 such inheritd tables) > .... > -> Hash (cost=51.07..51.07 rows=15 width=98) > -> Subquery Scan classification (cost=50.89..51.07 rows=15 width=98) > -> Limit (cost=50.89..50.92 rows=15 width=98) > -> Sort (cost=50.89..50.92 rows=15 width=98) > Sort Key: record_classification.time_written, record_classification.luid > -> Result (cost=0.00..50.59 rows=15 width=98) > -> Append (cost=0.00..50.59 rows=15 width=98) > -> Seq Scan on record_classification (cost=0.00..0.00 rows=1 width=98) > Filter: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone)) > -> Index Scan using _20050705__record_classification_time_written_idx on _20050705__record_classification record_classification (cost=0.00..3.46 rows=1 width=54) > Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone)) > -> Index Scan using _20050701__record_classification_time_written_idx on _20050701__record_classification record_classification (cost=0.00..3.59 rows=1 width=54) > Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone)) > -> Index Scan using _20050702__record_classification_time_written_idx on _20050702__record_classification record_classification (cost=0.00..3.69 rows=1 width=54) > Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone)) > -> Index Scan using _20050703__record_classification_time_written_idx on _20050703__record_classification record_classification (cost=0.00..3.70 rows=1 width=54) > Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone)) > .... > (and so on) > .... > -> Index Scan using _20050714__record_classification_time_written_idx on _20050714__record_classification record_classification (cost=0.00..3.69 rows=1 width=53) > Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone)) > (164 rows) > > Sample tables: > > eventlog=# \d eventlog_partition._20050723__record_main > Table "eventlog_partition._20050723__record_main" > Column | Type | Modifiers > ----------------------+--------------------------+----------- > luid | bigint | not null > host_luid | integer | not null > log_luid | integer | not null > time_logged | timestamp with time zone | not null > record_number | bigint | not null > time_generated_epoch | bigint | not null > time_generated | timestamp with time zone | not null > time_written_epoch | bigint | not null > time_written | timestamp with time zone | not null > event_id | bigint | not null > event_type | integer | not null > event_category | integer | not null > source_luid | integer | not null > computer_luid | integer | not null > sid_luid | integer | > message_luid | integer | > Indexes: > "_20050723__record_main_message_idx" UNIQUE, btree (message_luid, luid) > "_20050723__record_main_sid_idx" UNIQUE, btree (sid_luid, luid) > "_20050723__record_main_time_generated_idx" UNIQUE, btree (time_generated, luid) > "_20050723__record_main_time_logged_idx" UNIQUE, btree (time_logged, luid) > "_20050723__record_main_time_written_idx" UNIQUE, btree (time_written, luid) > "_20050723__record_main_pkey" btree (luid) > Inherits: record_main > > > eventlog=# \d eventlog.record_classification > Table "eventlog.record_classification" > Column | Type | Modifiers > ----------------+--------------------------+----------- > luid | bigint | not null > class_luid | integer | not null > time_written | timestamp with time zone | not null > account_luid | integer | > group_luid | integer | > caller_luid | integer | > source_machine | character varying(30) | > source_ip | character varying(30) | > Indexes: > "record_classification_pkey" PRIMARY KEY, btree (luid) > "record_classification_account_idx" btree (account_luid, time_written) > "record_classification_caller_idx" btree (caller_luid, time_written) > "record_classification_class_idx" btree (class_luid, time_written) > "record_classification_group_idx" btree (group_luid, time_written) > "record_classification_ip_idx" btree (source_ip, time_written) > "record_classification_machine_idx" btree (source_machine, time_written) > Foreign-key constraints: > "record_classification_class_luid_fkey" FOREIGN KEY (class_luid) REFERENCES eventlog.classification(luid) > "record_classification_account_luid_fkey" FOREIGN KEY (account_luid) REFERENCES eventlog.account(luid) > "record_classification_group_luid_fkey" FOREIGN KEY (group_luid) REFERENCES eventlog.account(luid) > "record_classification_caller_luid_fkey" FOREIGN KEY (caller_luid) REFERENCES eventlog.account(luid) > > > <luid> represents a unique key. I have tried the query using just <luid> > as the join condition, same result. > > The system can use the <_XXXXXXXX__record_main_pkey> index on each > <_XXXXXXXX__record_main> table to do an index scan on <luid>, but instead > it chooses to do a sequential scan. This is true whether > <_XXXXXXXX__record_main_pkey> is specified as unique or not. > > I have "vacuum analyze" the entire database before running the queries. I > have set the <default_statistics_target> in postgresql.conf to 100 to > obtain more accurate statistics. > > If I specify a specific sub-table (ie, > <eventlog_partition._XXXXXXXX__record_main>), in the join, the optimizer > uses the indexes to speed search. > > Any ideas? > > Regards! > Ed > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@xxxxxxxxxxxxxx so that your > message can get through to the mailing list cleanly > ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@xxxxxxxxxxxxxx so that your message can get through to the mailing list cleanly