Re: Failure to use indexes

Edmund Dengler <edmundd@xxxxxxxxxxxx> · Fri, 29 Jul 2005 14:23:29 -0400 (EDT)

Greetings!

I managed to get an even simpler example:

select *
from eventlog.record_main
where luid in (
    select luid from eventlog_partition._20050706__raw_record
    order by luid limit 1
  )
;

If I use <=> rather than <in>, postgresql uses index scanning. As soon as
I use <in> (ie, a set to compare against), I get sequential scanning,
event though the set size is only a single element.

Regards!
Ed

On Fri, 29 Jul 2005, Edmund Dengler wrote:

> Greetings!
>
> I am using <inherits> to partition several tables. When I perform a query
> on another table, and then try to join against an inherited table set, the
> optimizer does not use any indexes to perform the join.
>
> This is 8.0.1. Is this fixed in 8.0.2 or 8.0.3?
>
> The query:
>
> explain
> select *
> from (
>     select *  from eventlog.record_classification as record_classification
>     where
>       time_written >= '2005-07-06 00:00:00+00'::timestamptz
>       and time_written < '2005-07-06 00:00:00+00'::timestamptz
>     order by time_written, luid
>     offset  0
>     limit  500
>   ) as classification
>   join eventlog.record_main as main using (luid, time_written)
> ;
>
> The explanation:
>
>                                                                                                 QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Hash Join  (cost=51.15..20191003.89 rows=208027 width=178)
>    Hash Cond: (("outer".luid = "inner".luid) AND ("outer".time_written = "inner".time_written))
>    ->  Append  (cost=0.00..14641488.64 rows=554738383 width=96)
>          ->  Seq Scan on record_main main  (cost=0.00..0.00 rows=1 width=96)
>          ->  Seq Scan on _20050723__record_main main  (cost=0.00..94078.62 rows=3564462 width=96)
>          ->  Seq Scan on _20050724__record_main main  (cost=0.00..110075.12 rows=4170512 width=96)
>          ->  Seq Scan on _20050725__record_main main  (cost=0.00..122836.02 rows=4654002 width=96)
>          ->  Seq Scan on _20050726__record_main main  (cost=0.00..142347.71 rows=5393271 width=96)
>          ->  Seq Scan on _20050727__record_main main  (cost=0.00..130858.80 rows=4957980 width=96)
>          ....
>          (and so on, currently 123 such inheritd tables)
>          ....
>    ->  Hash  (cost=51.07..51.07 rows=15 width=98)
>          ->  Subquery Scan classification  (cost=50.89..51.07 rows=15 width=98)
>                ->  Limit  (cost=50.89..50.92 rows=15 width=98)
>                      ->  Sort  (cost=50.89..50.92 rows=15 width=98)
>                            Sort Key: record_classification.time_written, record_classification.luid
>                            ->  Result  (cost=0.00..50.59 rows=15 width=98)
>                                  ->  Append  (cost=0.00..50.59 rows=15 width=98)
>                                        ->  Seq Scan on record_classification  (cost=0.00..0.00 rows=1 width=98)
>                                              Filter: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
>                                        ->  Index Scan using _20050705__record_classification_time_written_idx on _20050705__record_classification record_classification  (cost=0.00..3.46 rows=1 width=54)
>                                              Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
>                                        ->  Index Scan using _20050701__record_classification_time_written_idx on _20050701__record_classification record_classification  (cost=0.00..3.59 rows=1 width=54)
>                                              Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
>                                        ->  Index Scan using _20050702__record_classification_time_written_idx on _20050702__record_classification record_classification  (cost=0.00..3.69 rows=1 width=54)
>                                              Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
>                                        ->  Index Scan using _20050703__record_classification_time_written_idx on _20050703__record_classification record_classification  (cost=0.00..3.70 rows=1 width=54)
>                                              Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
>                                        ....
>                                        (and so on)
>                                        ....
>                                        ->  Index Scan using _20050714__record_classification_time_written_idx on _20050714__record_classification record_classification  (cost=0.00..3.69 rows=1 width=53)
>                                              Index Cond: ((time_written >= '2005-07-06 00:00:00+00'::timestamp with time zone) AND (time_written < '2005-07-06 00:00:00+00'::timestamp with time zone))
> (164 rows)
>
> Sample tables:
>
> eventlog=# \d eventlog_partition._20050723__record_main
>       Table "eventlog_partition._20050723__record_main"
>         Column        |           Type           | Modifiers
> ----------------------+--------------------------+-----------
>  luid                 | bigint                   | not null
>  host_luid            | integer                  | not null
>  log_luid             | integer                  | not null
>  time_logged          | timestamp with time zone | not null
>  record_number        | bigint                   | not null
>  time_generated_epoch | bigint                   | not null
>  time_generated       | timestamp with time zone | not null
>  time_written_epoch   | bigint                   | not null
>  time_written         | timestamp with time zone | not null
>  event_id             | bigint                   | not null
>  event_type           | integer                  | not null
>  event_category       | integer                  | not null
>  source_luid          | integer                  | not null
>  computer_luid        | integer                  | not null
>  sid_luid             | integer                  |
>  message_luid         | integer                  |
> Indexes:
>     "_20050723__record_main_message_idx" UNIQUE, btree (message_luid, luid)
>     "_20050723__record_main_sid_idx" UNIQUE, btree (sid_luid, luid)
>     "_20050723__record_main_time_generated_idx" UNIQUE, btree (time_generated, luid)
>     "_20050723__record_main_time_logged_idx" UNIQUE, btree (time_logged, luid)
>     "_20050723__record_main_time_written_idx" UNIQUE, btree (time_written, luid)
>     "_20050723__record_main_pkey" btree (luid)
> Inherits: record_main
>
>
> eventlog=# \d eventlog.record_classification
>         Table "eventlog.record_classification"
>      Column     |           Type           | Modifiers
> ----------------+--------------------------+-----------
>  luid           | bigint                   | not null
>  class_luid     | integer                  | not null
>  time_written   | timestamp with time zone | not null
>  account_luid   | integer                  |
>  group_luid     | integer                  |
>  caller_luid    | integer                  |
>  source_machine | character varying(30)    |
>  source_ip      | character varying(30)    |
> Indexes:
>     "record_classification_pkey" PRIMARY KEY, btree (luid)
>     "record_classification_account_idx" btree (account_luid, time_written)
>     "record_classification_caller_idx" btree (caller_luid, time_written)
>     "record_classification_class_idx" btree (class_luid, time_written)
>     "record_classification_group_idx" btree (group_luid, time_written)
>     "record_classification_ip_idx" btree (source_ip, time_written)
>     "record_classification_machine_idx" btree (source_machine, time_written)
> Foreign-key constraints:
>     "record_classification_class_luid_fkey" FOREIGN KEY (class_luid) REFERENCES eventlog.classification(luid)
>     "record_classification_account_luid_fkey" FOREIGN KEY (account_luid) REFERENCES eventlog.account(luid)
>     "record_classification_group_luid_fkey" FOREIGN KEY (group_luid) REFERENCES eventlog.account(luid)
>     "record_classification_caller_luid_fkey" FOREIGN KEY (caller_luid) REFERENCES eventlog.account(luid)
>
>
> <luid> represents a unique key. I have tried the query using just <luid>
> as the join condition, same result.
>
> The system can use the <_XXXXXXXX__record_main_pkey> index on each
> <_XXXXXXXX__record_main> table to do an index scan on <luid>, but instead
> it chooses to do a sequential scan.  This is true whether
> <_XXXXXXXX__record_main_pkey> is specified as unique or not.
>
> I have "vacuum analyze" the entire database before running the queries. I
> have set the <default_statistics_target> in postgresql.conf to 100 to
> obtain more accurate statistics.
>
> If I specify a specific sub-table (ie,
> <eventlog_partition._XXXXXXXX__record_main>), in the join, the optimizer
> uses the indexes to speed search.
>
> Any ideas?
>
> Regards!
> Ed
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@xxxxxxxxxxxxxx so that your
>        message can get through to the mailing list cleanly
>

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to majordomo@xxxxxxxxxxxxxx so that your
       message can get through to the mailing list cleanly