The GROUP BY was the fastest method. Thanks for the suggestions,

Keaton

On 2/15/08 3:12 PM, "Gregory Stark" <stark@xxxxxxxxxxxxxxxx> wrote:

> "Keaton Adams" <kadams@xxxxxxxxxxx> writes:
>
>> Version: Postgres 8.1.4
>> Platform: RHEL
>>
>> Given this scenario with the indexes in place, when I ask for the distinct
>> field1_id values, why does the optimizer choose a sequential scan instead of
>> just reading from the kda_log_fid_cre_20080123_idx index? The time it takes
>> to perform the sequential scan against 20+ million records is way too slow.
>
> Try (temporarily) doing:
>
>     SET enable_seqscan = off;
>
>> keaton=# explain select distinct field1_id into temp kda_temp from
>> kda_log_20080213;
>
> If the planner is right, that will be even slower. A full index scan
> requires a lot of random-access seeks; generally, the larger the table, the
> *more* likely it is that a sequential scan and sort beats using an index.
>
> If the planner is wrong and the index scan is faster, consider whether it's
> only faster because you've already read the table into cache. Will it be in
> cache in production? If so, then you might try raising effective_cache_size
> or lowering random_page_cost.
>
> Another thing to try is using GROUP BY instead of DISTINCT. This is one case
> where the Postgres optimizer doesn't handle the two equivalent queries in
> exactly the same way, and there are some plans available with one method
> that aren't available with the other. That's only likely to help if you have
> relatively few values of field1_id, but it's worth trying.
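
For reference, the rewrite the thread converged on can be sketched as follows. Table and column names are taken from the quoted query; the second temp-table name (kda_temp2) is made up here to avoid a collision, and actual plans will depend on your data and settings:

    -- DISTINCT form from the original post (planned as a sort + unique
    -- on 8.1, which forces sorting 20+ million rows):
    SELECT DISTINCT field1_id
      INTO TEMP kda_temp
      FROM kda_log_20080213;

    -- Equivalent GROUP BY form; the 8.1 planner can choose a
    -- HashAggregate for this, avoiding the sort entirely when
    -- field1_id has relatively few distinct values:
    SELECT field1_id
      INTO TEMP kda_temp2
      FROM kda_log_20080213
     GROUP BY field1_id;

Comparing `EXPLAIN ANALYZE` output for both forms should show whether the planner picked a HashAggregate for the GROUP BY version.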