NestedLoops over BitmapScan question

Виктор Егоров <vyegorov@xxxxxxxxx> · Sat, 29 Sep 2012 02:11:58 +0300

Greetings.
I have a small monitoring query on the following tables:
select relname,relpages,reltuples::numeric(12) from pg_class where relname in ('meta_version','account') order by 1;
   relname    | relpages | reltuples 
--------------+----------+-----------
 account      |     3235 |    197723
 meta_version |   710068 |  32561200
(2 rows)

The logical “body” of the query is:
select count(*) from meta_version where account_id in (select account_id from account where customer_id = 8608064);

I know that due to the data distribution (above customer's accounts are used in 45% of the meta_version table) I

cannot expect fast results. But I have another question.

With default default_statistics_target I get the following plan:

http://explain.depesz.com/s/jri

In order to get better estimates, I've increased statistics targets to 200 for account.customer_id and meta_version.account_id.
Now I have the following plan:
http://explain.depesz.com/s/YZJ

Second query takes twice more time.
My questions are:

- why with better statistics planner chooses to do a SeqScan in favor of BitmapIndexScan inside the NestedLoops?
- is it possible to adjust this decision by changing other GUCs, perhaps costs?
- would it be correct to adjust seq_page_cost and random_page_cost based on the IOPS of the underlying disks?
  any other metrics should be considered?

I'm running on a:
            name            |                                                current_setting                                                
----------------------------+---------------------------------------------------------------------------------------------------------------

 version                    | PostgreSQL 9.1.6 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-52), 64-bit
 archive_command            | test ! -f $PG_WAL/%f && cp %p $PG_WAL/%f
 archive_mode               | on
 bgwriter_delay             | 50ms
 bgwriter_lru_maxpages      | 200
 checkpoint_segments        | 25
 checkpoint_timeout         | 30min
 client_encoding            | UTF8
 effective_cache_size       | 8GB
 hot_standby                | on
 lc_collate                 | en_US.UTF-8
 lc_ctype                   | en_US.UTF-8
 listen_addresses           | *
 log_checkpoints            | on
 log_connections            | on
 log_destination            | csvlog
 log_directory              | ../../log/CLUSTER
 log_disconnections         | on
 log_file_mode              | 0640
 log_filename               | pg-%Y%m%d_%H%M%S.log
 log_line_prefix            | %u:%d:%a:%h:%c:%x:%t> 
 log_lock_waits             | on
 log_min_duration_statement | 300ms
 log_rotation_age           | 1d
 log_rotation_size          | 0
 log_temp_files             | 20MB
 logging_collector          | on
 maintenance_work_mem       | 512MB
 max_connections            | 200
 max_prepared_transactions  | 0
 max_stack_depth            | 2MB
 max_wal_senders            | 2
 port                       | 9120
 server_encoding            | UTF8
 shared_buffers             | 5GB
 silent_mode                | on
 ssl                        | on
 ssl_renegotiation_limit    | 0
 tcp_keepalives_idle        | 0
 temp_buffers               | 256MB
 TimeZone                   | US/Eastern
 wal_buffers                | 512kB
 wal_keep_segments          | 0
 wal_level                  | hot_standby
 wal_sender_delay           | 1s
 work_mem                   | 32MB

Regards.

-- 
Victor Y. Yegorov