On Thu, 2005-11-10 at 12:23 -0500, Tom Lane wrote: > Apparently, you are using a platform and/or locale in which strcoll() is > spectacularly, god-awfully slow --- on the order of 10 msec per comparison. The version with the condition is definitely doing more I/O. The version without the condition doesn't read at all. I strace'd an explain analyze for each separately, and this is what I ended up with (the first is with the condition, the second is without): bash-2.05b$ cut '-d(' -f1 subsourcestrace | sort | uniq -c 7127 gettimeofday 75213 _llseek 1 Process 30227 attached - interrupt to quit 1 Process 30227 detached 148671 read 2 recv 4 semop 4 send bash-2.05b$ cut '-d(' -f1 subsourcestrace-nocond | sort | uniq -c 9103 gettimeofday 7 _llseek 1 Process 30227 attached - interrupt to quit 1 Process 30227 detached 2 recv 4 send For the moment, all of the rows in the view I'm selecting from satisfy the condition, so the output of both queries is the same. The relevant rows of the underlying tables are probably pretty contiguous (all of the rows satisfying the condition and the join were inserted at the same time). Could it just be the result of a weird physical distribution of data in the table/index files? For the fast query, the actual number of rows is a lot less than the planner expects. > This is a bit hard to believe but I can't make sense of those numbers > any other way. What is the platform exactly, and what database locale > and encoding are you using? It's RHEL 3 on x86: [root@rehoboam root]# uname -a Linux rehoboam 2.4.21-32.0.1.ELsmp #1 SMP Tue May 17 17:52:23 EDT 2005 i686 i686 i386 GNU/Linux The glibc version is 2.3.2. statgen=# select current_setting('lc_collate'); current_setting ----------------- en_US.UTF-8 Not sure what's relevant, but here's some more info: The machine has 4.5GiB of RAM and a 5-disk Raid 5. It's a dual xeon 3.2ghz. relname | relpages | reltuples -----------------------------+----------+------------- external_id_map | 126883 | 1.55625e+07 external_id_map_primary_key | 64607 | 1.55625e+07 subject | 31 | 1186 subject_pkey | 19 | 1186 I've attached the output of "select name, setting from pg_settings". And, in case my original message isn't handy, the explain analyze output and table/view info is below. Thanks for taking a look, Mitch statgen=> explain analyze select * from subject_source; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Merge Join (cost=0.00..330.72 rows=1186 width=46) (actual time=0.051..8.890 rows=1186 loops=1) Merge Cond: ("outer".id = "inner".target_id) -> Index Scan using subject_pkey on subject norm (cost=0.00..63.36 rows=1186 width=28) (actual time=0.022..1.441 rows=1186 loops=1) -> Index Scan using external_id_map_primary_key on external_id_map eim (cost=0.00..2485226.70 rows=15562513 width=26) (actual time=0.016..2.532 rows=2175 loops=1) Total runtime: 9.592 ms (5 rows) statgen=> explain analyze select * from subject_source where source='SCH'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Merge Join (cost=0.00..1147.33 rows=1 width=46) (actual time=0.054..20258.161 rows=1186 loops=1) Merge Cond: ("outer".id = "inner".target_id) -> Index Scan using subject_pkey on subject norm (cost=0.00..63.36 rows=1186 width=28) (actual time=0.022..1.478 rows=1186 loops=1) -> Index Scan using external_id_map_primary_key on external_id_map eim (cost=0.00..2524132.99 rows=2335 width=26) (actual time=0.022..20252.326 rows=1186 loops=1) Filter: (source = 'SCH'::bpchar) Total runtime: 20258.922 ms (6 rows) statgen=> \d subject_source View "public.subject_source" Column | Type | Modifiers -----------+-----------------------+----------- id | bigint | sex | integer | parent1 | bigint | parent2 | bigint | source | character(3) | source_id | character varying(32) | View definition: SELECT norm.id, norm.sex, norm.parent1, norm.parent2, eim.source, eim.source_id FROM subject norm JOIN util.external_id_map eim ON norm.id = eim.target_id; statgen=> \d subject Table "public.subject" Column | Type | Modifiers ---------+---------+----------- id | bigint | not null sex | integer | parent1 | bigint | parent2 | bigint | Indexes: "subject_pkey" PRIMARY KEY, btree (id) Foreign-key constraints: "subject_parent1" FOREIGN KEY (parent1) REFERENCES subject(id) DEFERRABLE INITIALLY DEFERRED "subject_parent2" FOREIGN KEY (parent2) REFERENCES subject(id) DEFERRABLE INITIALLY DEFERRED "subject_id_map" FOREIGN KEY (id) REFERENCES util.external_id_map(target_id) DEFERRABLE INITIALLY DEFERRED statgen=> \d util.external_id_map Table "util.external_id_map" Column | Type | Modifiers -----------+-----------------------+----------- source_id | character varying(32) | not null source | character(3) | not null target_id | bigint | not null Indexes: "external_id_map_primary_key" PRIMARY KEY, btree (target_id) "external_id_map_source_source_id_unique" UNIQUE, btree (source, source_id) "external_id_map_source" btree (source) "external_id_map_source_target_id" btree (source, target_id) Foreign-key constraints: "external_id_map_source" FOREIGN KEY (source) REFERENCES util.source(id)
name | setting --------------------------------+---------------- add_missing_from | on archive_command | unset australian_timezones | off authentication_timeout | 60 bgwriter_delay | 200 bgwriter_maxpages | 100 bgwriter_percent | 1 block_size | 8192 check_function_bodies | on checkpoint_segments | 32 checkpoint_timeout | 300 checkpoint_warning | 60 client_encoding | SQL_ASCII client_min_messages | notice commit_delay | 0 commit_siblings | 5 cpu_index_tuple_cost | 0.001 cpu_operator_cost | 0.0025 cpu_tuple_cost | 0.01 custom_variable_classes | unset DateStyle | ISO, MDY db_user_namespace | off deadlock_timeout | 1000 debug_pretty_print | off debug_print_parse | off debug_print_plan | off debug_print_rewritten | off debug_shared_buffers | 0 default_statistics_target | 10 default_tablespace | unset default_transaction_isolation | read committed default_transaction_read_only | off default_with_oids | off effective_cache_size | 65536 enable_hashagg | on enable_hashjoin | on enable_indexscan | on enable_mergejoin | on enable_nestloop | on enable_seqscan | on enable_sort | on enable_tidscan | on explain_pretty_print | on extra_float_digits | 0 from_collapse_limit | 8 fsync | on geqo | on geqo_effort | 5 geqo_generations | 0 geqo_pool_size | 0 geqo_selection_bias | 2 geqo_threshold | 12 integer_datetimes | off join_collapse_limit | 8 lc_collate | en_US.UTF-8 lc_ctype | en_US.UTF-8 lc_messages | en_US.UTF-8 lc_monetary | en_US.UTF-8 lc_numeric | en_US.UTF-8 lc_time | en_US.UTF-8 listen_addresses | * log_connections | off log_destination | stderr log_disconnections | off log_duration | off log_error_verbosity | default log_executor_stats | off log_hostname | off log_line_prefix | unset log_min_duration_statement | -1 log_min_error_statement | panic log_min_messages | notice log_parser_stats | off log_planner_stats | off log_rotation_age | 1440 log_rotation_size | 0 log_statement | none log_statement_stats | off log_truncate_on_rotation | on maintenance_work_mem | 65536 max_connections | 100 max_files_per_process | 1000 max_fsm_pages | 20000 max_fsm_relations | 1000 max_function_args | 32 max_identifier_length | 63 max_index_keys | 32 max_locks_per_transaction | 64 max_stack_depth | 2048 password_encryption | on port | 5432 pre_auth_delay | 0 random_page_cost | 4 redirect_stderr | on regex_flavor | advanced rendezvous_name | unset search_path | $user,public server_encoding | SQL_ASCII server_version | 8.0.3 shared_buffers | 32768 silent_mode | off sql_inheritance | on ssl | off statement_timeout | 0 stats_block_level | off stats_command_string | on stats_reset_on_server_start | on stats_row_level | off stats_start_collector | on superuser_reserved_connections | 2 syslog_facility | LOCAL0 syslog_ident | postgres TimeZone | PST8PDT trace_notify | off transaction_isolation | read committed transaction_read_only | off transform_null_equals | off unix_socket_group | unset unix_socket_permissions | 511 vacuum_cost_delay | 0 vacuum_cost_limit | 200 vacuum_cost_page_dirty | 20 vacuum_cost_page_hit | 1 vacuum_cost_page_miss | 10 wal_buffers | 8 wal_sync_method | fdatasync work_mem | 65536 zero_damaged_pages | off (128 rows)
---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match