We have a medium-sized catalog (about 5 million rows), but some of our customers only want to see portions of it. I've been experimenting with a customer-specific schema that contains nothing but a "join table" -- just the primary keys of that portion of the data that each customer wants to see, which is used to create a view that looks like the original table. But the most important query, the one that customers use to scan page-by-page through search results, turns out to be far too slow (65 seconds versus 55 milliseconds). Below are the results of two explain/analyze statements. The first one uses the view, the second one goes directly to the original tables. I thought this would be a slam-dunk, that it would return results in a flash because the view is created from two tables with the same primary keys. My guess (and it's just a wild guess) is that the "left join" is forcing a sequence scan or something. But we need the left join, because it's on a "hitlist" that recorded all the matches to a customer's earlier query, and if rows have been removed from the tables, the customer needs to see a blank row. Here is the "bad" query, which is run on the view: em=> explain analyze select version.version_id, version.isosmiles from hitlist_rows_reset_140 left join version on (hitlist_rows_reset_140.objectid = version.version_id) where hitlist_rows_reset_140.sortorder >= 1 and hitlist_rows_reset_140.sortorder <= 10 order by hitlist_rows_reset_140.sortorder; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------- ------------------------------ Nested Loop Left Join (cost=23687.51..215315.74 rows=1 width=54) (actual time=2682.662..63680.076 rows=10 loops=1) Join Filter: (hitlist_rows_reset_140.objectid = v.version_id) -> Index Scan using hitlist_rows_reset_140_pkey on hitlist_rows_reset_140 (cost=0.00..8.36 rows=1 width=8) (actual time= 0.015..0.049 rows=10 loops=1) Index Cond: ((sortorder >= 1) AND (sortorder <= 10)) -> Hash Join (cost=23687.51..204666.54 rows=851267 width=50) (actual time=31.829..6263.403 rows=851267 loops=10) Hash Cond: (v.version_id = mv.version_id) -> Seq Scan on version v (cost=0.00..116146.68 rows=5631968 width=50) (actual time=0.006..859.758 rows=5632191 loo ps=10) -> Hash (cost=13046.67..13046.67 rows=851267 width=4) (actual time=317.488..317.488 rows=851267 loops=1) -> Seq Scan on my_version mv (cost=0.00..13046.67 rows=851267 width=4) (actual time=2.888..115.166 rows=8512 67 loops=1) Total runtime: 63680.162 ms Here is the "good" query, which is run directly on the data tables. em=> explain analyze select registry.version.version_id, registry.version.isosmiles from hitlist_rows_reset_140 left join registry.version on (hitlist_rows_reset_140.objectid = registry.version.version_id) where hitlist_rows_reset_140.sortorder >= 1 and hitlist_rows_reset_140.sortorder <= 10 order by hitlist_rows_reset_140.SortOrder; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------- ------------------------------ Nested Loop Left Join (cost=0.00..17.73 rows=1 width=54) (actual time=36.022..55.558 rows=10 loops=1) -> Index Scan using hitlist_rows_reset_140_pkey on hitlist_rows_reset_140 (cost=0.00..8.36 rows=1 width=8) (actual time= 0.021..0.025 rows=10 loops=1) Index Cond: ((sortorder >= 1) AND (sortorder <= 10)) -> Index Scan using version_pkey on version (cost=0.00..9.35 rows=1 width=50) (actual time=5.551..5.552 rows=1 loops=10) Index Cond: (hitlist_rows_reset_140.objectid = version.version_id) Total runtime: 55.608 ms (6 rows) The view is defined like this: em=> \d my_version Table "test_schema.my_version" Column | Type | Modifiers ------------+---------+----------- version_id | integer | not null Indexes: "my_version_pkey" PRIMARY KEY, btree (version_id) em=> \d version View "test_schema.version" Column | Type | Modifiers ------------+---------+----------- version_id | integer | parent_id | integer | isosmiles | text | coord_2d | text | View definition: SELECT v.version_id, v.parent_id, v.isosmiles, v.coord_2d FROM registry.version v JOIN my_version mv ON mv.version_id = v.version_id; This is: Postgres 8.4.4 Ubuntu Linux 2.6.32-27 Database: 8x7200 RAID 10, LSI RAID controller with BBU WAL: 2x7200 RAID1 Non-default config parameters: max_connections = 500 shared_buffers = 1000MB work_mem = 128MB synchronous_commit = off full_page_writes = off wal_buffers = 256kB checkpoint_segments = 30 effective_cache_size = 4GB track_activities = on track_counts = off track_functions = none escape_string_warning = off Thanks, Craig -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance