Thanks for the replies. More detail and data below:

Table: "lookup"
    uuid: type uuid. not null. plain storage.
    datetime_stamp: type bigint. not null. plain storage.
    harvest_date_stamp: type bigint. not null. plain storage.
    state: type smallint. not null. plain storage.
Indexes:
    "lookup_pkey" PRIMARY KEY, btree (uuid)
    "lookup_32ff3898" btree (datetime_stamp)
    "lookup_6c8369bc" btree (harvest_date_stamp)
    "lookup_9ed39e2e" btree (state)
Has OIDs: no

Table: "article_data"
    id: type integer. not null default nextval('article_data_id_seq'::regclass). plain storage.
    title: type text.
    text: type text.
    insertion_date: type date.
    harvest_date: type timestamp with time zone.
    uuid: type uuid.
Indexes:
    "article_data_pkey" PRIMARY KEY, btree (id)
    "article_data_uuid_key" UNIQUE CONSTRAINT, btree (uuid)
Has OIDs: no

Both lookup and article_data have around 65m rows.

Two queries:

(1) SELECT uuid FROM lookup WHERE state = 200 LIMIT 4000;

OUTPUT FROM EXPLAIN (ANALYZE, BUFFERS):
------------------------------------------------
 Limit  (cost=0.00..4661.02 rows=4000 width=16) (actual time=0.009..1.036 rows=4000 loops=1)
   Buffers: shared hit=42
   ->  Seq Scan on lookup  (cost=0.00..1482857.00 rows=1272559 width=16) (actual time=0.008..0.777 rows=4000 loops=1)
         Filter: (state = 200)
         Rows Removed by Filter: 410
         Buffers: shared hit=42
 Total runtime: 1.196 ms
(7 rows)

Question: Why does this do a sequential scan and not an index scan, when there is a btree index on state?

(2) SELECT article_data.id, article_data.uuid, article_data.title, article_data.text
    FROM article_data
    WHERE uuid = ANY ('{f0d5e665-4f21-4337-a54b-cf0b4757db65,..... 3999 more uuid's ....}'::uuid[]);

OUTPUT FROM EXPLAIN (ANALYZE, BUFFERS):
------------------------------------------------
 Index Scan using article_data_uuid_key on article_data  (cost=5.56..34277.00 rows=4000 width=581) (actual time=0.063..66029.031 rows=4000 loops=1)
   Index Cond: (uuid = ANY ('{f0d5e665-4f21-4337-a54b-cf0b4757db65,5618754f-544b-4700-9d24-c364fd0ba4e9,958e37e3-6e6e-4b2a-b854-48e88ac1fdb7,ba56b483-59b2-4ae5-ae44-910401f3221b, ... remaining uuids omitted ... }'::uuid[]))
   Buffers: shared hit=16060 read=4084 dirtied=292
 Total runtime: 66041.443 ms
(4 rows)

Question: Why is this so slow? Is it because it's reading from disk (shared read=4084)?
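(A couple of diagnostic sketches, in case they help. For (1): the planner estimates ~1.27m of the 65m rows match state = 200, so with LIMIT 4000 it expects a sequential scan to hit 4000 matches almost immediately, which the 42 buffers confirm; an index scan would cost more here. A session-local way to compare against the index plan, purely as a diagnostic and not a production setting:)

    SET enable_seqscan = off;  -- session-local: makes seq scans look very expensive to the planner
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT uuid FROM lookup WHERE state = 200 LIMIT 4000;
    RESET enable_seqscan;      -- restore the default

(For (2), if track_io_timing can be turned on, a superuser-only setting, repeating the EXPLAIN with it enabled adds an "I/O Timings" line, which would show whether the 66 seconds is genuinely spent waiting on the ~4000 scattered page reads:)

    SET track_io_timing = on;  -- superuser-only; EXPLAIN then reports time spent waiting on I/O
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT article_data.id, article_data.uuid, article_data.title, article_data.text
    FROM article_data
    WHERE uuid = ANY ('{f0d5e665-4f21-4337-a54b-cf0b4757db65}'::uuid[]);  -- plus the other uuids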
On 24 March 2015 at 07:49, David Rowley <dgrowleyml@xxxxxxxxx> wrote:
> On 21 March 2015 at 23:34, Roland Dunn <roland.dunn@xxxxxxxxx> wrote:
>>
>> If we did add more RAM, would it be the effective_cache_size setting
>> that we would alter? Is there a way to force PG to load a particular
>> table into RAM? If so, is it actually a good idea?
>
> Have you had a look at EXPLAIN (ANALYZE, BUFFERS) for the query?
>
> Pay special attention to "Buffers: shared read=NNN" and "Buffers: shared
> hit=NNN". If you're not reading any buffers between runs, then the pages
> are in the PostgreSQL shared buffers; by the looks of your config you
> have 10GB of these. On the other hand, if you're getting buffer reads,
> then they're either coming from disk or from the OS cache. PostgreSQL
> won't really know the difference.
>
> If you're not getting any buffer reads and it's still slow, then the
> problem is not I/O.
>
> Just for fun... What happens if you stick the 50 UUIDs in some table,
> analyze it, then perform a join between the 2 tables, using IN() or
> EXISTS()? Is that any faster?
>
> Also, how well does it perform with: set enable_bitmapscan = off; ?
>
> Regards
>
> David Rowley

--
Kind regards,
Roland

Roland Dunn
--------------------------
m: +44 (0)7967 646 789
e: roland.dunn@xxxxxxxxx
w: http://www.cloudshapes.co.uk/
https://twitter.com/roland_dunn
http://uk.linkedin.com/in/rolanddunn
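(For what it's worth, a minimal sketch of the temp-table variant suggested in the quote above; the table name "wanted" is made up for illustration and hasn't been tested against this schema:)

    CREATE TEMP TABLE wanted (uuid uuid PRIMARY KEY);
    INSERT INTO wanted (uuid) VALUES
        ('f0d5e665-4f21-4337-a54b-cf0b4757db65');  -- ... and the other ~3999 uuids
    ANALYZE wanted;  -- give the planner row estimates for the temp table

    SELECT a.id, a.uuid, a.title, a.text
    FROM article_data a
    JOIN wanted w ON w.uuid = a.uuid;

(Running EXPLAIN (ANALYZE, BUFFERS) on the join would show whether the planner picks a different strategy than the = ANY() index scan.)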