On Mon, 2021-12-06 at 10:18 -0700, Michael Lewis wrote:
> What is your default_statistics_target and how accurate is that
> estimate of 5668 rows? What is random_page_cost set to by the way?

default_statistics_target = 1000
random_page_cost = 2.0 (it's on AWS on a 9000 iops gp2 volume)

PostgreSQL 13.5, by the way.

The estimate was reasonably accurate; there were 5069 actual rows matching.

> More importantly, what is the better plan that you'd like the planner
> to use with your existing indexes?

Well, it takes a few ms to grab all 5000 rows by shipment_import_id and
then sort/limit them. Instead it takes 30 seconds to do what it is doing
now, and that's only when the table is mostly cached already; it's more
like 4-5 minutes otherwise.

# explain analyze SELECT "shipment_import_records".* FROM
"shipment_import_records" WHERE
"shipment_import_records"."shipment_import_id" = 5090609 ORDER BY
"shipment_import_records"."id" ASC LIMIT 1;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Limit  (cost=0.44..873.08 rows=1 width=243) (actual time=31689.725..31689.726 rows=1 loops=1)
   ->  Index Scan using shipment_import_records_pkey on shipment_import_records  (cost=0.44..5122405.71 rows=5870 width=243) (actual time=31689.723..31689.724 rows=1 loops=1)
         Filter: (shipment_import_id = 5090609)
         Rows Removed by Filter: 28710802
 Planning Time: 0.994 ms
 Execution Time: 31689.744 ms
(6 rows)

And with a kludge to force the better index:

# explain analyze SELECT * FROM (SELECT "shipment_import_records".* FROM
"shipment_import_records" WHERE
"shipment_import_records"."shipment_import_id" = 5090609 OFFSET 0) AS x
ORDER BY "id" ASC LIMIT 1;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Limit  (cost=10655.34..10655.34 rows=1 width=243) (actual time=4.868..4.869 rows=1 loops=1)
   ->  Sort  (cost=10655.34..10670.02 rows=5870 width=243) (actual time=4.867..4.868 rows=1 loops=1)
         Sort Key: shipment_import_records.id
         Sort Method: top-N heapsort  Memory: 27kB
         ->  Index Scan using index_shipment_import_records_on_shipment_import_id on shipment_import_records  (cost=0.44..10567.29 rows=5870 width=243) (actual time=0.037..3.560 rows=5069 loops=1)
               Index Cond: (shipment_import_id = 5090609)
 Planning Time: 0.135 ms
 Execution Time: 4.885 ms
(8 rows)

> Certainly a composite index would be very helpful here. Using explain
> analyze and sharing the output would give more info to go on.

Yeah, I am going to just do the composite index for now, but was hoping
for a more generic option. Thanks for looking at it.
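For the record, the composite index I have in mind is something along these
lines (the index name is made up; the point is shipment_import_id first for
the equality match, then id, so ORDER BY id LIMIT 1 can read the first
matching index entry without a sort):

```sql
-- Hypothetical name. Equality column first, ORDER BY column second,
-- so the planner can satisfy WHERE + ORDER BY + LIMIT from one scan.
CREATE INDEX CONCURRENTLY shipment_import_records_import_id_id_idx
    ON shipment_import_records (shipment_import_id, id);
```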