On 10/27/16 3:46 PM, Craig James wrote:
Limit (cost=3264.63..7193.14 rows=1 width=4)
  -> Nested Loop (cost=3264.63..428658697.57 rows=109114 width=4)
       Join Filter: (rv.version_id = sample.version_id)
       -> Index Only Scan Backward using version_pkey on version rv (cost=0.42..6812.85 rows=261895 width=4)
       -> Materialize (cost=3264.21..5992.06 rows=109114 width=4)
            -> HashAggregate (cost=3264.21..4355.35 rows=109114 width=4)
                 -> Seq Scan on sample (cost=0.00..2991.37 rows=109137 width=4)

Why would this trivial query run forever at 100% CPU?
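
My bet is that there are a lot of rows in version with a higher version_id than anything in sample. The backward index scan has to walk past every one of them before it finds a match, and for each of those rows the nested loop rescans the whole tuplestore underneath the Materialize node without the join filter ever passing.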
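
To put a rough number on that, here is a toy Python model of the nested loop. It is only a counting sketch under made-up assumptions, not how PostgreSQL's executor is implemented: the function names, the table sizes, and the layout of the ids are all hypothetical. It walks the outer side in descending order, scans the full inner list for each outer row, and stops at the first match, as the LIMIT would.

```python
# Toy model of the plan's Nested Loop + Materialize + LIMIT 1.
# NOT PostgreSQL code: names and sizes are hypothetical, for illustration only.

def nested_loop_limit_1(version_ids_desc, materialized_sample_ids):
    """Return (first matching id or None, number of join-filter checks).

    The outer iterable plays the role of the backward index scan on
    version; the inner list plays the role of the materialized distinct
    version_ids from sample, which is rescanned for every outer row.
    """
    comparisons = 0
    for outer_id in version_ids_desc:
        for inner_id in materialized_sample_ids:
            comparisons += 1
            if outer_id == inner_id:
                # LIMIT 1: the first match ends the whole query.
                return outer_id, comparisons
    return None, comparisons


def simulate(n_versions, n_sample_ids, n_newer_versions):
    """Build a synthetic case and run the toy nested loop on it.

    version holds ids 1..n_versions. sample holds n_sample_ids distinct
    ids, the largest of which sits n_newer_versions below the top of
    version (so that many outer rows can never match).
    """
    max_sample_id = n_versions - n_newer_versions
    version_ids_desc = range(n_versions, 0, -1)
    sample_ids = list(range(max_sample_id - n_sample_ids + 1, max_sample_id + 1))
    return nested_loop_limit_1(version_ids_desc, sample_ids)
```

In this model, `simulate(2000, 500, 0)` finds its match after a single pass over the inner list, while `simulate(2000, 500, 1000)` first burns 1000 full passes on rows that cannot match. Scaled to the planner's estimates above (261895 rows in version, 109114 materialized rows), the worst case is on the order of 261895 x 109114, or about 2.9e10 join-filter checks, which would be consistent with a query pinned at 100% CPU.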
If you can remove duplicates from sample and get rid of the DISTINCT, this will probably get a better plan. If you can't do that then you could try changing the JOIN to an IN:

SELECT ... FROM version WHERE version_id IN (SELECT version_id FROM sample)

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
http://BlueTreble.com

--
Sent via pgsql-performance mailing list