On 11/10/15 3:42 PM, Jason Jho wrote:
On one particular day, this query hung for many hours and even while we killed pids for running queries and any locks granted, the query would never return. Also no hints of blocking processes. After some digging through some I/O metrics, we didn't see any memory issues or unusual spikes that would lead us to believe that we're running low on resources.
Did IO stats indicate IO was happening? Did you see a pegged CPU running the query?
There is 1 caveat, however: there was a different schema that contained a day-old copy of data that isn't normally present when the hang started to occur. However, since these are completely different schema namespaces with no crossovers in the queries themselves, I don't see how this is relevant.
If search_path wasn't what you thought it was you could have easily been running against the wrong set of tables.
We thought this might be possibly due to some internal vacuuming, but this is unlikely since there are no real concurrent reads or updates happening. Auto-vacuum is also on with default settings.
There are other reasons why autovacuum could kick in, notably to prevent transaction ID wraparound.
What is the most confusing part in all of this is why a DROP SCHEMA CASCADE and a fresh pg_restore would somehow fix the problem. Even a fresh reboot didn't fix it.
Without more info we're stuck guessing. You might try submitting a ticket with amazon, especially if you can reproduce this.
-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance