Since this is an upsert, the staging table simply accepts a copy of pipe-delimited data via a COPY FROM STDIN, and a subsequent SQL script performs the UPDATE/INSERT. The staging table is then truncated (or delete without where) for the next run: Truncate staging, COPY into staging, update then insert destination from staging, repeat on next 5 minute clock cycle.
Since the staging table is fully loaded into the destination table, a full scan was not a concern, but the lack of index usage on the destination. For example, this is normally supported by the PK index, and performs well:
INSERT INTO destination (…)
SELECT (…) FROM staging
WHERE NOT EXISTS (SELECT * FROM source WHERE source.id = destination.id)
The above runs in expected time when the explain plan shows an index scan on the destination primary key index, and a seq scan on the staging table. This will continue for many runs, until something causes the engine to stop using the destination PK index, and begin scanning both the destination and staging. What triggers this is the interesting question.The only system event I can (loosely) correlate with the problem is the start of a nightly pg_dump, but as a read-only process, this would not be changing any index stats, correct?
An ANALYZE on staging is possible after each load cycle, but time does not permit on the destination table. I have been focusing on the destination because it has the index that is not being used. Will stats on the staging table affect index selection on the destination in a case like this?
In the process of attempting to diagnose this, both tables involved have been vacuumed (full), analyzed. I have also moved staging to an SSD volume, and created an equivalent index on staging - which is not used in any plan, nor do I expect to as there is no filter criteria on staging, and the index maintenance on staging would seem to be an unneeded overhead. But in this case, is there any advantage to an index on staging?
For reference, it is possible (not enough test cycles to verify) that left anti-join makes this situation worse, even though the explain plans appear identical:
INSERT INTO destination (…)
SELECT (…) FROM staging
LEFT JOIN destination ON destination.id = staging.id
WHERE destination.id IS NULL
On 29 Oct 2013, at 9:45, Tom Lane wrote:
> Matt <bsg075@xxxxxxxxx> writes:
>> In most cases, EXPLAIN and runtime tell me the index is utilized. However,
>> sometime on back to back runs (5 minute intervals) the runtime explodes and
>> EXPLAIN shows me that the PK index is not used, as both the staging table
>> and destination table a sequentially scanned.
>
> You haven't explained what you do with the "staging table", but maybe you
> need to ANALYZE it after you've loaded fresh data into it. The stats for
> the main table are presumably somewhat stable, but the stats for the
> staging table maybe not. The apparent correlation to consecutive runs
> lends some support to this theory, as that would reduce the time window
> for auto-ANALYZE to fix it for you.
>
> regards, tom lane