On Tue, 21 May 2019 at 12:44, Walter Smith <walter@xxxxxxxxxxxx> wrote:
>
> We had a mysterious (to us) slowdown today that I'm hoping someone can explain just based on PG's principles of operation. It got better by itself so it seems like it was "normal" behavior -- I just don't know what behavior it was exhibiting.
>
> We have a table of user notifications containing about 80 million rows. It gets a lot of inserts continually, and is cleaned up once a day. There are no updates. In all history there have been about 330 million rows created.
>
> Today we deleted about 15 million rows in one transaction from this table. Immediately afterwards, a particular SELECT started running very slowly -- 500 to 3000 ms rather than the usual <1ms.
>
> We did an EXPLAIN ANALYZE on this select and it was still doing an index scan as usual. The *planning time* for the query is what had gotten slow. The query itself was still executing in <1ms.

It would be good to know which version you're running here.

This sounds very much like get_actual_variable_range() is the culprit. If a constant value that the planner is using to determine row estimates falls outside the statistics histogram, and a btree index exists that can be used to look up the actual bound of the data, then that function does the lookup in the index. If you've just deleted a bunch of rows, that index scan may have to traverse a large number of dead tuples before finding the first live tuple.

This code has changed a few times recently, see fccebe421 and 3ca930fc3, which is why your version is of interest.

Apart from that, if you want to confirm that this is the issue and you just want it fixed, just VACUUM the table. You should likely be doing that anyway directly after your bulk delete.

--
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
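
[To make the confirm-and-fix suggestion above concrete, here is a minimal SQL sketch. The table name (notifications), the created_at column, and the literal values are hypothetical stand-ins, not taken from the report, and how pronounced the planning slowdown is will depend on the server version, given the commits mentioned above.]

-- Hypothetical schema: a notifications table with a btree index on created_at.
-- A bulk delete of old rows leaves millions of dead tuples at the low end of
-- the index until VACUUM cleans them up.
DELETE FROM notifications
WHERE created_at < now() - interval '30 days';

-- Planning can now be slow for a predicate whose constant lands in the first
-- histogram bucket: the planner uses the index to look up the actual column
-- minimum and has to step over the dead index entries before finding a live row.
EXPLAIN (ANALYZE)
SELECT count(*) FROM notifications
WHERE created_at < timestamp '2019-01-01';

-- A manual VACUUM removes the dead tuples and their index entries, after which
-- planning time should drop back to normal.
VACUUM notifications;

EXPLAIN (ANALYZE)
SELECT count(*) FROM notifications
WHERE created_at < timestamp '2019-01-01';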