Re: Postgresql 14 partitioning advice

Rick Otten <rottenwindfish@xxxxxxxxx> · Tue, 2 Aug 2022 08:54:46 -0400

On Mon, Aug 1, 2022 at 10:16 AM Rick Otten <rottenwindfish@xxxxxxxxx> wrote:

The other problem I ran into, which I'm still building a test case for and I fear might be a bug if I can easily reproduce it,
is if I did the original select in a CTE, and then did a sort outside of the CTE, even though the CTE found 0 rows, the database
still spent a _ton_ of time sorting those 0 rows:
```
   ->  Sort  (cost=70.03..72.53 rows=1000 width=112) (actual time=84848.452..84848.453 rows=0 loops=1)
```
Once I can reproduce this on test data I'll be able to pin down more closely what is happening and tell if I'm just reading
the explain plan wrong or if something is broken.  It was getting mixed up with the lack of pruning/index usage problem.

I'll report back again next week.  Anyway it is looking to me like it doesn't really matter (within reason) from a performance
perspective how many partitions we use for our data set and query patterns.  We should be able to pick the most convenient
from an archiving and data management perspective instead.

This behavior is definitely consistent.  0 rows end up slower than when I find some rows in my CTE:
```
   ->  Sort  (cost=109.44..113.19 rows=1500 width=112) (actual time=87110.841..87110.842 rows=0 loops=1)
   ->  Sort  (cost=109.44..113.19 rows=1500 width=112) (actual time=25367.867..25367.930 rows=840 loops=1)
```
The only thing I changed in the query was the date range.  It is actually the CTE scan step inside the Sort block that is slower when no rows are returned than when rows are returned.  It also only happens when all the partitions are sequence scanned instead of being partition pruned.

I'm still writing up a test case that can demo this without using proprietary data.

After a bunch of experiments I can explain this now.  :-)

I had a `limit` clause in my test CTE.  When sequence scanning a bunch of partitions, if the limit is reached, the subsequent partitions are marked with `never executed` and not scanned.  On the other hand, when no rows are found, all of the partitions are scanned.

Therefore, with many millions of rows in the partitions, and being forced to sequence scan because I put the `at time zone` clause in the `where`, the case when rows are found is always noticeably faster than the case when rows aren't found as long as at least one partition hasn't been scanned yet when the limit is hit.

I'm now satisfied this is a good thing, and will move on to other problems.  Thanks for hearing me out.  I was scratching my head for a while over that one.