Re: Postgresql 14 partitioning advice

Nathan Ward <lists+postgresql@xxxxxxxxx> · Sat, 30 Jul 2022 19:48:35 +1200

On 30/07/2022, at 9:44 AM, Rick Otten <rottenwindfish@xxxxxxxxx> wrote:

On Wed, Jul 27, 2022 at 8:55 AM Rick Otten <rottenwindfish@xxxxxxxxx> wrote:
I'm spinning up a new Postgresql 14 database where I'll have to store a couple years worth of time series data at the rate of single-digit millions of rows per day.  Since this has to run in AWS Aurora, I can't use TimescaleDB.

I thought I'd report back some of my findings from testing this week:

I took the same real world two week data set and created identical tables except that I partitioned one by month, one by week, one by day, and one by hour.  I partitioned a little bit into the past and a little bit into the future. I did this on a PG 14.2 RDS instance.  This gave me tables with:
3 partitions, 13 partitions, 90 partitions and 2136 partitions, but otherwise the same data.

Insert times were equivalent.

Then I crafted a query that was one of the main use cases for the data and ran it a bunch of times.

I noticed a significant degradation in performance as the number of partitions increased.  The jump from 13 to 90, in particular, was very steep.  It didn't matter what I set work_mem or other tunables to.  I dug deeper...

Surprising to me was if you partition on a `timestamp with timezone` column, call it "ts":
If your where clause looks like
```
where ts at time zone 'UTC' > '2022-07-01 00:00'::timestamp
```
you will NOT get partition pruning and it will sequence scan.
However if you change it to (with an appropriately adjusted right hand side if necessary):
```
where ts > '2022-07-01 00:00'::timestamp
```
It will do partition pruning and will index scan.

When I made that change the query performance was equivalent regardless of which number of partitions I had in play.
I did a quick test and this happens on a regular timestamp index on a regular table as well.

The other problem I ran into, which I'm still building a test case for and I fear might be a bug if I can easily reproduce it,
is if I did the original select in a CTE, and then did a sort outside of the CTE, even though the CTE found 0 rows, the database
still spent a _ton_ of time sorting those 0 rows:
```
   ->  Sort  (cost=70.03..72.53 rows=1000 width=112) (actual time=84848.452..84848.453 rows=0 loops=1)
```
Once I can reproduce this on test data I'll be able to pin down more closely what is happening and tell if I'm just reading
the explain plan wrong or if something is broken.  It was getting mixed up with the lack of pruning/index usage problem.

I'll report back again next week.  Anyway it is looking to me like it doesn't really matter (within reason) from a performance
perspective how many partitions we use for our data set and query patterns.  We should be able to pick the most convenient
from an archiving and data management perspective instead.

Hi Rick,

I am working with data with a similar structure. The most recent data is accessed significantly more often than older data, so my next step will be to have very recent data in hourly tables, then daily, and probably monthly tables for what is effectively archived data. My data is a little different in that it’s stored by the start of an “interval” which means I can do a = comparison for the start of the hour (or day once data is aggregated).

I found a similar interesting partitioning performance issue recently when partitioning by timestamp, where if your where clause for a timestamp includes math the planner runs very slowly. In my case saying something like:
```
select * from table where ts = some_time_variable - interval ‘1 hour’;
```
is *much* slower than something like:
```
offset_time_variable = some_time_variable - '1 hour’ interval;
select * from table where ts = offset_time_variable;
```

Everything is `timestamp with time zone`.
I believe that it’s calculating that offset for each partition - of which there are a couple hundred - and it was causing the planner to run very slowly. Pruning works correctly once the planner has run.

This is on postgres 13 - I have yet to try 14 and see if this issue persists in 14.

Changing my main query to the above structure significantly improved performance - I was previously having lots of performance issues when aggregation tasks ran and dropped partitions etc.

I posted about this here: https://www.postgresql.org/message-id/84101021-8B67-45AD-83F2-A3C8F0AA4BEE%40daork.net

--
Nathan Ward