Re: Postgresql 13 partitioning advice

David Rowley <dgrowleyml@xxxxxxxxx> · Tue, 2 Aug 2022 22:16:21 +1200

On Tue, 2 Aug 2022 at 19:48, Ameya Bidwalkar
<bidwalkar.ameya10@xxxxxxxxx> wrote:
> We have a Postgresql 13 database where we have a single table with several millions of rows  . We plan to partition it based on timestamp .
> We have  been seeking  advice for best practices for building this.
> This table will get lots of updates for the same rows during a short period of time.During this time rows would be in a single partition .
> After this short time these rows would move to  another partition  .Where no more updates take place on these rows.But might  have some SELECT queries  running.
> We plan to l have  partitions based on months and then roll them up in a year  and then archive these older partitions
> One consultant we talked with told us this  row movement between the partitions will have
> huge complications .But  this was an issue during the Postgres 10 version .

Define "huge complications"?

The capabilities of partitioned tables have changed quite a bit since
the feature was added. It's very easy for knowledge to get out-dated
in this area.  I did quite a bit of work on them and I struggle to
remember off the top of my head which versions saw which improvements.
PG12 saw lots. See [1], search for "partition".

One possible complication is what is mentioned in [2] about
"serialization failure error".  UPDATEs that cause a tuple to move to
another partition can cause a serialization failure at transaction
isolation level, not just serializable transactions. If it's not
already, you might want to have your application retry transactions on
SQL:40001 errors.

Apart from that, assuming there's comparatively a small number of rows
in the partition being updated compared to the partition with the
static rows, then it sounds fairly efficient. As you describe it,  the
larger static partition is effectively INSERT only and auto-vacuum
will need to touch it only for tuple freezing work.  The smaller of
the two tables will receive more churn but will be faster to vacuum.
PG13 got a new feature that makes sure auto-vacuum also does the
rounds on INSERT-only tables too, so the static partition is not going
to be neglected until anti-wrap-around-autovacuums trigger, like they
would have in PG12 and earlier.

Another thing to consider is that an UPDATE of a non-partitioned table
has a chance at being a HOT update. That's possible if the tuple can
fit on the same page and does not update any of the indexed columns. A
HOT update means no indexes need to be updated so these perform faster
and require less space in WAL than a non-HOT update.  An UPDATE that
moves a tuple to another partition can never be a HOT update.  That's
something you might want to consider. If you're updating indexed
columns already then it's not a factor to consider.    There's also
overhead to postgres having to find the partition for the newly
updated version of the tuple.  That's not hugely expensive, but it's
generally measurable. RANGE partitioned tables with a large number of
partitions will have the most overhead for this. HASH partitioned
tables, the least.

The best thing you can likely do is set up a scenario with pgbench and
compare the performance. pgbench is a pretty flexible tool that will
allow you to run certain queries X% of the time and even throttle the
workload at what you expect your production server to experience. You
could then run it overnight on a test server, or even for weeks and
see how auto-vacuum keeps up when compared to the non-partitioned
case. You can also check how much extra WAL is generated vs the
non-partitioned case.

> So we are seeking advice on the performance perspective and things we should take care of along with manual vacuums on a regular schedule and indexing.
> Are there any tunables I should experiment with in particular ?

Perhaps if you want to keep a small high-chun table in check you might
want to consider if autovacuum_naptime is set low enough. You may not
care if the space being consumed in the standard 1min
autovacuum_naptime is small enough not to be of concern.

David

[1] https://www.postgresql.org/docs/release/12.0/
[2] https://www.postgresql.org/docs/13/sql-update.html