Re: Efficient Partitioning Strategies for PostgreSQL Table with KSUID and High Volume

Lorusso Domenico <domenico.l76@xxxxxxxxx> · Wed, 6 Sep 2023 00:37:36 +0200

Hello,
Increase of access cost is logarithmic when we access by index, so partition often isn't a solution to improve performance, but a solution to solve human difficulties to manage huge amounts of data, to develop expensive parallel jobs and, in some cases, to improve performance for sequential reads.

If I understand correctly your need you have 2 topics:
- read by index --> no issue
- read by user_id --> could be fixed by an index, but, because the record are distributed, the access plan use multiple access by index

Some numbers.
each week table increases of 5 millions (5 * 10 ^6) record
after 10 year (520 weeks) table should have 2.6 billion of record
A btree could create a 64 based index that means you need of...5 access to retrieve the right index address plus 1 access to retrieve the row.

This doesn't seem to be a great problem.

But if you have just 100 users the plan calculation foresees to perform 2,6*10^9 / 10^2= 2.6*10^7 access by index and each access (that actually cost 6 access for each).

But probably you don't need to read all the records inserted or modified by a user_id.

An approach could be to create generational partitions.
Main partition stores just the most recent records (with some assumption you have to define), the oldest record could be moved to other historical partition(s).

This could be realize with a technical field where store value that lead the partition identification:
field_for_partition=current --> main partition
field_for_partition=h01 --> historical partition 1
field_for_partition=h02 --> historical partition 2
and so on.

Unfortunately, this approach requires to implement a cron job that each x days change the value of field_for_partition and perform e vacuum.

A  more simple solution is to create index with user_id + creation_date and ALWAYS access by user_id and creation_date > now() - interval 'xx days'

But, if you are users that insert record once each 2 months and users that insert 1000 record per day this approach could not fit your need.
Maybe you need also to store last insertion time for each user and also the number of record inserted.
In other word you could to find a way to define the right amount of days for each users

Il giorno mar 5 set 2023 alle ore 10:24 wheels <cheleon15@xxxxxxxxx> ha scritto:
Hi Team,

I'm working with a PostgreSQL table containing terabytes of data, and

it grows by millions of rows weekly. Each row is identified by a

[KSUID][1], and my primary read patterns are:

1. Retrieve a row by its KSUID.

2. List rows by `user_id` in descending order, pagination acceptable.

Currently, the table is unpartitioned and read performance is

sluggish. I'm contemplating partitioning the table by month using the

KSUID column, [leveraging its embedded uint32 timestamp][2], something

like this:

```sql

CREATE TABLE table_y2023m09 PARTITION OF ksuid

FOR VALUES FROM ('[current_month_ts][128 zeros]') TO

('[next_month_ts][128 zeros]')

```

This allows each 'Get row by KSUID' query to be isolated to a single partition.

For listing rows by `user_id`, I'm considering keyset pagination:

```sql

SELECT *

FROM table_name

WHERE user_id = ?

  AND ksuid > last_seen_ksuid

ORDER BY ksuid

LIMIT 10;

```

However, this method still would need to search through multiple

partitions depending on `last_seen_ksuid`, but I guess that with an

index by `user_id` might be enough.

### Questions:

1. Is using KSUID as a partitioning key viable, especially given that

the column can be represented as text or bytes?

2. Is there a more efficient way to implement listing by `user_id`

other than keyset pagination?

3. Are there any pitfalls or performance issues I should be aware of

with this approach?

4. Would it be better to just partition based on `created_at` and

extract the timestamp from the ksuid on application layer and add it

explicitly to the query?

Thank you very much,

Best regards.

  [1]: https://github.com/segmentio/ksuid

  [2]: https://github.com/segmentio/ksuid#:~:text=a%2032%2Dbit%20unsigned%20integer%20UTC%20timestamp

-- 
Domenico L.

per stupire mezz'ora basta un libro di storia,
io cercai di imparare la Treccani a memoria... [F.d.A.]