
Partitioning a table by integer value (preferably in place)

Hi all,


Linux Fedora 34
1TB Samsung SSD
4 CPUs, 2 cores

PostgreSQL 12.7 (can upgrade if a better solution is to be found in 13
or even 14 beta2 - I'm currently testing a proposed solution, so by the
time it's fully implemented, 14 should be GA and hey, I might even
find a bug or two to help the project!).

I have a 400GB joining table of two columns (one SMALLINT and the
other INTEGER - each the Primary Key of another table) with 1,000
distinct values on one side and 10M on the other, so 10,000M (or
10Bn) records all told.

What I would like to do is to partition by the SMALLINT (1 - 1000)
value - which would give 1,000 tables of 400MB each.

I wish to avoid having to do this manually 1,000 times - is there a
way of telling PostgreSQL to partition by value without specifying the
SMALLINT value each time?
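For concreteness, something like the following is what I'm hoping
there's a shortcut for - one LIST partition per value (which, if I
understand it, amounts to the same thing as a RANGE of 1). Table and
column names here are made up, and the DO-loop is just my guess at how
people automate this:

CREATE TABLE join_tab (
    small_id SMALLINT NOT NULL,
    big_id   INTEGER  NOT NULL
) PARTITION BY LIST (small_id);

-- generate the 1,000 per-value partitions in a loop instead of typing them
DO $$
BEGIN
    FOR i IN 1..1000 LOOP
        EXECUTE format(
            'CREATE TABLE join_tab_p%s PARTITION OF join_tab FOR VALUES IN (%s)',
            i, i);
    END LOOP;
END;
$$;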

I looked here:

https://www.enterprisedb.com/postgres-tutorials/how-use-table-partitioning-scale-postgresql

which covers LIST, RANGE and HASH partitioning.

I think a RANGE of 1 would be what I want here?

So, questions:

Is 1,000 partitions reasonable? This:

https://elephas.io/is-there-a-limit-on-number-of-partitions-handled-by-postgres/

appears to suggest that it shouldn't be a problem?


Could I go with a RANGE of, say, 10 values per partition? If I have to
explicitly code, I'd prefer this for my test - at least it would save
on the typing! :-)

This would produce 100 tables of ~4GB each. Would I see much
performance degradation with a 4GB table on an SSD?
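If I do end up coding the ranges explicitly, I'd guess a loop like
this would do it (again untested, same hypothetical names as above):

-- 100 partitions, each covering 10 consecutive small_id values;
-- RANGE bounds are inclusive FROM, exclusive TO
DO $$
BEGIN
    FOR i IN 0..99 LOOP
        EXECUTE format(
            'CREATE TABLE join_tab_p%s PARTITION OF join_tab FOR VALUES FROM (%s) TO (%s)',
            i, i * 10 + 1, i * 10 + 11);
    END LOOP;
END;
$$;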

Finally, the icing on the cake would be if this could be done in place
- my SSD is 1TB and the output from df -h is:

test=# \! df -h
Filesystem                               Size  Used Avail Use% Mounted on
devtmpfs                                  16G     0   16G   0% /dev
tmpfs                                     16G  212K   16G   1% /dev/shm
tmpfs                                    6.3G  1.8M  6.3G   1% /run
/dev/mapper/fedora_localhost--live-root   69G   11G   55G  17% /
tmpfs                                     16G  284K   16G   1% /tmp
/dev/sda5                                976M  192M  718M  22% /boot
/dev/mapper/fedora_localhost--live-home  1.3T  898G  270G  77% /home
/dev/sda2                                 96M   52M   45M  54% /boot/efi
tmpfs                                    3.2G   96K  3.2G   1% /run/user/1000
test=#


So, I only have 270GB left on disk - and it took > 12 hrs to populate
the table and build its indexes and Foreign Key constraints, so I'd
like to be able to do this without having to go through all that again.
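(For what it's worth, this is roughly how I've been measuring the
footprint - 'join_tab' again being a stand-in for the real table name:)

SELECT pg_size_pretty(pg_table_size('join_tab'))          AS heap,
       pg_size_pretty(pg_indexes_size('join_tab'))        AS indexes,
       pg_size_pretty(pg_total_relation_size('join_tab')) AS total;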

This:

https://www.2ndquadrant.com/en/blog/partitioning-a-large-table-without-a-long-running-lock/

appears to suggest that it can be done online. There will be no other
activity on the table while any partitioning &c. is ongoing. However,
the article makes no mention of space considerations.
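If I've read the article correctly, the core of its technique -
sketched here with my hypothetical names - is to attach the existing
table as one big partition, using a NOT VALID constraint plus a later
VALIDATE so that no step holds a long exclusive lock. What I can't
tell is whether subsequently splitting that one partition into 100 or
1,000 real ones can avoid needing space for a second copy of the rows
being moved:

BEGIN;
ALTER TABLE join_tab RENAME TO join_tab_legacy;
CREATE TABLE join_tab (LIKE join_tab_legacy INCLUDING ALL)
    PARTITION BY RANGE (small_id);
COMMIT;

-- NOT VALID makes the ADD CONSTRAINT itself instant; VALIDATE then
-- scans the table holding only a SHARE UPDATE EXCLUSIVE lock
ALTER TABLE join_tab_legacy
    ADD CONSTRAINT join_tab_legacy_bounds
    CHECK (small_id >= 1 AND small_id < 1001) NOT VALID;
ALTER TABLE join_tab_legacy VALIDATE CONSTRAINT join_tab_legacy_bounds;

-- with a matching validated CHECK in place, the ATTACH skips its full scan
ALTER TABLE join_tab ATTACH PARTITION join_tab_legacy
    FOR VALUES FROM (1) TO (1001);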

This is my first time considering partitioning, so I'd be grateful for
any advice, pointers, references, URLs &c.... and please let me know
if I've left out any important information.


TIA and rgs,


Pól...


Some (relevant?) settings:

max_worker_processes = 4
max_parallel_workers_per_gather = 2
max_parallel_workers = 4
max_parallel_maintenance_workers = 2

as suggested by pgtune. Is pgtune a good bet for configuration suggestions?





