On Sat, Jun 8, 2024 at 12:53 PM Lok P <loknath.73@xxxxxxxxx> wrote:
Hello,We have a few tables having size ~5TB and are partitioned on a timestamp column. They have ~90 partitions in them and are storing 90 days of data. We want to create a couple of indexes on those tables. They are getting the incoming transactions(mainly inserts) 24/7 , which are mostly happening on the current day/live partition. Its RDS postgres version 15.4. So in this situationShould we go with below i.e one time create index command on the table..
CREATE INDEX CONCURRENTLY idx1 ON tab(column_name);
Or
create index on individual partitions from different sessions, say for example create indexes on 30 partitions each from three different sessions so as to finish all the 90 partitions faster?
CREATE INDEX CONCURRENTLY idx1 ON tab_part1(column_name);
CREATE INDEX CONCURRENTLY idx1 ON tab_part2(column_name);..........
Basically I have three questions:
1)If we can do this index creation activity online without impacting the incoming transactions or do we have to take down time for this activity?
2)If we can't do it online then , what is the fastest method to do this index creation activity ?3)Should we change the DB parameters in a certain way to make the process faster? We have currently set below parameters
max_parallel_workers-16
max_parallel_maintenance_workers-2
maintenance_work_mem- 4GB
You can first create the index on the table using the "On ONLY"keyword, something as below.
CREATE INDEX idx ON ONLY tab(col1);
Then create indexes on each partition in "concurrently" from multiple sessions in chunks.
CREATE INDEX CONCURRENTLY idx_1 ON tab_part1(col1);
CREATE INDEX CONCURRENTLY idx_2 ON tab_part2(col1);
CREATE INDEX CONCURRENTLY idx_2 ON tab_part2(col1);
After this step finishes the table level index which was created in the first step will be in valid state automatically.