Re: reindex option for tuning load large data

Vitalii Tymchyshyn <vit@xxxxxx> · Fri, 17 Jun 2022 20:48:35 -0700

I believe you should be able to use reindexdb with parallel jobs:https://www.postgresql.org/docs/13/app-reindexdb.html
It will still create multiple connections, but you won't need to run multiple commands.

чт, 16 черв. 2022 р. о 22:34 James Pang (chaolpan) <chaolpan@xxxxxxxxx> пише:

Hi ,
  We plan to migrate large database from Oracle to Postgres(version 13.6, OS Redhat8 Enterprise), we are checking options to make data load in Postgres fast. Data volume is about several TB,  thousands of indexes,  many large table with
 partitions.  We want to make data load running fast and avoid miss any indexes when reindexing. There are 2 options about reindex. Could you give some suggestions about the 2 options, which option is better.

Create tables and indexes( empty database) ,   update pg_index set indisready=false and inisvalid=false,  then  load data use COPY from csv ,  then reindex table …
Reindex on Postgres 13.6 not support parallel ,right?  So we need to start multiple session to reindex  multiple tables/indexes in parallel.

2).  Use pg_dump to dump meta data only , then copy “CREATE INDEX … sql “

        Drop indexes before data load 
       After data load, increase 
max_parallel_maintenance_workers, maintenance_work_mem
       Run CREATE INDEX … sql  to leverage parallel create index feature.

Thanks,

James