Re: Most effective and fast way to load few Tbyte of data from flat files into postgresql

Shaozhong SHI <shishaozhong@xxxxxxxxx> · Thu, 27 Aug 2020 20:13:44 +0100

On Tue, 25 Aug 2020 at 12:24, Peter J. Holzer <hjp-pgsql@xxxxxx> wrote:
On 2020-08-24 21:17:36 +0000, Dirk Krautschick wrote:

> what would be the fastest or most effective way to load a few (5-10) TB
> of data from flat files into a postgresql database, including some 1TB
> tables and blobs?
>
> There is the copy command but there is no way for native parallelism,
> right? I have found pg_bulkload but haven't tested it yet. As far as I
> can see EDB has its EDB*Loader as a commercial option.

A single COPY isn't parallel, but you can run several of them in
parallel (that's what pg_restore -j N does). So the total time may be
dominated by your largest table (or I/O bandwidth).
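
For illustration, one way to run several COPYs at once is to start one
psql process per file from a small driver script, so that each load gets
its own database session. The sketch below is not from this thread and
rests on assumptions: psql is on the PATH, each file is CSV and goes into
its own existing table, and file paths contain no single quotes. The
function names (build_copy_command, copy_one, load_in_parallel) and any
database, table, or file names passed to them are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor, as_completed


def build_copy_command(dbname, table, path):
    """Build a psql invocation that loads one CSV file via client-side \\copy.

    Assumption: `table` is a trusted identifier and `path` has no single quotes.
    """
    copy_sql = f"\\copy {table} FROM '{path}' WITH (FORMAT csv)"
    # -X skips ~/.psqlrc; ON_ERROR_STOP makes psql exit non-zero on failure.
    return ["psql", "-X", "-v", "ON_ERROR_STOP=1", "-d", dbname, "-c", copy_sql]


def copy_one(job):
    """Run one load in its own psql process, i.e. its own database session."""
    dbname, table, path = job
    subprocess.run(build_copy_command(dbname, table, path), check=True)
    return path


def load_in_parallel(jobs, workers=4, loader=copy_one):
    """Run up to `workers` loads at once; return results in completion order."""
    done = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(loader, job) for job in jobs]
        for future in as_completed(futures):
            done.append(future.result())  # re-raises if that load failed
    return done
```

Called with a list of (dbname, table, path) tuples and workers=4, this
keeps at most four COPY sessions running at a time. Threads are enough
here because the work happens in the psql child processes. As noted
above, the largest table or the available I/O bandwidth will still bound
the total time.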

        hp

-- 
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@xxxxxx         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

This topic is interesting. Are there any examples of running COPY in parallel?

Regards,

SS