> On Mar 23, 2020, at 5:59 AM, Andrei Zhidenkov <andrei.zhidenkov@xxxxxxx> wrote:
>
> Try to write a stored procedure (probably PL/Python) that accepts an array of JSON objects, so that the data can be loaded in chunks (100-1000 files at a time), which should be faster.
>
>> On 23. Mar 2020, at 12:49, Ertan Küçükoğlu <ertan.kucukoglu@xxxxxxxxxxx> wrote:
>>
>>> On 23 Mar 2020, at 13:20, pinker <pinker@xxxxxxx> wrote:
>>>
>>> Hi, do you maybe have an idea how to make the loading process faster?
>>>
>>> I have 500 million JSON files (one JSON document per file) that I need to load into the db.
>>> My test set is "only" 1 million files.
>>>
>>> What I came up with so far is:
>>>
>>> time for i in datafiles/*; do
>>>   psql -c "\copy json_parts(json_data) FROM $i"&
>>> done
>>>
>>> which is the fastest so far, but it's not what I expected. Loading 1 million files takes me ~3h, so loading 500 times more is just unacceptable.
>>>
>>> Some facts:
>>> * the target db is in the cloud, so there is no option for tricks like turning fsync off
>>> * the version is PostgreSQL 11
>>> * I can spin up a huge Postgres instance if necessary in terms of CPU/RAM
>>> * I already tried hash partitioning (writing to 10 different tables instead of 1)
>>>
>>> Any ideas?
>>
>> Hello,
>>
>> I may not be knowledgeable enough to answer your question.
>>
>> However, if possible, you could use a local physical computer to do all the loading and afterwards do a backup/restore onto the cloud system.
>>
>> A compressed backup will be far less internet traffic compared to direct data inserts.
>>
>> Moreover, you can apply the additional tricks you mentioned.
>>
>> Thanks & regards,
>> Ertan
>>
>> Drop any and all indices
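
To make the stored-procedure suggestion above concrete, here is a minimal sketch that takes a whole batch of JSON objects as a single jsonb array and unnests it into the target table with one INSERT. It uses plain SQL rather than PL/Python; the function name load_json_batch is made up for the example, and json_parts.json_data is assumed to be of type jsonb (add a cast if the column is json or text):

-- Sketch only: load one batch of JSON objects with a single INSERT.
-- Assumptions: json_parts(json_data jsonb) as in the thread; the function
-- name load_json_batch is hypothetical.
CREATE OR REPLACE FUNCTION load_json_batch(batch jsonb)
RETURNS bigint
LANGUAGE sql
AS $$
    INSERT INTO json_parts (json_data)
    SELECT elem
    FROM jsonb_array_elements(batch) AS elem;

    -- report how many objects were in the batch
    SELECT count(*) FROM jsonb_array_elements(batch);
$$;

-- Example call with a tiny batch; in practice each call would carry the
-- concatenated contents of a few hundred to a thousand files:
SELECT load_json_batch('[{"id": 1}, {"id": 2}]'::jsonb);

With batches of 100-1000 objects per call, one statement replaces hundreds of per-file round trips; the client side only has to merge the files into one JSON array (for example with jq -s) and ship it as a single parameter.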