Thanks everyone. I decided to have the SQS process changed to create CSV files in an S3 bucket instead. We then have a process that uses the COPY command to load the data. The process is loading 500,000 records in around 4 minutes, which should be good enough for now. I'm going to look at pg_citus to get up to speed on Postgres partitioning for a future need.
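For anyone finding this in the archives, a minimal sketch of that kind of COPY-based load, assuming a hypothetical orders table and a CSV file already fetched from S3 (\copy runs client-side in psql, which is the usual route into an RDS instance where you can't touch the server filesystem):

    -- hypothetical target table
    CREATE TABLE IF NOT EXISTS orders (
        order_id   bigint PRIMARY KEY,
        customer   text,
        amount     numeric,
        created_at timestamptz
    );

    -- bulk-load one CSV produced by the SQS process; the file only has
    -- to be readable from the machine running psql, not the RDS host
    \copy orders (order_id, customer, amount, created_at) from '/tmp/orders_batch_0001.csv' with (format csv, header true)

Each \copy of a whole file is one transaction, so a batch either loads completely or not at all.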
From: Scott Marlowe <scott.marlowe@xxxxxxxxx>
Sent: Tuesday, April 18, 2017 3:41 PM
To: ROBERT PRICE
Cc: pgsql-performance@xxxxxxxxxxxxxx
Subject: Re: Insert Concurrency

On Mon, Apr 17, 2017 at 8:55 PM, ROBERT PRICE <rprice504@xxxxxxxxxxx> wrote:
> I come from an Oracle background and am porting an application to postgres.
> App has a table that will contain 100 million rows and has to be loaded by a
> process that reads messages off a SQS queue and makes web service calls to
> insert records one row at a time in a postgres RDS instance. I know slow by
> slow is not the ideal approach but I was wondering if postgres had
> partitioning or other ways to tune concurrent insert statements. Process
> will run 50 - 100 concurrent threads.

It's not uncommon to look for an Oracle solution while working with another RDBMS. Often what works in one engine doesn't work the same, or as well, in another.

Is it possible for you to roll up some of these inserts into a single transaction in some way? Even inserting ten rows at a time instead of one at a time can make a big difference in your insert rate. Being able to roll up 100 or more together helps even more.

Another possibility is to insert them into a smaller staging table, then have a process come along every so often, copy all of its rows into the main table, and then delete them or truncate the staging table (for truncate you'll need to lock the table so you don't lose rows).

--
To understand recursion, one must first understand recursion.
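A minimal sketch of both suggestions, assuming the same hypothetical orders table as above plus an orders_staging table with an identical column layout:

    -- 1. roll many rows into a single multi-row INSERT instead of
    --    one statement (and one round trip) per row
    INSERT INTO orders (order_id, customer, amount, created_at)
    VALUES (1001, 'acme', 19.99, now()),
           (1002, 'zeta', 42.50, now()),
           (1003, 'kord',  7.25, now());

    -- 2. staging-table variant: the 50-100 threads insert into
    --    orders_staging, and a periodic job flushes it atomically
    BEGIN;
    LOCK TABLE orders_staging IN ACCESS EXCLUSIVE MODE;  -- block writers so TRUNCATE can't drop unseen rows
    INSERT INTO orders SELECT * FROM orders_staging;
    TRUNCATE orders_staging;
    COMMIT;

The lock in the second block is what Scott is referring to: without it, a row inserted between the INSERT ... SELECT and the TRUNCATE would be silently discarded.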