On Sat, Sep 21, 2024 at 9:51 AM Michał Kłeczek <michal@xxxxxxxxxxx> wrote:
Hi,
> On 19 Sep 2024, at 07:30, Lok P <loknath.73@xxxxxxxxx> wrote:
>
[snip]
>
> Method-4
>
> INSERT INTO parent_table VALUES (1, 'a'), (2, 'a');
> INSERT INTO child_table VALUES (1,1, 'a'), (1,2, 'a');
> commit;
I’ve done some batch processing of JSON messages from Kafka in Java.
By far the most performant way was to:
1. Use prepared statements
2. Parse JSON messages in Postgres
3. Process messages in batches
All three can be achieved by using arrays to pass batches:
WITH parsed AS (
  SELECT msg::json FROM unnest(?) AS t(msg)
),
parents AS (
  INSERT INTO parent SELECT … FROM parsed RETURNING ...
)
INSERT INTO child SELECT … FROM parsed…
Note the single parameter that you can bind to a String[].
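
For illustration, a minimal Java/JDBC sketch of that binding, assuming an open Connection "conn" and a List<String> "messages" holding the raw JSON pulled from Kafka (both names are mine, and the column lists stay elided as in the statement above):

import java.sql.Array;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

static void insertBatch(Connection conn, List<String> messages) throws SQLException {
    // Same shape as the CTE above; column lists are left as placeholders.
    String sql =
        "WITH parsed AS ("
      + "  SELECT msg::json FROM unnest(?) AS t(msg)"
      + "), parents AS ("
      + "  INSERT INTO parent SELECT /* columns */ FROM parsed RETURNING *"
      + ") "
      + "INSERT INTO child SELECT /* columns */ FROM parsed";

    // The whole batch travels in a single text[] parameter.
    Array batch = conn.createArrayOf("text", messages.toArray(new String[0]));
    try (PreparedStatement ps = conn.prepareStatement(sql)) {
        ps.setArray(1, batch);
        ps.executeUpdate();
    } finally {
        batch.free();
    }
}
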
Hope that helps.
Got your point.
But I am wondering why we don't see any difference in performance between method-2 and method-3 above. Does that mean I am testing this the wrong way, or is this the expected behaviour, i.e. there is no point in converting the row-by-row inserts into a bulk insert, and just changing the commit frequency in the row-by-row approach will do the same job?