On 2024-09-21 15:06:45 +0530, veem v wrote:
> On Sat, 21 Sept 2024 at 03:47, Peter J. Holzer <hjp-pgsql@xxxxxx> wrote:
> > On 2024-09-20 14:11:38 -0700, Adrian Klaver wrote:
> > > On 9/20/24 1:01 PM, veem v wrote:
> > > > Able to reproduce this deadlock graph as below. Now my question
> > > > is, this is a legitimate scenario in which the same ID can get
> > > > inserted from multiple sessions, and in such cases it's expected
> > > > to skip that row (thus "ON CONFLICT DO NOTHING" is used). But as
> > > > we see, it's breaking the code
> > >
> > > Yeah, as I see it that would not work with concurrent uncommitted
> > > sessions as it would be unresolved whether a conflict actually
> > > exists until at least one of the sessions completes.
> > >
> > > > with deadlock errors during race conditions where a lot of
> > > > parallel threads are operating. So how should we handle this
> > > > scenario? Will setting the "lock_timeout" parameter at session
> > > > level help us anyway here?
[...]
> > So that leads us to another solution:
> >
> > Retry each batch (possibly after reducing the batch size) until it
> > succeeds.
>
> Actually, here the application is using Kafka and Flink streams and
> reuses existing code which was doing row-by-row commits and has now
> been changed to batch commits, i.e. the commit point has shifted from
> row by row to batch. There are multiple sessions spawning at the same
> time to process the incoming messages 24/7. Also, as described in
> another ongoing thread and as we saw in the past, we did not see much
> difference between "batch commit" and "batch insert" performance; we
> only realized that the row-by-row commit has worse performance.
>
> Now, in such a scenario, when the messages are coming from a
> streaming framework using Kafka + Flink and the inserts still happen
> row by row (only the commit is shifted to the batch), I don't see any
> way to sort the ID columns in this streaming process so that they
> won't overlap across sessions.

I have never used Kafka or Flink, but I imagine that they can stream
to various targets, not just to PostgreSQL. So I would write a program
which can receive such a stream. This program would then buffer rows
until either a number of rows or some timeout was exceeded. It could
then preprocess those rows (e.g. by sorting them) and then open a
transaction and try to insert them. If the transaction fails because
of a deadlock, serialization error or similar, simply retry the
transaction. If everything went well, go back to receiving the next
batch of rows.

> In such a situation, the only way seems to be to have the messages
> replayed for which the deadlock error happens, as I think during a
> deadlock error one session gets terminated by postgres

No, the *session* doesn't get terminated. The transaction fails. It
can be rolled back and tried again.

> and those messages perhaps we can save in some exception table and
> then replay?

Not sure what you mean by "exception table", but if you mean that you
keep all the rows from that transaction around until the commit
succeeds (so that you can try to insert them again if the transaction
fails), then yes, that's what I meant.

        hp

-- 
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@xxxxxx         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"
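As a concrete illustration of the buffer/sort/insert/retry loop described
above, here is a minimal sketch in Python with psycopg2. The events(id,
payload) table, the batch limits and the back-off delay are invented for
the example; the point is that the whole batch is inserted with
ON CONFLICT DO NOTHING in one transaction and simply replayed if
PostgreSQL reports a deadlock or serialization failure.

    # Minimal sketch, assuming psycopg2 and a hypothetical
    # events(id, payload) table with a primary key on id.
    import time

    import psycopg2
    from psycopg2 import errors
    from psycopg2.extras import execute_values

    MAX_ROWS = 1000   # flush the buffer after this many rows ...
    MAX_WAIT = 2.0    # ... or once this many seconds have passed
    RETRYABLE = (errors.DeadlockDetected, errors.SerializationFailure)

    def flush(conn, rows):
        """Insert one batch in a single transaction, retrying until it commits."""
        # Sorting by id makes all sessions lock rows in the same order,
        # which avoids most deadlocks before they happen.
        rows.sort(key=lambda r: r[0])
        while True:
            try:
                with conn:            # commits on success, rolls back on error
                    with conn.cursor() as cur:
                        execute_values(
                            cur,
                            "INSERT INTO events (id, payload) VALUES %s"
                            " ON CONFLICT (id) DO NOTHING",
                            rows)
                return
            except RETRYABLE:
                time.sleep(0.1)       # brief back-off, then replay the batch

    def consume(stream, conn):
        """Buffer rows from the stream and flush them by size or timeout."""
        rows, started = [], time.monotonic()
        for row in stream:            # e.g. (id, payload) tuples from Kafka/Flink
            rows.append(row)
            if len(rows) >= MAX_ROWS or time.monotonic() - started >= MAX_WAIT:
                flush(conn, rows)
                rows, started = [], time.monotonic()
        if rows:                      # flush whatever is left when the stream ends
            flush(conn, rows)

Sorting by id only prevents deadlocks if every writer orders its rows the
same way; the retry loop is what actually guarantees that each batch
eventually commits.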