On 9/16/24 20:55, veem v wrote:
On Tue, 17 Sept 2024 at 03:41, Adrian Klaver
<adrian.klaver@xxxxxxxxxxx> wrote:
Are you referring to this?:
https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/dev/datastream/operators/asyncio/
If not then you will need to be more specific.
Yes, I was referring to this one. What are the caveats of this
approach, given that these are financial transactions and must be ACID
compliant? Additionally, I was not aware of the "synchronous_commit"
parameter on the DB side, which controls whether a commit waits for the
WAL flush. Would both of these produce the same asynchronous behaviour
and achieve the same result: client data load throughput increases
because the DB does not wait for the data to be written to the WAL
before confirming to the client, and the client does not wait for the
DB to confirm whether the data has been persisted? Also, since the WAL
still has to be flushed to disk in the backend (it is only delayed),
can this method cause contention on the database storage side if data
is ingested from the client faster than it can be written to disk, and
can it in some way impact data consistency for read queries?
This is not something that I am that familiar with. I suspect, though,
that this is more complicated than you think. From the link above:
"Prerequisites
As illustrated in the section above, implementing proper asynchronous
I/O to a database (or key/value store) requires a client to that
database that supports asynchronous requests. Many popular databases
offer such a client.
In the absence of such a client, one can try and turn a synchronous
client into a limited concurrent client by creating multiple clients and
handling the synchronous calls with a thread pool. However, this
approach is usually less efficient than a proper asynchronous client.
"
Which means that, on the Flink end, you need to:
1) Use Flink async I/O.
2) Find a client that supports async requests, or fake it by using
multiple synchronous clients.
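
A rough sketch of the "fake it" option in 2), written as plain Python
rather than Flink API code. BlockingClient and lookup_all are
hypothetical names used only for illustration, and the sleep stands in
for a real blocking round trip to the database. Each worker thread gets
its own synchronous client, and the thread pool runs the calls
concurrently:

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class BlockingClient:
    """Hypothetical stand-in for a synchronous database client."""

    def lookup(self, key):
        time.sleep(0.05)  # simulate one blocking network round trip
        return f"value-for-{key}"


# One client per worker thread, so no client is shared between threads.
_local = threading.local()


def _init_worker():
    _local.client = BlockingClient()


def _lookup(key):
    return _local.client.lookup(key)


def lookup_all(keys, pool_size=4):
    """Run blocking lookups concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=pool_size,
                            initializer=_init_worker) as pool:
        return list(pool.map(_lookup, keys))


if __name__ == "__main__":
    print(lookup_all(["a", "b", "c", "d"]))
```

Concurrency here is capped at pool_size, and every in-flight request
still ties up a thread, which is why the Flink docs call this approach
less efficient than a proper asynchronous client.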
On the Postgres end there is this:
https://www.postgresql.org/docs/current/wal-async-commit.html
That will return a success signal to the client sooner if
synchronous_commit is set to off. However, the point of Flink async
I/O is not to wait for the response before moving on, so I am not sure
how much synchronous_commit = off would help.
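
For reference, the PostgreSQL page linked above describes turning the
setting off for a single transaction with SET LOCAL, and notes that the
risk is losing the most recently committed transactions after a crash,
not data corruption. A minimal sketch of what that could look like from
a client, assuming a DB-API 2.0 connection such as one from psycopg2
with autocommit left off. The function name and the payments_staging
table are made up for illustration:

```python
def insert_with_async_commit(conn, rows):
    """Insert rows in one transaction that commits asynchronously.

    conn is assumed to be a DB-API 2.0 connection to PostgreSQL with
    autocommit off, so the first execute() opens a transaction.
    """
    cur = conn.cursor()
    try:
        # Applies to the current transaction only; other transactions
        # keep the server's configured synchronous_commit value.
        cur.execute("SET LOCAL synchronous_commit TO OFF")
        # payments_staging is a hypothetical table name.
        cur.executemany(
            "INSERT INTO payments_staging (id, amount) VALUES (%s, %s)",
            rows,
        )
        # COMMIT returns without waiting for the WAL flush to disk.
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()
```

Because the setting is per transaction, it could be left on for the
transactions that must not be lost and turned off only for those that
can be replayed from the source after a crash.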
--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx