Hi All,
A basic question on handling a large number of concurrent requests hitting the DB.
I have a cloud service which can receive a large number of requests, which will obviously trigger db operations.
Every db has some max connection limit, which can get exhausted under a large number of requests.
I know db connection pooling can be used to reuse connections, but it will not help when the number of active concurrent connections itself is large. My queries are already optimised and short-lived.
For that I need some queuing mechanism, like pgbouncer for Postgres: https://www.percona.com/blog/2021/02/26/connection-queuing-in-pgbouncer-is-it-a-magical-remedy/
pgbouncer, as I understand it, is a proxy which needs to be installed separately on the web or db server.
I was wondering whether the normal client-side db connection pooling libraries, like Apache DBCP, can also provide similar connection queuing while running in the application runtime.
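For example, with DBCP 2 I believe callers block on getConnection() until a pooled connection frees up or a timeout expires, which looks like client-side queuing to me. A rough sketch (the URL, credentials, and limits are placeholder values):

    import org.apache.commons.dbcp2.BasicDataSource;
    import java.sql.Connection;

    public class PoolQueueSketch {
        public static void main(String[] args) throws Exception {
            BasicDataSource ds = new BasicDataSource();
            ds.setUrl("jdbc:postgresql://localhost:5432/mydb"); // placeholder URL
            ds.setUsername("app");                              // placeholder credentials
            ds.setPassword("secret");
            ds.setMaxTotal(20);         // never more than 20 connections to the db
            ds.setMaxWaitMillis(5000);  // extra callers wait (queue) up to 5s

            // With all 20 connections checked out, this call blocks until a
            // connection is returned to the pool or the 5s wait expires.
            try (Connection conn = ds.getConnection()) {
                // run the short-lived query here
            }
        }
    }

If I understand correctly, though, this queue lives inside each application instance, whereas pgbouncer queues centrally in front of the database.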
From what I see, this seems similar to other client-side connection libraries/poolers, which works, but the major benefit of pgbouncer as a server-side pooler is transaction-level pooling, which would work pretty well in your case since you say you have short-lived connections.
Creating a new connection on PostgreSQL directly is expensive (resource-wise), as the server spawns a new process for each connection.
pgbouncer provides a workaround: it manages the opening and closing of connections on the server side for you, and allocates the already-opened connections to frontend connections. When your frontend connection is done with its transaction, pgbouncer closes the frontend connection but does not close the backend connection yet; if a new request for a connection comes in, it reuses the old backend connection and allocates it to the new frontend connection.
Hence the caveats: you cannot use session-level prepared statements, and there is no guarantee which backend connection gets mapped to which frontend connection.
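For reference, this behaviour is controlled by pool_mode in pgbouncer.ini. A minimal sketch; the db name, port, and sizes below are placeholders, not recommendations:

    ; pgbouncer.ini - a minimal sketch; db name, port and sizes are placeholders
    [databases]
    mydb = host=127.0.0.1 port=5432 dbname=mydb

    [pgbouncer]
    listen_addr = 0.0.0.0
    listen_port = 6432
    auth_type = md5
    auth_file = /etc/pgbouncer/userlist.txt
    ; release the backend connection after each transaction, not each session
    pool_mode = transaction
    ; backend connections kept per database/user pair
    default_pool_size = 20

On the JDBC side you would typically also disable server-side prepared statements under transaction pooling, for example by adding prepareThreshold=0 to the pgjdbc connection URL.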
It is also important to note that holding many connections open inside long-running transactions can result in bloat: because of MVCC, vacuum cannot clean up dead rows that might still be visible to an open transaction, and that bloat eventually leads to slowdowns and other problems.
The best use case I have seen for pgbouncer is not just connection pooling, but also handling auto scaling, to great effect. Especially with microservices: if your app has a connection pool of 10 and you scale it to 10 instances, without pgbouncer that would use all 100 backend connections; but if the connections are short-lived transactions, it might work well with even 50 connections in the backend.
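To make those numbers concrete, a hedged sketch of the two knobs involved (the values are illustrative):

    ; illustrative sizing for the scenario above
    [pgbouncer]
    ; accept up to 100 client connections (10 app instances x pool of 10)
    max_client_conn = 100
    ; but open at most 50 real backend connections to Postgres
    default_pool_size = 50

Clients beyond the 50 backends simply wait in pgbouncer's queue until a backend connection frees up.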
There are other administrative benefits to pgbouncer too, beyond just connection pooling.
The biggest drawbacks I have had with pgbouncer are enterprise auth support (ldap/kerberos etc.), and that it masks the client's IP with its own IP when you look at pg_stat_activity.
But since we use one db per app, we are able to handle the above issues with fewer unknowns.