Hi,
We're running PostgreSQL as essentially a data warehouse, and we have a few thousand roles, which are used to grant permissions on a table-by-table basis to a few thousand users, so a user would typically have say between 1 and 2 thousand roles. There is also quite a lot of "churn" in terms of tables being created/removed, and permissions changed.
The issue is that we're hitting a strange performance problem on connection. Sometimes it can take ~25 to 40 seconds just to connect, although it's often way quicker. There seems to be no middle ground - never have I seen a connection take between 0.5 and 25 seconds for example. We suspect it's related to the number of roles the connecting user has (including via other roles), because if we remove all roles but one from the connecting user (the one that grants connection permissions), connecting is always virtually instantaneous.
Is there anything we can do to investigate (or hopefully fix!) the issue?
Thanks,
Michal
------
A description of what you are trying to achieve and what results you expect.:
We would like to connect to the database - expect it to connect in less than 1 second, but sometimes 25 - 40s.
PostgreSQL version number you are running:
PostgreSQL 14.10 on aarch64-unknown-linux-gnu, compiled by aarch64-unknown-linux-gnu-gcc (GCC) 9.5.0, 64-bit
How you installed PostgreSQL:
Via AWS/Amazon Aurora
Changes made to the settings in the postgresql.conf file
In attached CSV file
Operating system and version:
Unknown
What program you're using to connect to PostgreSQL:
Python + SQLAlchemy, psql, or also via Amazon Quicksight (Unsure which client they use under the hood, but it surfaces connection timeout errors, which we suspect is due to the issue described above)
Is there anything relevant or unusual in the PostgreSQL server logs?:
No
For questions about any kind of error:
N/A