On Tue, 11 Oct 2022, 22:07 Sengottaiyan T, <techsenko@xxxxxxxxx> wrote:
Hi All,
I'm looking for suggestions:
Environment: AWS PostgreSQL RDS instance - Version 14.3
Operations support gets intermittent alerts from the monitoring tool, based on AWS CloudWatch metrics for Disk Queue Depth, CPU burst credits, and CPU Utilization.
I would like to understand what is causing the spikes: has the number of logons increased, has the number of transactions per second increased, or has a SQL statement picked the wrong plan so that a long-running (I/O-, CPU-, or memory-intensive) query is increasing load on the server (a cause-and-effect scenario), etc.?
Because we investigate these issues reactively, we rely on the metrics gathered in AWS CloudWatch (for the underlying OS stats) and Performance Insights (for DB performance), and correlate SQL queries with the pg_stat_statements view. But the data in the view is cumulative, aggregated statistics, and I'm looking to see the deltas compared to normal runs.
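One way to get deltas out of a cumulative view is to save periodic snapshots of pg_stat_statements and subtract consecutive ones. The following is only a minimal sketch under stated assumptions: the snapshots are taken by some scheduled job (not shown) running something like `SELECT queryid, calls, total_exec_time, rows, shared_blks_read, shared_blks_hit FROM pg_stat_statements`, and are keyed by queryid alone for brevity (a real key would likely be userid, dbid, queryid). The function names `snapshot_delta` and `top_by` are hypothetical helpers, not part of any library.

```python
from typing import Dict, List, Tuple

# Counter columns assumed to be copied from pg_stat_statements
# (column names as in PostgreSQL 14).
COUNTERS = ("calls", "total_exec_time", "rows",
            "shared_blks_read", "shared_blks_hit")

# Hypothetical snapshot shape: queryid -> {counter name: cumulative value}
Snapshot = Dict[int, Dict[str, float]]


def snapshot_delta(before: Snapshot, after: Snapshot) -> Snapshot:
    """Return per-query counter increases between two snapshots."""
    deltas: Snapshot = {}
    for queryid, now in after.items():
        prev = before.get(queryid, {})
        # If calls went backwards, the statistics were reset or the entry
        # was evicted and re-created; treat the new values as the delta.
        if now.get("calls", 0) < prev.get("calls", 0):
            prev = {}
        delta = {c: now.get(c, 0) - prev.get(c, 0) for c in COUNTERS}
        # Keep only statements that actually ran during the interval.
        if delta["calls"] > 0:
            deltas[queryid] = delta
    return deltas


def top_by(deltas: Snapshot, counter: str,
           limit: int = 10) -> List[Tuple[int, Dict[str, float]]]:
    """Return the statements with the largest increase in one counter."""
    ranked = sorted(deltas.items(),
                    key=lambda item: item[1][counter], reverse=True)
    return ranked[:limit]
```

Comparing `top_by(deltas, "total_exec_time")` for an interval that covers an alert against the same ranking for a quiet interval would show which statements changed behaviour, rather than which ones are merely the largest in total.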
Performance Insights should also offer you visibility into statement-level stats for Top SQL if pg_stat_statements is enabled.
Performance Insights also has other metrics (Counter Metrics) that you can refer to in order to understand some of the data points you are after: xact_count/second, session_in_idle_in_transactions/second, blocked_transactions/second, etc. You need to add them to the PI dashboard by using the Manage Metrics button on the PI dashboard.
How should I approach and get to the root-cause?
AppDynamics is already configured for the RDS instance. Are there any open-source monitoring tools available that would help to capture and visualize the deltas?
Thanks,
Senko
Regards
Sameer
DB Specialist,
Amazon Web Services