If those 50-100 connections are all active at once, yes, that is high. They can easily spend more time fighting each other over LWLocks, spinlocks, or cachelines rather than doing useful work. This can be exacerbated when you have multiple sockets rather than all cores in a single socket. And these problems are likely to present as high Sys times.Perhaps you can put up a connection pooler which will allow 100 connections to all think they are connected at once, but forces only 12 or so to actually be active at one time, making the others transparently queue.
Can you expound on this or refer me to someplace to read up on this?
Context, I don't want to thread jack though: I think I am seeing similar behavior in our environment at times with queries that normally take seconds taking 5+ minutes at times of high load. I see many queries showing buffer_mapping as the LwLock type in snapshots but don't know if that may be expected. In our environment PgBouncer will accept several hundred connections and allow up to 100 at a time to be active on the database which are VMs with ~16 CPUs allocated (some more, some less, multi-tenant and manually sharded). It sounds like you are advocating for connection max very close to the number of cores. I'd like to better understand the pros/cons of that decision.
Context, I don't want to thread jack though: I think I am seeing similar behavior in our environment at times with queries that normally take seconds taking 5+ minutes at times of high load. I see many queries showing buffer_mapping as the LwLock type in snapshots but don't know if that may be expected. In our environment PgBouncer will accept several hundred connections and allow up to 100 at a time to be active on the database which are VMs with ~16 CPUs allocated (some more, some less, multi-tenant and manually sharded). It sounds like you are advocating for connection max very close to the number of cores. I'd like to better understand the pros/cons of that decision.