PostgreSQL TPS Bottleneck

Hi everyone,

I am a bachelor's student writing my thesis about the scaling and
performance of an application. The application uses PostgreSQL as its
database, but currently we can't scale any further, as it seems
Postgres is hitting a limit.

With the application, as well as with pgbench, we don't get more than
about 70k TPS out of Postgres, yet the server's resources are not
fully utilized (more below).

I've tried many different configurations, but none of them had a major
performance impact (except setting fsync and synchronous_commit to off).

This is the (custom) configuration I am using:

shared_buffers=65551953kB
effective_cache_size=147491895kB
huge_pages=on
min_wal_size=20GB
max_wal_size=200GB
wal_buffers=1GB
max_wal_senders=0
archive_mode=off
wal_level=minimal
work_mem=2GB
maintenance_work_mem=4GB
checkpoint_completion_target=0.9
checkpoint_timeout = 30min
random_page_cost=1.1
bgwriter_flush_after = 2MB
effective_io_concurrency = 200
# Disabled just for performance experiments
fsync = off
synchronous_commit = off
full_page_writes = on
max_worker_processes=64
max_parallel_workers=64
max_parallel_workers_per_gather=10
max_parallel_maintenance_workers=12
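
(For completeness: if anyone wants to double-check that these values
actually took effect, the running instance can be queried for them.
This is just a generic pg_settings lookup, nothing specific to my setup:)

select name, setting, unit
from pg_settings
where name in ('shared_buffers', 'wal_buffers', 'huge_pages',
               'fsync', 'synchronous_commit', 'wal_level');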

The system is as follows:

* 64 Cores (Intel Xeon Gold 6130 (Skylake, 2.10GHz, 2 CPUs/node, 16
cores/CPU))
* 192 GiB RAM (12 * 16GiB DIMM DDR4 Synchronous Registered (Buffered)
2666 MHz (0.4 ns))
* 2 * SATA SSD (Samsung MZ7KM240HMHQ0D3; one is used for the WAL and
the other for the data)
* 10 Gbps network link
* OS: Debian 11
* Postgres 13 from apt

(I've also written a Stack Overflow post about it -


Below is just an example of the pgbench runs I did:

pgbench -i -s 50 -U postgres -h <DB_HOST> -d <DB_NAME>
pgbench -c 64 -j 32 -t 100000 -h <DB_HOST> -U postgres <DB_NAME>

starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 50
query mode: simple
number of clients: 64
number of threads: 32
number of transactions per client: 100000
number of transactions actually processed: 6400000/6400000
latency average = 0.976 ms
tps = 65584.664360 (including connections establishing)
tps = 65594.330678 (excluding connections establishing)

For comparison (average of three runs with pgbench as above; values are TPS):

num clients     default config      custom config above

10              11336               16848
20              19528               30187
30              25769               39430
40              29792               50150
50              31096               60133
60              33900               64916
70              34986               64308
80              34170               63075
90              35108               59910
100             34864               58320
120             35124               55414
140             33403               53610

(with fsync=off alone I already get almost the same TPS as in the
right-hand column)

With `-S -M prepared` (select-only, prepared statements) I get ~700k TPS,
and with `-S` alone ~500k, but as the application is very write-heavy
this is not really useful for me.
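
(For reference, the select-only runs were done roughly like this, with
the same host and client parameters as above; the exact invocation may
have differed slightly:)

pgbench -S -M prepared -c 64 -j 32 -t 100000 -h <DB_HOST> -U postgres <DB_NAME>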

With the app the CPU is only at about 25% load and the disks are no
problem either. For pgbench it is about 75% CPU, but still no disk
bottleneck (about 5% utilization).

There are also Grafana snapshots I created for the system (node-
exporter) and Postgres (prometheus-postgres-exporter) while running
with our application (same configuration as above). Neither shows any
kind of bottleneck (except for a high number of context switches and
pages in/out).
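
(In case someone wants to reproduce the context-switch observation
without the exporters: a plain vmstat during the run shows the same
thing, the cs column being context switches per second.)

vmstat 1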


I have also looked at PostgreSQL's lock information while running the
above experiment, but nothing seemed strange to me. There are about
300 locks, but all of them are granted (select * from pg_locks).
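
(A compact summary of the same information, in case the raw pg_locks
listing is too long to post; this is just a generic aggregation,
nothing specific to my setup:)

select mode, granted, count(*)
from pg_locks
group by mode, granted
order by count(*) desc;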

Also, the following query:

select wait_event, count(*) from pg_stat_activity
where state = 'idle in transaction' group by wait_event;

did not show any contention; the output always looks similar to this
(80 clients):

 wait_event                | count
---------------------------+-------
 ClientRead                |     2
 SerializableFinishedList  |     1
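
(A broader variant, summarizing wait events over all backends rather
than only those idle in transaction, would look like the following; I
am listing it in case it helps to reproduce or extend the check:)

select wait_event_type, wait_event, count(*)
from pg_stat_activity
where state <> 'idle'
group by wait_event_type, wait_event
order by count(*) desc;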

Thanks to the Slack channel I got a link to an EDB post which used a
more powerful server; they also achieved about 70k TPS, but without
setting fsync=off. So maybe they were limited by disk I/O (just
guessing, as unfortunately this is not pointed out in the post).


So, my question is whether anyone knows what the bottleneck could be,
or whether it is even possible to get more TPS with this write-heavy
load.

(dmesg also does not contain any error messages that would point to a
kernel misconfiguration.)

Ideally, I would like to fully utilize the CPU and get about 3-4 times
more TPS (if that is even possible).

Thanks already for everyone's time and help.
