On 2018-09-21 08:16:42 +0000, 范国腾 wrote:
> Hi,
>
> We have two postgres nodes(active/standby). The active node and the standby node use the same share disk(GFS2 file system).
>
> We are doing the performance test in active side:
> (1)Now there is no SQL request sending to the standby side.
> (2)The active node has 20 sessions and the test tool sends INSERT/UPDATE/SELECT request to them. The call load is very high.
>
> In active node, we find that the disk IO is very high but the CPU of each postgres process is about 20%. The pstack result shows that most of the postgres process is waiting for the XLOG Write Lock. It seems that the XLog write become the bottleneck of the postgres database.

Usually that doesn't really mean there's lock contention, but that your IO isn't fast enough.

What you can do:
a) check whether some/most/all of your transactions can use synchronous_commit = off - that can drastically reduce the amount of IO.
b) Consider putting your WAL onto a separate disk (or even partition), that can reduce overhead by disentangling synchronous writes (for the WAL) from asynchronous writes (the data being written back), and synchronous reads (queries).
c) Get faster storage.

Greetings,

Andres Freund
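
To illustrate option a), here is a minimal sketch of relaxing synchronous_commit at different scopes, assuming the workload can tolerate losing the most recent commits after a crash. The role name app_user and the table test_table are hypothetical placeholders, not taken from the original report.

```sql
-- Check the current setting for this session.
SHOW synchronous_commit;

-- Per session: affects only the current connection.
SET synchronous_commit = off;

-- Per transaction: SET LOCAL reverts at COMMIT or ROLLBACK.
BEGIN;
SET LOCAL synchronous_commit = off;
INSERT INTO test_table (id) VALUES (1);  -- hypothetical table
COMMIT;

-- Per role: applies to new sessions of the (hypothetical) role app_user.
ALTER ROLE app_user SET synchronous_commit = off;
```

With synchronous_commit off, a crash can lose recently committed transactions but does not corrupt the database, so the setting can be limited to the transactions where that trade-off is acceptable.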