Re: How to avoid the XLog Write to be the performance bottleneck

Andres Freund <andres@xxxxxxxxxxx> · Fri, 21 Sep 2018 09:26:24 -0700

On 2018-09-21 08:16:42 +0000, 范国腾 wrote:
> Hi, 
> 
> We have two postgres nodes (active/standby). The active node and the standby node use the same shared disk (GFS2 file system).
> 
> We are running a performance test on the active side:
> (1) No SQL requests are currently being sent to the standby side.
> (2) The active node has 20 sessions, and the test tool sends INSERT/UPDATE/SELECT requests to them. The load is very high.
> 
> On the active node, we find that disk IO is very high but the CPU usage of each postgres process is about 20%. The pstack output shows that most of the postgres processes are waiting for the XLOG Write Lock. It seems that the XLog write has become the bottleneck of the postgres database.

Usually that doesn't really mean there's lock contention, but that
your IO isn't fast enough. What you can do:
a) Check whether some/most/all of your transactions can use
   synchronous_commit = off - that can drastically reduce the amount of
   IO.
b) Consider putting your WAL onto a separate disk (or even partition).
   That can reduce overhead by disentangling synchronous writes (for the
   WAL) from asynchronous writes (the data being written back) and
   synchronous reads (queries).
c) Get faster storage.
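
To get a rough feel for whether the storage is the limit, one can time
how long a small write followed by fsync takes on the volume holding the
WAL. The following is only a minimal sketch, not a substitute for a real
benchmark: the function name, the 8192-byte block size, the number of
writes and the use of the system temp directory are all assumptions made
for illustration. PostgreSQL itself ships pg_test_fsync for a more
thorough measurement.

```python
import os
import tempfile
import time


def measure_fsync_latency(directory, writes=50, block_size=8192):
    """Return the average seconds per write+fsync of one block in `directory`.

    Hypothetical helper for illustration; block_size and writes are
    arbitrary assumptions, not values taken from PostgreSQL.
    """
    block = b"\0" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.write(fd, block)
            # Force the block to stable storage, similar in spirit to
            # the flush a synchronous commit has to wait for.
            os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.remove(path)
    return elapsed / writes


if __name__ == "__main__":
    # Assumption: replace the temp directory with a directory on the
    # same file system as the WAL to measure the relevant device.
    latency = measure_fsync_latency(tempfile.gettempdir())
    print(f"avg write+fsync: {latency * 1000:.3f} ms")
    if latency > 0:
        print(f"rough upper bound: {1 / latency:.0f} flushes/s")
```

If the average comes out at several milliseconds, the device can only
complete a few hundred flushes per second, and sessions will queue up
waiting for the WAL to be written no matter how the locking is
implemented. Several commits can share one flush, so this bounds flushes
rather than transactions.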

Greetings,

Andres Freund