Re: FW: huge SubtransSLRU and SubtransBuffer wait_event

James Pang <jamespang886@xxxxxxxxx> · Mon, 26 Feb 2024 14:01:57 +0800

 From this link, it looks like a "configurable buffer pool and partitioning the SLRU lock" is one of the planned changes, maybe in v18 or v19: https://www.postgresql.org/message-id/202402221843.ibzvpndbacbi@alvherre.pgsql

    James  

From: James Pang (chaolpan) 

Sent: Tuesday, February 6, 2024 2:59 PM

To: Nikolay Samokhvalov <samokhvalov@xxxxxxxxx>; Laurenz Albe <laurenz.albe@xxxxxxxxxxx>; pgsql-performance@xxxxxxxxxxxxxxxxxxxx

Subject: RE: huge SubtransSLRU and SubtransBuffer wait_event

   We finally identified the cause: a PL/pgSQL procedure proc1 (for 1..5000 loop call proc2()) that calls proc2 (begin .. exception .. end). At the same time, more than 200 sessions arrived within milliseconds of each other and ran the same query during the long-running "call proc1" transaction. Changing the code and cutting down the number of parallel sessions running the same query at the same time helped a lot.
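   A minimal sketch of the pattern described above, for anyone trying to reproduce it. The bodies of the real procedures were not shared, so the table t and the INSERT inside proc2 are hypothetical placeholders; only the loop count and the begin/exception/end shape come from the description.

```sql
-- Hypothetical table, used only so that each subtransaction writes something
-- (a subtransaction is assigned its own XID only when it modifies data).
CREATE TABLE IF NOT EXISTS t (id int);

CREATE OR REPLACE PROCEDURE proc2()
LANGUAGE plpgsql AS $$
BEGIN
    -- A block with an EXCEPTION clause runs as a subtransaction.
    BEGIN
        INSERT INTO t VALUES (1);      -- placeholder for the real work
    EXCEPTION WHEN OTHERS THEN
        NULL;                          -- placeholder error handling
    END;
END;
$$;

CREATE OR REPLACE PROCEDURE proc1()
LANGUAGE plpgsql AS $$
BEGIN
    -- 5000 calls, all inside one top-level transaction.
    FOR i IN 1..5000 LOOP
        CALL proc2();
    END LOOP;
END;
$$;

CALL proc1();
```

   With a body like this, a single "call proc1" accumulates far more than 64 subtransactions in one transaction, which is the limit mentioned earlier in the thread.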

   Thanks all. 

James 

From: Nikolay Samokhvalov <samokhvalov@xxxxxxxxx>

Sent: Friday, February 2, 2024 6:04 PM

To: Laurenz Albe <laurenz.albe@xxxxxxxxxxx>; pgsql-performance@xxxxxxxxxxxxxxxxxxxx

Subject: Re: huge SubtransSLRU and SubtransBuffer wait_event

On Thu, Feb 1, 2024 at 04:42 Laurenz Albe <laurenz.albe@xxxxxxxxxxx> wrote:

Today, the only feasible solution is not to create more than 64 subtransactions

(savepoints or PL/pgSQL EXCEPTION clauses) per transaction.

Sometimes, a single subtransaction is enough to experience a bad SubtransSLRU spike: 

https://postgres.ai/blog/20210831-postgresql-subtransactions-considered-harmful#problem-4-subtrans-slru-overflow
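To check whether a system is in this state, something like the following can help. This is only a sketch: it assumes the wait event names as they appear in the subject of this thread, and it uses only standard pg_stat_activity columns. The LIMIT and the 60-character query prefix are arbitrary choices.

```sql
-- How many sessions are currently waiting on the subtransaction SLRU?
SELECT wait_event_type, wait_event, count(*) AS sessions
FROM pg_stat_activity
WHERE wait_event IN ('SubtransSLRU', 'SubtransBuffer')
GROUP BY wait_event_type, wait_event
ORDER BY sessions DESC;

-- Oldest open transactions, the usual suspects for the long-running side.
SELECT pid,
       xact_start,
       now() - xact_start AS xact_age,
       state,
       left(query, 60) AS query_prefix
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start
LIMIT 5;
```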

I think a nesting level of 64+ is quite rare, but the kind of problem that hits you when you have high XID growth (lots of writes) plus a long-running transaction is quite easy to bump into. Or this case involving MultiXactIDs: 

https://buttondown.email/nelhage/archive/notes-on-some-postgresql-implementation-details/

Nik