Tom Lane wrote:
Robert Haas <robertmhaas@xxxxxxxxx> writes:
I think that changing the locking behavior is attacking the problem at
the wrong level anyway.
Right. By the time a patch here could have any effect, you've already
lost the game --- having to deschedule and reschedule a process is a
large cost compared to the typical lock hold time for most LWLocks. So
it would be better to look at how to avoid blocking in the first place.
I think the elephant in the room is that we have a single lock that
needs to be acquired every time a transaction commits, and every time a
backend takes a snapshot. It has worked well, and it still does for
smaller numbers of CPUs, but I'm not surprised it starts to become a
bottleneck on a test like the one Jignesh is running. To make matters
worse, the more backends there are, the longer the lock needs to be held
to take a snapshot.
It's going require some hard thinking to bust that bottleneck. I've
sometimes thought about maintaining a pre-calculated array of
in-progress XIDs in shared memory. GetSnapshotData would simply memcpy()
that to private memory, instead of collecting the xids from ProcArray.
Or we could try to move some of the if-tests inside the for-loop to
after the ProcArrayLock is released. For example, we could easily remove
the check for "proc == MyProc", and remove our own xid from the array
afterwards. That's just linear speed up, though. I can't immediately
think of a way to completely avoid / partition away the contention.
WALInsertLock is also quite high on Jignesh's list. That I've seen
become the bottleneck on other tests too.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance