On Fri, May 25, 2012 at 1:02 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
> Jeff Janes <jeff.janes@xxxxxxxxx> writes:
>> For dumping entire databases, it looks like the biggest problem is
>> going to be LockReassignCurrentOwner in the server.  And that doesn't
>> seem to be easy to fix, as any change to it to improve pg_dump will
>> risk degrading normal use cases.
>
>> If we want to be able to efficiently dump entire databases in a
>> scalable way, it seems like there should be some way to obtain a
>> database-wide AccessShare lock, which blocks AccessExclusive locks on
>> any object in the database, and turns ordinary object-level
>> AccessShare lock requests into no-ops.
>
> I thought a little bit about that, but it seems fairly unworkable.
> In the first place, pg_dump doesn't necessarily want lock on every table
> in the database.

The database-wide method could be invoked only when pg_dump is given no
options that restrict it to a subset of the database.  Or does that not
resolve the objection?

> In the second, such a lock mechanism would have
> logical difficulties, notably whether it would be considered to apply to
> tables created after the lock request occurs.  If it does, then it would
> effectively block all such creations (since creation takes exclusive
> locks that ought to conflict).

That seems acceptable to me.  With an unrestricted dump, almost all
other DDL is locked out already, so I don't know that locking out one
more thing is that big a deal, especially if there is some way to
circumvent the use of that feature.

> If it doesn't, how would you implement
> that?  In any case, we'd be adding significant cost and complexity to
> lock acquisition operations, for something that only whole-database
> pg_dump operations could conceivably make use of.

Before Robert's fast-path locks were developed, I wanted a way to put
the server into a 'stable schema' mode in which AccessExclusive locks
were forbidden and AccessShare locks were no-ops, purely for performance
reasons.  Now that we have fast-path locks, that might no longer be a
meaningful feature.

If databases scale out a lot, won't max_locks_per_transaction, and the
amount of shared memory needed to keep increasing it, become a
substantial problem?

> As far as the specific problem at hand goes, I think there might be a
> less invasive solution.  I poked into the behavior with gdb (and you're
> right, LockReassignCurrentOwner does get called during portal drop)
> and noted that although pg_dump is indeed holding thousands of locks,
> any given statement that it issues touches only a few of them.  So the
> loop in LockReassignCurrentOwner iterates over the whole lock table but
> does something useful at only a few entries.
>
> We could fix things for this usage pattern with what seems to me to
> be a pretty low-overhead method: add a fixed-size array to
> ResourceOwners in which we can remember up to N LOCALLOCKs, for N around
> 10 or so.

I had thought along these lines too, and I think N = 10 would capture
most of the gain.  With pg_dump, so far I see a huge number of resource
owners whose maximum number of locks is 0, 2, or 4, and only a handful
with more than 4.  Of course, I haven't looked at all use cases.

The reason we want to limit the array at all is not memory, but rather
so that explicitly removing a lock doesn't have to dig through a large
list to find the specific one to remove, and therefore become quadratic
when many locks are explicitly removed, right?  Does anyone ever add a
bunch of locks and then afterward go through and explicitly remove them
all in FIFO order?  I think most users would either remove them LIFO or
drop them in bulk.  But better safe than sorry.

> Add a LOCALLOCK to that array when we add the ResourceOwner to
> that LOCALLOCK, so long as the array hasn't overflowed.  (If the array
> does overflow, we mark it as overflowed and stop adding entries.)  Then,
> in LockReassignCurrentOwner, we only iterate over the whole hash table
> if the ResourceOwner's array has overflowed.  If it hasn't, use the
> array to visit just the LOCALLOCKs that need work.
>
> Comments?

I have some basic parts of this already coded up, and can try to finish
coding it up for CF next or next+1.  I'm not yet sure how to avoid
weakening the boundary between resowner.c and lock.c; my original code
was pretty ugly there, as it was just a proof of concept.
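To make the shape of the idea concrete, here is a rough standalone
sketch of the mechanism as you described it.  All of the names here
(MAX_RESOWNER_LOCKS, nlocks, ResourceOwnerRememberLock, and so on) are
made up for illustration; this is not taken from my proof-of-concept
code or from any actual patch.

#define MAX_RESOWNER_LOCKS 10   /* "N around 10 or so" */

typedef struct LOCALLOCK LOCALLOCK;     /* opaque; lives in lock.c */

typedef struct ResourceOwner
{
    /* ... other resource bookkeeping elided ... */
    int         nlocks;         /* number of cached locks; a value of
                                 * MAX_RESOWNER_LOCKS + 1 means the
                                 * cache has overflowed */
    LOCALLOCK  *locks[MAX_RESOWNER_LOCKS];
} ResourceOwner;

/*
 * Remember a LOCALLOCK in the owner's fixed-size array, unless the
 * array has already overflowed.
 */
static void
ResourceOwnerRememberLock(ResourceOwner *owner, LOCALLOCK *locallock)
{
    if (owner->nlocks > MAX_RESOWNER_LOCKS)
        return;                 /* already overflowed; stop tracking */
    if (owner->nlocks == MAX_RESOWNER_LOCKS)
    {
        owner->nlocks++;        /* mark the cache as overflowed */
        return;
    }
    owner->locks[owner->nlocks++] = locallock;
}

/*
 * Forget one lock.  The linear scan is cheap only because the array
 * is tiny; this is the quadratic-removal concern above.
 */
static void
ResourceOwnerForgetLock(ResourceOwner *owner, LOCALLOCK *locallock)
{
    if (owner->nlocks > MAX_RESOWNER_LOCKS)
        return;                 /* overflowed; nothing tracked */
    for (int i = 0; i < owner->nlocks; i++)
    {
        if (owner->locks[i] == locallock)
        {
            /* swap in the last entry; order doesn't matter */
            owner->locks[i] = owner->locks[--owner->nlocks];
            return;
        }
    }
}

/* stand-ins for the per-lock and whole-table work done in lock.c */
extern void ReassignOneLock(LOCALLOCK *locallock, ResourceOwner *parent);
extern void ReassignAllLocksByHashScan(ResourceOwner *owner,
                                       ResourceOwner *parent);

/*
 * Fall back to scanning the whole local lock hash table only when the
 * cache overflowed; otherwise visit just the cached LOCALLOCKs.
 */
static void
LockReassignCurrentOwnerSketch(ResourceOwner *owner, ResourceOwner *parent)
{
    if (owner->nlocks > MAX_RESOWNER_LOCKS)
    {
        ReassignAllLocksByHashScan(owner, parent);  /* old slow path */
        return;
    }
    for (int i = 0; i < owner->nlocks; i++)
        ReassignOneLock(owner->locks[i], parent);
}

The interesting part is the last function: when the cache has not
overflowed, reassignment touches only the few LOCALLOCKs this owner
actually used, instead of iterating over the entire local lock table.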
What would be a situation that might be adversely affected by the
overhead of such a change?  I think pgbench -S, except implemented in a
plpgsql loop, would probably do it.

Cheers,

Jeff