On 03/18/09 17:25, Robert Haas wrote:
> On Wed, Mar 18, 2009 at 1:43 PM, Scott Carey <scott@xxxxxxxxxxxxxxxxx> wrote:
>> It's worth ruling out given that even if the likelihood is small, the
>> fix is easy. However, I don't see the throughput drop from peak as
>> more concurrency is added that is the hallmark of this problem --
>> usually with a lot of context switching and a sudden increase in CPU
>> use per transaction.
>
> The problem is that the proposed "fix" bears a strong resemblance to
> attempting to improve your gas mileage by removing a few non-critical
> parts from your car, like, say, the bumpers, muffler, turn signals,
> windshield wipers, and emergency brake.
>
>> The fix I was referring to as easy was using a connection pooler --
>> as a reply to the previous post. Even if it's a low likelihood that
>> the connection pooler fixes this case, it's worth looking at.
>
> Oh, OK. There seem to be some smart people saying that's a pretty
> high-likelihood fix. I thought you were talking about the proposed
> locking change.
>
> While it's true that the car might be drivable in that condition (as
> long as nothing unexpected happens), you're going to have a hard time
> convincing the manufacturer to offer that as an options package.
>
>> The original poster's request is for a config parameter, for
>> experimentation and testing by the brave. My own request was for that
>> version of the lock to prevent possible starvation but improve
>> performance by unlocking all shared waiters at once, then doing all
>> exclusives one at a time next, etc.
>
> That doesn't prevent starvation in general, although it will for some
> workloads. Anyway, it seems rather pointless to add a config parameter
> that isn't at all safe, and adds overhead to a critical part of the
> system for people who don't use it. After all, if you find that it
> helps, what are you going to do? Turn it on in production? I just
> don't see how this is any good other than as a thought-experiment.
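Scott's proposed release policy -- wake every waiting shared acquirer in one batch, then service the queued exclusive acquirers one at a time -- can be sketched as a toy model. This is my own illustration, not the actual LWLock code or any submitted patch; the round structure and names are invented:

```python
def batched_grant_order(waves):
    """waves[i] lists the (name, mode) lock requests that arrive while
    round i is in progress; mode is 'S' (shared) or 'X' (exclusive).
    Returns the grant order: each tuple is a batch of shared holders
    granted together, each bare name is an exclusive grant."""
    order = []
    for wave in waves:
        # Wake every waiting shared acquirer at once; they can all hold
        # the lock concurrently, so they run as a single batch ...
        shared = tuple(name for name, mode in wave if mode == 'S')
        if shared:
            order.append(shared)
        # ... then grant the waiting exclusive acquirers one at a time.
        order.extend(name for name, mode in wave if mode == 'X')
    return order
```

In this toy model an exclusive waiter is delayed by at most one shared batch per round, which is why the scheme avoids starvation for some workloads; as Robert points out, that guarantee does not hold in general, for instance if late-arriving shared requests are allowed to join the batch currently being serviced.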
Actually, the patch I submitted shows no overhead from what I have seen, and I think it is useful: depending on the workload, it can be turned on even in production.

> At any rate, as I understand it, even after Jignesh eliminated the
> waits, he wasn't able to push his CPU utilization above 48%. Surely
> something's not right there. And he also said that when he added a
> knob to control the behavior, he got a performance improvement even
> when the knob was set to 0, which corresponds to the behavior we have
> already anyway. So I'm very suspicious that there's something wrong
> with either the system or the test. Until that's understood and fixed,
> I don't think that looking at the numbers is worth much.

I don't think anything is majorly wrong in my system. Sometimes it is PostgreSQL locks in play, and sometimes it can be OS/system-related locks in play (network, IO, file system, etc.). Right now in my patch, after I fix the waiting-on-ProcArrayLock problem, other PostgreSQL locks come into play: CLogControlLock, WALInsertLock, etc.

Right now, out of the box, we have no means of tweaking anything in production if you do land in that problem. With the patch there is a knob for tweaking the lock bottlenecks for the main workload for which the system is put in production. I still haven't seen any downsides with the patch other than highlighting other bottlenecks in the system. (For example, I haven't seen a run where the tpm on my workload decreases as you increase the knob.)

What I am suggesting is: run the patch, see if you find a workload where performance goes down, and check the lock-statistics output to see whether it is pushing the bottleneck elsewhere, most likely to WALInsertLock or CLogControlLock. If yes, then this patch gives you the right tweaking opportunity to reduce stress on ProcArrayLock for a workload while still not seriously stressing WALInsertLock or CLogControlLock.

Right now, the standard answer applies:
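The knob behavior described above -- 0 corresponding to the behavior we have already, a higher setting enabling the alternate wake-up policy -- can be illustrated with a small grant function. This is my own sketch, not Jignesh's actual patch or its tunable, and the FIFO model of the stock lock is deliberately simplified:

```python
def grant_order(waiters, knob=0):
    """waiters: (name, mode) pairs in queue order, mode 'S' or 'X'.
    knob=0: a simplified model of the stock FIFO behavior -- wake the
    waiter at the head of the queue, together with any consecutive
    shared waiters immediately behind it.
    knob=1: the alternate policy -- wake ALL shared waiters as one
    batch, then the exclusive waiters one at a time."""
    if knob == 0:
        order, i = [], 0
        while i < len(waiters):
            if waiters[i][1] == 'X':
                order.append(waiters[i][0])  # exclusive: wake one
                i += 1
            else:
                batch = []  # shared run at the head: wake together
                while i < len(waiters) and waiters[i][1] == 'S':
                    batch.append(waiters[i][0])
                    i += 1
                order.append(tuple(batch))
        return order
    # knob=1: all shared first, then exclusives in queue order.
    shared = tuple(name for name, mode in waiters if mode == 'S')
    return ([shared] if shared else []) + \
           [name for name, mode in waiters if mode == 'X']
```

With `knob=0` the function degenerates to plain queue order, which matches the report that setting the knob to 0 should reproduce today's behavior.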
"Nope, you are running the wrong workload for PostgreSQL; use a connection pooler or your own application logic." Or maybe: "You have too many users for PostgreSQL; use some proprietary database."

-Jignesh

>> I alluded to the three main ways of dealing with lock contention
>> elsewhere: avoiding locks, making finer-grained locks, and making
>> locks faster. All are worthy. Some are harder to do than others. Some
>> have been heavily tuned already. It's a case-by-case basis. And
>> regardless, the unfair lock is a good test tool.
>
> In view of the caveats above, I'll give that a firm maybe.
>
> ...Robert
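The "use a connection pooler" answer that keeps coming up in this thread amounts to capping the number of live backends and making extra clients wait for a free connection instead of each opening their own. A minimal sketch of the idea, assuming a generic `connect` factory (a real deployment would use pgbouncer or a driver-level pool, not this toy class):

```python
import queue


class ConnectionPool:
    """Toy fixed-size pool: at most `size` connections ever exist;
    callers beyond that block instead of opening a new backend."""

    def __init__(self, connect, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())  # pre-open the whole pool

    def acquire(self, timeout=None):
        # Blocks (up to `timeout` seconds) until a connection is free;
        # raises queue.Empty if the pool stays exhausted.
        return self._idle.get(timeout=timeout)

    def release(self, conn):
        self._idle.put(conn)  # hand the connection to the next waiter
```

The point of the cap is exactly the contention question being debated: with a bounded number of backends, far fewer processes ever queue on ProcArrayLock and friends, regardless of how many clients the application has.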