Sorry for the long delay in following up on this suggestion. The change Tom suggested fixed the performance problems I was seeing, but I never ran the full regression suite on the modified code, as everything in my performance tests seemed to indicate the bug was fixed (i.e, no errors even with --cassert-enabled). When I recently ran the regression suite on the "fixed" version of Postgres, the "misc" test suite fails with the following error message: "ERROR: cache lookup failed for type 2139062143". Is this a manifestation of the problem where certain items are being freed too early? Any other ideas as to what's going on here? Thanks, John On Tue, Jan 8, 2008 at 8:51 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote: > "John Smith" <sodgodofall@xxxxxxxxx> writes: >>> It's pipelined either way. But int8 is a pass-by-reference data type, >>> and it sounds like we have a memory leak for this case. > >> Thanks for your reply. How easy is it to fix this? Which portion of >> the code should we look to change? > > I was just looking at that. The issue looks to me that nodeResult.c > (and other plan node types that support SRFs in their targetlists) > do this: > > /* > * Check to see if we're still projecting out tuples from a previous scan > * tuple (because there is a function-returning-set in the projection > * expressions). If so, try to project another one. > */ > if (node->ps.ps_TupFromTlist) > { > resultSlot = ExecProject(node->ps.ps_ProjInfo, &isDone); > if (isDone == ExprMultipleResult) > return resultSlot; > /* Done with that source tuple... */ > node->ps.ps_TupFromTlist = false; > } > > /* > * Reset per-tuple memory context to free any expression evaluation > * storage allocated in the previous tuple cycle. Note this can't happen > * until we're done projecting out tuples from a scan tuple. > */ > ResetExprContext(econtext); > > whereas there would be no memory leak if these two chunks of code were > in the other order. The question is whether resetting the context first > would throw away any data that we *do* still need for the repeated > ExecProject calls. That second comment block implies there's something > we do need. > > I'm not sure why it's like this. Some digging in the CVS history shows > that indeed the code used to be in the other order, and I switched it > (and added the second comment block) in this old commit: > > http://archives.postgresql.org/pgsql-committers/2000-08/msg00218.php > > I suppose that the SQL-function support at the time required that its > calling memory context be persistent until it returned ExprEndResult, > but I sure don't recall any details. It's entirely possible that that > requirement no longer exists, or could easily be eliminated given all > the other changes that have happened since then. nodeFunctionscan.c > seems to reset the current context for each call of a SRF, so I'd think > that anything that can't cope with that should have been flushed out > by now. > > If you feel like poking at this, I *strongly* recommend doing your > testing in an --enable-cassert build. You'll have no idea whether you > freed stuff too early if you don't have CLOBBER_FREED_MEMORY enabled. > > regards, tom lane >