Trevor Talbot wrote:
> On 8/23/07, Magnus Hagander <magnus@xxxxxxxxxxxx> wrote:
> > Shelby Cain wrote:
> >
> > > Wild guess on my part... could that error be the result of an attempt
> > > to map shared memory into a process at a fixed location that just
> > > happens to already be occupied by a dll that Windows had decided to
> > > relocate?
> >
> > Not that wild a guess, really :-) I'd say it's a very good possibility -
> > but I have no idea why it'd do that, since all backends load the same
> > DLLs at that stage.
>
> Not a valid assumption; you can't rely on consistent VM space among
> multiple [non-cloned] processes without a serious amount of effort.
> Anything can use that space, it's not just file views. Obviously it
> happens to work some of the time, but when it doesn't, it doesn't. I
> gather postgres depends on it being at the same address, and fixing
> that isn't trivial?
>
> If everything relevant is going through the intriguing
> internal_forkexec(), you could probably reserve address space there
> before resuming the thread. You'd want to combine this with picking
> address space that's less likely to be used before creating the shared
> memory section. (Actually, if you're doing that, you might as well
> just inject the backend variables too instead of going through the
> mapped file gymnastics.)
>
> Not a simple change, but would likely make this particular problem go
> away (assuming this is the problem). It's also the first time I've
> looked at the source, so perhaps I missed something.

I think this is accurate. When we created the Win32 native port there
was a lot of concern about how to handle shared memory in the
EXEC_BACKEND case, namely that postmaster children were not copies of
the postmaster with the same shared memory mappings, but rather new
processes that had to attach to shared memory at a fixed address.
The WIN32 solution was to create the shared memory in the parent, and
then pass that address value down to the children to use in attaching to
the existing segment. We expected all sorts of problems with this, but in
fact it seemed to work fine most of the time. It clearly doesn't work 100%
of the time, but it has worked more reliably than we expected.

What we have been waiting for is someone who can recreate a failure so we
can track down how best to make it 100% reliable, and as you can see, we
haven't had a flood of problem reports to help track this down. If you
want to help make it 100% reliable, we will work with you to find the
solution.

-- 
  Bruce Momjian  <bruce@xxxxxxxxxx>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
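
To make the failure mode and the proposed fix concrete, here is a toy model in Python. It is not PostgreSQL code and makes no real Win32 calls. The `AddressSpace` class, the `start_backend` function, and every address and size in it are hypothetical, chosen only for illustration. It models a child process in which a DLL may land inside the range the backend later needs for shared memory, and shows how reserving that range before the child runs forces the DLL to be relocated instead.

```python
class AddressSpace:
    """Toy model of one process's virtual address space (not an OS API)."""

    def __init__(self):
        self.regions = []  # (start, end, owner) tuples, end exclusive

    def is_free(self, start, size):
        end = start + size
        return all(end <= s or start >= e for s, e, _ in self.regions)

    def claim(self, start, size, owner):
        """Claim [start, start + size) if nothing overlaps it."""
        if not self.is_free(start, size):
            return False
        self.regions.append((start, start + size, owner))
        return True

    def release(self, start, owner):
        """Drop the region that begins at start and belongs to owner."""
        self.regions = [
            r for r in self.regions if not (r[0] == start and r[2] == owner)
        ]


# Hypothetical values, for illustration only.
SHMEM_ADDR = 0x10000000   # address the parent mapped shared memory at
SHMEM_SIZE = 0x02000000
DLL_SIZE = 0x00100000
RELOCATED_DLL_ADDR = 0x60000000


def start_backend(dll_addr, reserve_first):
    """Return True if the child can attach shared memory at SHMEM_ADDR."""
    child = AddressSpace()

    if reserve_first:
        # Parent reserves the range while the child is still suspended.
        child.claim(SHMEM_ADDR, SHMEM_SIZE, "reservation")

    # Child starts running: a DLL asks for dll_addr, and the loader
    # relocates it if that range is already taken.
    if not child.claim(dll_addr, DLL_SIZE, "dll"):
        child.claim(RELOCATED_DLL_ADDR, DLL_SIZE, "dll")

    if reserve_first:
        # Child gives the reservation back just before attaching.
        child.release(SHMEM_ADDR, "reservation")

    return child.claim(SHMEM_ADDR, SHMEM_SIZE, "shmem")
```

In this model, `start_backend(0x10800000, reserve_first=False)` fails to attach because the DLL already sits inside the shared memory range, while the same call with `reserve_first=True` succeeds because the DLL was pushed elsewhere. A DLL that loads well away from the range, for example at `0x50000000`, attaches fine either way, which matches the "works most of the time" behavior described above.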