Alan,

I tried as you suggested. I believe the gdb debugger is giving a false indication about threads. Whether I attach to a newly launched backend or to a backend that has been executing the suspect plperlu function, the "info threads" result is two threads. Suspiciously, both entries show the same thread ID, the same LWP, and the same location, e.g.:

  * 2 Thread 802c06400 (LWP 101353) 0x000000080bfa50a3 in Perl_fbm_instr () from /usr/local/lib/perl5/5.18/mach/CORE/libperl.so.5.18
  * 1 Thread 802c06400 (LWP 101353) 0x000000080bfa50a3 in Perl_fbm_instr () from /usr/local/lib/perl5/5.18/mach/CORE/libperl.so.5.18

That seemed odd to me. If we use 'top' or 'ps axuwwH' to get a thread count for a given process, the indication is only one thread in the same situations.

I am now pursuing a different causal hypothesis. There are instances of another segmentation fault that do not involve this Perl function. Instead, they occur in a function that is also called regularly, even on a basically idle system. It is therefore perhaps happenstance which kind of fault occurs.

I believe this may relate to our update process. Product developers frequently (daily) update environments and packages while postgres, and possibly our application, is running. I suspect this update process is not properly coordinating with a running postgres and may result in occasional shared library issues. This is consistent with the fact that our production testers, who update at a much lower frequency, almost never see this segmentation fault even though they use the same update script.

I'll attempt some script changes and meanwhile ask the developers to make observations that would support this idea. I'll update the thread with future observations and the outcome, possibly changing the subject to "careless developers cause segmentation fault".

Thanks for your assistance on this matter.
Dave

From: Alex Hunsaker [mailto:badalex@xxxxxxxxx]

On Thu, Jan 29, 2015 at 1:54 PM, Day, David <dday@xxxxxxxxxx> wrote:

Thanks for the inputs, I'll attempt to apply it and will update when I have some new information.

BTW a quick check would be to attach with gdb right after you connect, check info threads (there should be none), run the plperlu procedure (with the right arguments/calls to hit all the execution paths), and check info threads again. If info threads now reports a thread, we are likely looking at the right plperlu code. It should then just be a matter of commenting stuff out to deduce what creates the thread. If not, it could be that plperlu is not at fault and it's something else, like an extension or some other procedure/PL.
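
When comparing "info threads" output before and after running the procedure, it may help to count distinct threads rather than rows, since gdb can list the same thread twice (same thread ID and same LWP). The following is only a minimal sketch in Python, assuming the row layout shown in the sample output in this thread; other gdb versions or platforms may format rows differently. The function name distinct_threads and the regular expression are illustrative and not part of gdb.

```python
import re

# Assumed row layout, taken from the sample in this thread:
#   * 2 Thread 802c06400 (LWP 101353) 0x... in Perl_fbm_instr () from ...
# Other gdb versions may print rows differently.
THREAD_ROW = re.compile(r"^\s*\*?\s*(\d+)\s+Thread\s+(\S+)\s+\(LWP\s+(\d+)\)")


def distinct_threads(info_threads_output):
    """Return the set of (thread id, LWP) pairs found in the pasted text."""
    found = set()
    for line in info_threads_output.splitlines():
        match = THREAD_ROW.match(line)
        if match:
            # Rows sharing both thread id and LWP describe the same thread.
            found.add((match.group(2), int(match.group(3))))
    return found


sample = """\
* 2 Thread 802c06400 (LWP 101353) 0x000000080bfa50a3 in Perl_fbm_instr () from /usr/local/lib/perl5/5.18/mach/CORE/libperl.so.5.18
* 1 Thread 802c06400 (LWP 101353) 0x000000080bfa50a3 in Perl_fbm_instr () from /usr/local/lib/perl5/5.18/mach/CORE/libperl.so.5.18
"""

# Both rows share one thread id and one LWP, so this counts a single thread.
print(len(distinct_threads(sample)))
```

If the count stays the same before and after the plperlu call, that would suggest the procedure did not create a new thread, which would match what 'ps axuwwH' reports.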