In regard to: Re: EPIPE, Michael Natterer said (at 8:22pm on May 8, 2000):

>Austin Donnelly wrote:
>>
>> Current gimp (1.1.21) seems to have problems with recovering from any
>> plugin that dies.  Things start going wrong when it takes a SIGPIPE
>> while trying to write(read?) to the pipe to the plugin which is dead.
>> Rather than ignoring SIGPIPE, and collecting an EPIPE from the io
>> operation and using this to trigger dead plugin cleanup operations,
>> gimp currently treats SIGPIPE just like any other signal: it's fatal.
>> Unfortunately, while attempting to print out some error message or
>> other, gimp causes a segfault.  This might be due to non-reentrant
>> stdio libraries in use, I don't know.  According to POSIX, the only
>> thing you're allowed to do is read or write variables of type
>> sig_atomic_t.  Calling libc functions (including printf()) sounds like
>> a recipe for disaster, especially with a non-reentrant libc.
>>
>> This needs some more thought, and I don't have much time right now to
>> look into any more.  I'm pretty sure that plugins were correctly
>> cleaned up on their unexpected termination at some earlier stage.  The
>> whole point of plugins being separate processes is that a plugin
>> should be unable to cause the main gimp app to crash: if they can then
>> this is a fairly critical bug that should be fixed.
>
>Hi Austin,
>
>Unfortunately this is not the reason why gimp dies on just any aborting
>child. Although I 100% agree that SIGPIPE being fatal is the wrong thing
>to do. I browsed CVS and Gimp is connecting SIGPIPE to on_signal() since
>the beginning of CVS time.
>
>Mysteriously, as you mention, it worked before. I'm also pretty sure that
>the current state of signal processing behaves exactly like before Garry
>started to fix the SIGCHLD bug.

Actually, it was Austin's diagnosis (the hard part!) and initial code, with
Garry fleshing it out and polishing it up in Austin's absence, and comments
from the peanut gallery (me and you, mainly).

The current state of signal processing does *not* behave exactly as before:
plug-in query works on alpha-dec-osf now.  Using sigaction instead of signal
should improve things across the board, on all platforms.  signal() on HP-UX
and Solaris was the SYSV signal, where the handler needed to be reset after
every signal (and there was therefore a race condition), so even though
major problems regarding signal handling were never reported on those
platforms, they were lurking.

The SIGPIPE problem is because on_signal is currently treating it as a fatal
signal (see the case in on_signal in app/main.c).  The on_signal routine
should probably be modified to not treat SIGPIPE as fatal.  That should fix
the problem Austin is seeing (that others will no doubt see too).  Someone
should investigate the handler in 1.1.19 or earlier, and see what was being
done on SIGPIPE there.

Austin is also correct that calling printf from the handler is probably
asking for trouble.  If a message must be written, some other method should
be found (write *is* ok from a signal handler, but won't using *that* be
fun...).

Tim
-- 
Tim Mooney                              mooney@xxxxxxxxxxxxxxxxxxxxxxxxx
Information Technology Services         (701) 231-1076 (Voice)
Room 242-J1, IACC Building              (701) 231-8541 (Fax)
North Dakota State University, Fargo, ND 58105-5164
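
For anyone who wants something concrete to poke at, here is a rough sketch
of the two ideas discussed above: ignoring SIGPIPE via sigaction so that a
write to a dead plug-in's pipe comes back with EPIPE, and a handler for
genuinely fatal signals that sticks to write() and _exit().  This is not
the code in app/main.c; the function names (ignore_sigpipe, fatal_handler,
install_fatal_handler, write_to_dead_plugin) are made up for illustration,
and the "dead plug-in" is simulated by closing the read end of a pipe.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Ignore SIGPIPE so that writing to a pipe with no reader fails with
 * EPIPE instead of killing the process.  Returns 0 on success, -1 on
 * error.  (Hypothetical helper, not part of GIMP.) */
int ignore_sigpipe(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = SIG_IGN;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGPIPE, &sa, NULL);
}

/* Hypothetical handler for signals that really are fatal.  It only
 * calls write() and _exit(), both of which are async-signal-safe;
 * no printf() or other stdio here. */
void fatal_handler(int signo)
{
    static const char msg[] = "fatal signal caught, exiting\n";

    (void) signo;
    if (write(STDERR_FILENO, msg, sizeof msg - 1) < 0) {
        /* Nothing safe to do about a failed write at this point. */
    }
    _exit(1);
}

/* Install fatal_handler for signo using sigaction, so the handler
 * stays installed after delivery (unlike the SYSV signal() behaviour).
 * Returns 0 on success, -1 on error. */
int install_fatal_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = fatal_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Simulate a plug-in that has died: create a pipe, close the read
 * end, then write to it.  Returns the errno from the failed call, or
 * 0 if the write unexpectedly succeeded.  Call ignore_sigpipe()
 * first, otherwise the write raises SIGPIPE. */
int write_to_dead_plugin(void)
{
    int fds[2];
    int err = 0;

    if (pipe(fds) != 0)
        return errno;
    close(fds[0]);                      /* the "plug-in" goes away */
    if (write(fds[1], "x", 1) < 0)
        err = errno;
    close(fds[1]);
    return err;
}
```

With ignore_sigpipe() called at startup, the caller can check for EPIPE
after a failed write and run the dead plug-in cleanup from normal code
rather than from inside a signal handler.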