On Wed, May 14, 2014 at 05:26:29PM +1000, Michael Ellerman wrote:

 > >  > Not sure what the correct fix is.
 > > 
 > > I think just clearing mainpid before we call exit is the right thing to
 > > do here.  I'll audit all the other exit() calls too, as this might be a
 > > problem in other paths.
 > 
 > Thanks. That fix is working for me.
 > 
 > It still exits after a minute or so, because it fails to fork a child in
 > fork_children().
 > 
 > I have 64 cpus and 16GB of RAM, so that's only 250MB per child.
 > 
 > If I reduce to 32 children then it runs much longer.
 > 
 > I wonder though, should failing to fork a child be a fatal error? Or could it
 > just skip that child and continue.

Maybe.  It could wait until another child exits before retrying,
something like the patch below.  I think I tried something like
this before though, and it resulted in a flood of failed forks.
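
To make the idea concrete outside of trinity, here is a minimal standalone
sketch of "wait for a child to exit, then retry the fork". It is not
trinity code: running_children stands in for shm->running_childs, and
reap_one_child(), fork_child_with_retry(), spawn_children() and the
max_retries cap are hypothetical names invented for illustration. The cap
is only there as one possible guard against the flood of failed forks
mentioned above.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for shm->running_childs in the patch. */
static int running_children;

/*
 * Block until one child exits and account for it.
 * Returns 0 on success, -1 if waitpid() failed (e.g. ECHILD).
 */
static int reap_one_child(void)
{
	pid_t pid;

	do {
		pid = waitpid(-1, NULL, 0);
	} while (pid == -1 && errno == EINTR);

	if (pid == -1)
		return -1;

	running_children--;
	return 0;
}

/*
 * Try to start one child. If fork() fails while other children are
 * still running, wait for one of them to exit and try again, at most
 * max_retries times. Returns the child's pid, or -1 with errno set
 * to the fork() error.
 */
static pid_t fork_child_with_retry(int max_retries)
{
	for (int attempt = 0; ; attempt++) {
		pid_t pid = fork();

		if (pid == 0) {
			/* Child: the real work would go here. */
			_exit(EXIT_SUCCESS);
		}
		if (pid > 0) {
			running_children++;
			return pid;
		}

		/* fork() failed: give up if retrying cannot help. */
		int saved_errno = errno;

		if (attempt >= max_retries || running_children == 0 ||
		    reap_one_child() == -1) {
			errno = saved_errno;
			return -1;
		}
	}
}

/*
 * Start up to `wanted` children, then reap everything that is still
 * running. Returns how many children were started.
 */
static int spawn_children(int wanted, int max_retries)
{
	int started = 0;

	for (int i = 0; i < wanted; i++) {
		if (fork_child_with_retry(max_retries) == -1) {
			fprintf(stderr, "couldn't create child! (%s)\n",
				strerror(errno));
			break;
		}
		started++;
	}

	while (running_children > 0 && reap_one_child() == 0)
		;

	return started;
}
```

Calling spawn_children(64, 3), for example, would try to start 64
children and treat a fork failure as fatal only after three waits have
not helped, or when no child is left to wait for.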

Let me know how this works out.

	Dave

diff --git a/main.c b/main.c
index f393f81ae0ba..be7108287dc9 100644
--- a/main.c
+++ b/main.c
@@ -79,6 +79,10 @@ static void fork_children(void)
 			_exit(EXIT_SUCCESS);
 		} else {
 			if (pid == -1) {
+				/* We failed, wait for a child to exit before retrying. */
+				if (shm->running_childs > 0)
+					return;
+
 				output(0, "couldn't create child! (%s)\n", strerror(errno));
 				shm->exit_reason = EXIT_FORK_FAILURE;
 				exit_main_fail();



