Re: [PATCH 2/2] daemon: plug memory leak

Jeff King <peff@xxxxxxxx> · Mon, 26 Oct 2015 15:47:53 -0400

On Sat, Oct 24, 2015 at 02:23:20PM +0200, René Scharfe wrote:

> Call child_process_clear() when a child ends to release the memory
> allocated for its environment.  This is necessary because unlike all
> other users of start_command() we don't call finish_command(), which
> would have taken care of that for us.
> 
> This leak was introduced by f063d38b (daemon: use cld->env_array
> when re-spawning).

I agree this is a leak we need to address, but I find the solution a
little unsatisfactory (but tl;dr, it's probably the best we can do).

It would be nice if we could call finish_command() here to do the
wait(). But it does not know about WNOHANG, so we would have to
introduce a new helper anyway.

However, the whole check_dead_children function is accidentally
quadratic, and should probably be refactored in general. It loops over
the list of children, calling `waitpid(pid, WNOHANG)` on each. So if `n`
children connect at one time, we make `O(n^2)` waitpid calls.

I have patches to instead loop over waitpid(-1, WNOHANG) until there are
no dead children left (and use a hash to go from pid to each "struct
child").

But the reason I haven't posted them is that I'm somewhat of the opinion
that this child-list can go away altogether. The main use is in
enforcing the max_connections variable. Just enforcing that can be done
with a counter, but we do something much weirder: when we get a new
connection, we try to kill an _existing_ child by finding the first
client with multiple connections, killing that, sleeping for a second
(!), and then if that killed something, giving the slot to the new
connection.

This has a few problems:

  1. Under non-malicious heavy load, you probably want to reject the new
     request rather than killing an existing one. If you kill a fetch
     that is 99% completed, that client is likely to come back and fetch
     again, and you've just thrown away all of the effort (both CPU and
     bandwidth) put into the failed clone. You haven't spent anything on
     the incoming connection, so you'd much rather they go away and come
     back in a moment.

  2. I think the kill_some_child algorithm was meant to enforce
     fairness so that a single IP cannot monopolize all of your slots.
     I'm not sure that's realistic for a few reasons:

       a. Real bad guys will just hit you with a DDoS from many IPs.

       b. Real sites have load-balancing in front of git-daemon, and the
          client IPs may not be meaningful.

       c. The duplicate selection is pretty naive. A bad guy with tons
	  of duplicate connections is no more likely to be killed than a
	  normal client with exactly two connections (it actually
	  depends solely on when the latest connection from either came
	  in). So I suspect a determine attacker with a single IP could
	  still present a problem.

  3. Calling sleep() in the server parent process is a great way to kill
     performance. :)

Of course my view is somewhat skewed by running a really big site with
load balancers and such (and we have long set --max-connections high
enough that it has never been hit). Maybe it's more realistic protection
for a "normal" site.

But my inclination would be to drop kill_some_child entirely in favor of
simply rejecting new connections that are over the limit, and getting
rid of the child-list entirely (and calling waitpid only to reap the
zombie PIDs).

I guess we'd still need something like child_process_clear(), though. We
could just run it immediately after spawning the child, rather than when
we cleaned it up.  So in that sense this series is the first incremental
step toward that future, anyway.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html