Re: Resolving deltas dominates clone time

Jeff King <peff@xxxxxxxx> · Tue, 30 Apr 2019 16:33:53 -0400

On Tue, Apr 30, 2019 at 08:48:08PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > So I'd say the right answer is probably either online_cpus() or half
> > that. The latter would be more appropriate for the machines I have, but
> > I'd worry that it would leave performance on the table for non-intel
> > machines.
> 
> It would be a nice #leftoverbits project to do this dynamically at
> runtime, i.e. hook up the throughput code in progress.c to some new
> utility functions where the current code using pthreads would
> occasionally stop and try to find some (local) maximum throughput given
> N threads.
> 
> You could then dynamically save that optimum for next time, or adjust
> threading at runtime every X seconds, e.g. on a server with N=24 cores
> you might want 24 threads if you have one index-pack, but if you have 24
> index-packs you probably don't want each with 24 threads, for a total of
> 576.

Yeah, I touched on that in my response to Martin. I think that would be
nice, but it's complicated enough that I don't think it qualifies as a
#leftoverbits project. I'm also not sure how hard it is to change the
number of threads after initialization.

IIRC, it's a worker pool where each thread just asks for more work when
it finishes an item. So that's probably the right moment to ask not just
"is there more work to do" but also "does it seem like there's an idle
slot on the system for our thread to take".
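
To make that concrete, here is a rough sketch of the idea in Python rather
than Git's C. Everything in it is an assumption for illustration: the names
run_pool() and system_has_idle_slot() are made up, and comparing the
1-minute load average against the CPU count is just one possible heuristic,
not anything index-pack does today.

```python
import os
import queue
import threading


def system_has_idle_slot(load_probe=None, cpus=None):
    """Hypothetical heuristic: there is an idle slot if the 1-minute
    load average is below the number of CPUs.

    os.getloadavg() is Unix-only; pass load_probe to substitute another
    measurement (or a fake one in tests).
    """
    probe = load_probe or (lambda: os.getloadavg()[0])
    ncpus = cpus or os.cpu_count() or 1
    return probe() < ncpus


def run_pool(items, work, max_threads, has_idle_slot=system_has_idle_slot):
    """Process every item with up to max_threads workers.

    Each worker re-checks has_idle_slot() at the moment it would ask for
    more work, and retires if the system looks busy. One worker is always
    kept, so the queue is drained no matter what the check says.
    """
    todo = queue.Queue()
    for item in items:
        todo.put(item)

    results = []
    lock = threading.Lock()
    active = [max_threads]  # number of workers that have not retired

    def worker():
        while True:
            with lock:
                # "Is there an idle slot for our thread to take?"
                if active[0] > 1 and not has_idle_slot():
                    active[0] -= 1
                    return
            # "Is there more work to do?"
            try:
                item = todo.get_nowait()
            except queue.Empty:
                return
            out = work(item)
            with lock:
                results.append(out)

    threads = [threading.Thread(target=worker) for _ in range(max_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Note that this only ever shrinks the pool. Growing it again when the system
goes idle would need someone to spawn threads later, which is the part I'm
unsure about above.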

-Peff