On Mon, 10 Dec 2007, Jon Smirl wrote: > On 12/10/07, Nicolas Pitre <nico@xxxxxxx> wrote: > > > The hash in the code above has to do with the file names the > > corresponding objects are coming from. > > So can we change this loop to exit after a max of window_size * 10 or > something like that iterations? Without capping it the threads become > way unbalanced in the end. In the gcc case one thread is continuing > 30+ minutes past the others exiting. Indeed, some more tweaking are needed. The object path distribution goes like this for the gcc repo: 105557 gcc 42537 gcc/ChangeLog 25210 gcc/config 20690 gcc/testsuite 13434 gcc/testsuite/ChangeLog 12363 libstdc++-v3 9346 gcc/cp 8757 libstdc++-v3/include 8186 gcc/version.c 7867 gcc/cp/ChangeLog 7737 libstdc++-v3/include/bits 7653 libjava 6577 gcc/testsuite/gcc.dg 5942 libjava/ChangeLog 5351 gcc/config/i386 5260 gcc/testsuite/g++.dg 4451 gcc/f 4330 libstdc++-v3/ChangeLog 4321 libstdc++-v3/include/bits/c++config 4316 gcc/doc [...] So... the top entries are most certainly going to create load balancing issues if their path hash clustering isn't broken. Nicolas - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html