On Mon, 10 Dec 2007, Jon Smirl wrote:

> On 12/10/07, Nicolas Pitre <nico@xxxxxxx> wrote:
> > On Mon, 10 Dec 2007, Jon Smirl wrote:
> >
> > > Running oprofile during my gcc repack shows this loop as the hottest
> > > place in the code by far.
> >
> > Well, that is kind of expected.
> >
> > > I added some debug printfs which show that I
> > > have a 100,000+ run of identical hash entries. Processing the 100,000
> > > entries also causes RAM consumption to explode.
> >
> > That is impossible.  If you look at the code where those hash entries
> > are created in create_delta_index(), you'll notice a hard limit of
> > HASH_LIMIT (currently 64) is imposed on the number of identical hash
> > entries.
>
> On 12/10/07, Jon Smirl <jonsmirl@xxxxxxxxx> wrote:
> > On 12/9/07, Jon Smirl <jonsmirl@xxxxxxxxx> wrote:
> > > > +	if (victim) {
> > > > +		sub_size = victim->remaining / 2;
> > > > +		list = victim->list + victim->list_size - sub_size;
> > > > +		while (sub_size && list[0]->hash &&
> > > > +		       list[0]->hash == list[-1]->hash) {
> > > > +			list++;
> > >
> > > I think you needed to copy sub_size to another variable for this loop
> >
> > Copying sub_size was wrong. I believe you are checking for deltas on
> > the same file. It's probably that chain of 103,817 deltas that can't
> > be broken up.
>
> At the end of multi-threaded repack one thread ends up with 45 minutes
> of work after all the other threads have exited. That's because it
> hits this loop and can't split the list any more.
>
> If the lists can't be over 64 identical entries, why do I get caught
> in this loop for 50,000+ iterations? If I remove this loop the threads
> are balanced right to the end.

Completely different issue.  Please read my other answers.


Nicolas
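
[Editor's sketch, for readers following the thread: a minimal illustration of
the HASH_LIMIT pruning Nicolas refers to.  This is not the actual diff-delta.c
code; the struct and function names are made up, and the real implementation
may choose which entries to keep differently.  The idea is only that when the
delta index is built, any bucket of entries sharing the same hash is trimmed
to at most HASH_LIMIT (64) entries.]

/*
 * Illustrative sketch only -- not the actual diff-delta.c code.
 */
#define HASH_LIMIT 64

struct index_entry {
	const unsigned char *ptr;	/* position of the block in the source buffer */
	struct index_entry *next;	/* next entry hashing to the same bucket */
};

/*
 * Cap one hash bucket at HASH_LIMIT entries by cutting the chain
 * after the HASH_LIMIT-th entry.  (Freeing the dropped entries is
 * omitted here for brevity.)
 */
static void trim_bucket(struct index_entry *bucket)
{
	unsigned int count = 1;

	if (!bucket)
		return;
	while (bucket->next) {
		if (++count > HASH_LIMIT) {
			bucket->next = NULL;	/* drop the rest of the chain */
			break;
		}
		bucket = bucket->next;
	}
}

[Note that this limit applies to the block hashes built inside
create_delta_index(), while the hash compared in the pack-objects splitting
loop quoted above is a different value, which is presumably why Nicolas calls
it a completely different issue.]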