Junio C Hamano <gitster@xxxxxxxxx> writes: > Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx> writes: > >> @@ -696,7 +796,31 @@ static void second_pass(struct object_entry *obj) >> base_obj->obj = obj; >> base_obj->data = NULL; >> find_unresolved_deltas(base_obj); >> - display_progress(progress, nr_resolved_deltas); >> +} >> + >> +static void *threaded_second_pass(void *arg) >> +{ >> +#ifndef NO_PTHREADS >> + if (threads_active) >> + pthread_setspecific(key, arg); >> +#endif >> + for (;;) { >> + int i; >> + work_lock(); >> + display_progress(progress, nr_resolved_deltas); >> + while (nr_processed < nr_objects && >> + is_delta_type(objects[nr_processed].type)) >> + nr_processed++; >> + if (nr_processed >= nr_objects) { >> + work_unlock(); >> + break; >> + } >> + i = nr_processed++; >> + work_unlock(); >> + >> + second_pass(&objects[i]); >> + } >> + return NULL; >> } > > It may be just the matter of naming, but the above is taking the > abstraction backwards, I think. Shouldn't it be structured in such a way > that the caller may call second_pass() and its implementation may turn out > to be threaded (or not)? > > The naming of "arg" made things worse. I wasted 5 minutes scratching my > head thinking "arg" was a single specific object that was to be given to > second_pass(), and wondered why it is made into thread-local data. Name > it "thread_data" or something. > > And I think the root cause of this confusion is the way "second_pass" was > split out in the earlier patch. It is not the entire second-pass, but is > merely a single step of it (the whole "for (i = 0; i < nr_objects; i++)" > is the second-pass, in other words), and its implementation detail might > change to either thread (i.e. instead of a single line of control > iterating from 0 to nr_objects, each thread grab the next available task > and work on it until everything is exhausted) or not. > > By the way, if one object is very heavy and takes a lot of time until > completion, could it be possible that objects[0] is still being processed > for its base data but objects[1] has already completed and an available > thread could work on objects[2]? How does it learn to process objects[2] > in such a case, or does it wait until the thread working on objects[0] is > done? Please disregard the "By the way" part, except that my confusion that led to the "By the way" comment was caused by another misnaming, namely, "nr_processed". It is not counting "How many of them have we already processed?"---it merely counts "How many of them have we dispatched?" and completion of the task does not matter in this critical section, which I missed. If it were named "nr_dispatched", I wouldn't have wasted my time wondering about the loop and writing the "By the way" review comment. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html