Re: [PATCH v3 3/3] index-pack: support multithreaded delta resolving

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Nguyễn Thái Ngọc Duy  <pclouds@xxxxxxxxx> writes:

> @@ -696,7 +796,31 @@ static void second_pass(struct object_entry *obj)
>  	base_obj->obj = obj;
>  	base_obj->data = NULL;
>  	find_unresolved_deltas(base_obj);
> -	display_progress(progress, nr_resolved_deltas);
> +}
> +
> +static void *threaded_second_pass(void *arg)
> +{
> +#ifndef NO_PTHREADS
> +	if (threads_active)
> +		pthread_setspecific(key, arg);
> +#endif
> +	for (;;) {
> +		int i;
> +		work_lock();
> +		display_progress(progress, nr_resolved_deltas);
> +		while (nr_processed < nr_objects &&
> +		       is_delta_type(objects[nr_processed].type))
> +			nr_processed++;
> +		if (nr_processed >= nr_objects) {
> +			work_unlock();
> +			break;
> +		}
> +		i = nr_processed++;
> +		work_unlock();
> +
> +		second_pass(&objects[i]);
> +	}
> +	return NULL;
>  }

It may be just the matter of naming, but the above is taking the
abstraction backwards, I think.  Shouldn't it be structured in such a way
that the caller may call second_pass() and its implementation may turn out
to be threaded (or not)?

The naming of "arg" made things worse.  I wasted 5 minutes scratching my
head thinking "arg" was a single specific object that was to be given to
second_pass(), and wondered why it is made into thread-local data.  Name
it "thread_data" or something.

And I think the root cause of this confusion is the way "second_pass" was
split out in the earlier patch.  It is not the entire second-pass, but is
merely a single step of it (the whole "for (i = 0; i < nr_objects; i++)"
is the second-pass, in other words), and its implementation detail might
change to either thread (i.e. instead of a single line of control
iterating from 0 to nr_objects, each thread grab the next available task
and work on it until everything is exhausted) or not.

By the way, if one object is very heavy and takes a lot of time until
completion, could it be possible that objects[0] is still being processed
for its base data but objects[1] has already completed and an available
thread could work on objects[2]?  How does it learn to process objects[2]
in such a case, or does it wait until the thread working on objects[0] is
done?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]