Re: [PATCH v3 3/3] index-pack: support multithreaded delta resolving

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Nguyen Thai Ngoc Duy <pclouds@xxxxxxxxx> writes:

> Junio, can you make a patch (or update current ones) with better
> naming? I obviously did not see why these names were bad. You may
> provide more convincing explanation than me in the commit message.

I have already explained s/nr_processed/nr_dispatched/.

I do not think second_pass vs threaded_second_pass is merely a naming
issue. The code structure ("taking the abstraction backwards") is the
bigger issue.

The original before the series has:

	parse_pack_objects()
	{
		... a lot of code for the first pass ...
		
		... a lot of code for the second pass which looks like
		for (all object entries) {
			set global state base_obj->obj to it
			find unresolved deltas for that single base_obj
			display progress
		}
	}

and you first split one iteration of the second pass, resulting in

	parse_pack_objects()
	{
		... a lot of code for the first pass ...
		
		for (all object entries) {
			set global state base_obj->obj to it
			second_pass()
		}
	}

which I think is a wrong way to do it, because here is what your next step
ends up with because of that:

	parse_pack_objects()
	{
		... a lot of code for the first pass ...
		
	#if WE SUPPORT THREADING
		for (number of threads we are going to spawn) {
			spawn thread and let it run threaded_second_pass
		}
		for (all threads) {
			cull them when they are done
		}
		return
	#endif
		... when not threading ...
		run threaded_second_pass() in the main process
	}

It could (and should) be like this instead from the beginning, I think:

	parse_pack_objects()
	{
		... a lot of code for the first pass ...
		second_pass()
	}

to encapsulate the whole second_pass() logic in the refactored function,
whose implementation (i.e. how the objects to be processed are picked up
and worked on) will change in the threaded version.  In the first
refactoring patch, second_pass() would still iterate over object entries:

	second_pass()
	{
		for (all object entries) {
			resolve_deltas(base_obj)
		}
	}

And at this point, introduce a helper function "resolve_deltas(base)"
or something that is your "second_pass()".  This deals with only one
family of objects that use the given object as their (recursive) base
objects, and use it in the above loop.

And then multi-threading support can turn that into something like

	second_pass()
	{
	#if WE SUPPORT THREADING
		for (number of threads we are going to spawn) {
			spawn thread and let it run threaded_second_pass
		}
		for (all threads) {
			cull them when they are done
		}
		return
	#endif
		... when not threading ...
		for (all object entries) {
			resolve_deltas(base_obj)
		}
	}

to add support for threaded case.  The threaded_second_pass function does
only one thing and one thing well:

	threaded_second_pass()
	{
		set thread local
		loop forever {
			take work item under critical section
			break if there is no more work
			resolve_deltas(the work item)
		}
	}

Wouldn't the result be much more readable that way?

You may have saved a few lines by making unthreaded code call your
threaded_second_pass() and pretend that the single main process still has
to pick up a work item and work on it in a loop like you did, but I think
it made the logic unnecessarily harder to follow.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]