Re: Make the git codebase thread-safe

On Wed, Feb 12, 2014 at 12:06 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Stefan Zager <szager@xxxxxxxxxxxx> writes:
>
>> ...  I used the Very Sleepy profiler
>> to see where all the time was spent on Windows: 55% of the time was
>> spent in OpenFile, and 25% in CloseFile (both in win32).
>
> This is somewhat interesting.
>
> When we check things out, checkout_paths() has a list of paths to be
> checked out, and iterates over them and calls checkout_entry().
>
> I wonder if you can:
>
>  - introduce a version of checkout_entry() that takes file
>    descriptors to write to;
>
>  - have asynchronous helper threads that pre-open the paths to be
>    written out and feed <ce, file descriptor to be written> to a
>    queue;
>
>  - restructure that loop so that it reads the <ce, file descriptor
>    to be written> from the queue, performs the actual writing out,
>    and then feeds <file descriptor to be closed> to another queue; and
>
>  - have another asynchronous helper thread that reads <file
>    descriptor to be closed> from the queue and closes them.
>
> Calls to write (and preparation of data to be written) will then
> remain single-threaded, but it sounds like that codepath is not the
> bottleneck in your measurement, so....

Yes, I considered that as well.  At a minimum, that would still
require attr.c to implement thread locking, since attribute files must
be parsed to look for stream filters.  I have already done that work.
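
For concreteness, the pipeline suggested in the quoted message could look
roughly like the sketch below. It is only an illustration on POSIX threads,
not git code: `struct item`, `struct queue`, `opener()`, `closer()` and
`pipelined_write()` are hypothetical names, the "entries" are plain strings,
and no attribute lookup or stream filtering is done. One helper thread
pre-opens the paths, the calling thread does all the writes, and a second
helper thread does the closes.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <string.h>
#include <unistd.h>

#define QCAP 8

/* Hypothetical work item: entry number plus the descriptor opened for it. */
struct item { int idx; int fd; };

/* Small bounded queue guarded by one mutex and two condition variables. */
struct queue {
    struct item buf[QCAP];
    int head, len;
    pthread_mutex_t mu;
    pthread_cond_t not_empty, not_full;
};

static void queue_init(struct queue *q)
{
    q->head = q->len = 0;
    pthread_mutex_init(&q->mu, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

static void queue_put(struct queue *q, struct item it)
{
    pthread_mutex_lock(&q->mu);
    while (q->len == QCAP)
        pthread_cond_wait(&q->not_full, &q->mu);
    q->buf[(q->head + q->len) % QCAP] = it;
    q->len++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->mu);
}

static struct item queue_get(struct queue *q)
{
    struct item it;
    pthread_mutex_lock(&q->mu);
    while (q->len == 0)
        pthread_cond_wait(&q->not_empty, &q->mu);
    it = q->buf[q->head];
    q->head = (q->head + 1) % QCAP;
    q->len--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->mu);
    return it;
}

struct job {
    const char **paths;
    int nr;
    struct queue opened, to_close;
};

/* Helper thread 1: pre-open every path, then send a sentinel (idx < 0). */
static void *opener(void *arg)
{
    struct job *j = arg;
    for (int i = 0; i < j->nr; i++) {
        int fd = open(j->paths[i], O_WRONLY | O_CREAT | O_TRUNC, 0666);
        queue_put(&j->opened, (struct item){ i, fd });
    }
    queue_put(&j->opened, (struct item){ -1, -1 });
    return NULL;
}

/* Helper thread 2: close descriptors until the sentinel arrives. */
static void *closer(void *arg)
{
    struct job *j = arg;
    for (;;) {
        struct item it = queue_get(&j->to_close);
        if (it.idx < 0)
            break;
        if (it.fd >= 0)
            close(it.fd);
    }
    return NULL;
}

/* Writes stay on the calling thread; returns the number of files written. */
int pipelined_write(const char **paths, const char **data, int nr)
{
    struct job j = { paths, nr };
    pthread_t t_open, t_close;
    int written = 0, err;

    queue_init(&j.opened);
    queue_init(&j.to_close);
    err = pthread_create(&t_open, NULL, opener, &j);
    assert(err == 0);
    err = pthread_create(&t_close, NULL, closer, &j);
    assert(err == 0);
    (void)err;

    for (;;) {
        struct item it = queue_get(&j.opened);
        if (it.idx < 0)
            break;
        if (it.fd >= 0) {
            size_t len = strlen(data[it.idx]);
            if (write(it.fd, data[it.idx], len) == (ssize_t)len)
                written++;
        }
        queue_put(&j.to_close, it);   /* hand the descriptor to the closer */
    }
    queue_put(&j.to_close, (struct item){ -1, -1 });
    pthread_join(t_open, NULL);
    pthread_join(t_close, NULL);
    return written;
}
```

Even in this toy form the plumbing (two queues, sentinels, two helper
threads) is larger than the loop it wraps.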

But I'm not sure it's the best long-term approach to add convoluted
custom threading solutions to each git operation as it appears on the
performance radar.  I'm hoping to make the entire code base more
thread-friendly, so that threading can be added in a more natural and
idiomatic (and less painful) way.

For example, the most natural way to add threading to checkout would
be in the loops over the index in check_updates() in unpack-trees.c.
If attr.c and sha1_file.c were thread-safe, then it would be possible
to thread checkout entirely in check_updates(), with a pretty compact
code change.  I have already done the work in attr.c; sha1_file.c is
hairier, but do-able.
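
As a rough illustration of what "thread the loop itself" means, here is a
minimal sketch on POSIX threads. Nothing in it is git's actual code:
`struct entry`, `locked_lookup()`, `update_slice()` and `threaded_update()`
are hypothetical names, and the single mutex merely stands in for whatever
locking attr.c and sha1_file.c would need internally. Each thread takes a
contiguous slice of the entries and calls the (now lock-protected) shared
helper as needed.

```c
#include <assert.h>
#include <pthread.h>

#define NR_THREADS 4

/* Hypothetical stand-in for an index entry. */
struct entry { int needs_update; int done; };

/*
 * Stand-in for shared state that is not thread-safe on its own,
 * such as a cache behind attribute lookup or object reading.
 */
static pthread_mutex_t shared_lock = PTHREAD_MUTEX_INITIALIZER;
static int shared_lookups;

static int locked_lookup(const struct entry *e)
{
    int r;
    pthread_mutex_lock(&shared_lock);
    shared_lookups++;          /* shared counter, safe only under the lock */
    r = e->needs_update;
    pthread_mutex_unlock(&shared_lock);
    return r;
}

struct slice { struct entry *entries; int begin, end, updated; };

/* Worker: process entries[begin..end) without touching other slices. */
static void *update_slice(void *arg)
{
    struct slice *s = arg;
    for (int i = s->begin; i < s->end; i++) {
        if (!locked_lookup(&s->entries[i]))
            continue;
        s->entries[i].done = 1;   /* per-entry checkout work would go here */
        s->updated++;
    }
    return NULL;
}

/* Split the entries across NR_THREADS workers; returns how many were updated. */
int threaded_update(struct entry *entries, int nr)
{
    pthread_t tid[NR_THREADS];
    struct slice sl[NR_THREADS];
    int total = 0;

    for (int t = 0; t < NR_THREADS; t++) {
        int err;
        sl[t] = (struct slice){ entries, nr * t / NR_THREADS,
                                nr * (t + 1) / NR_THREADS, 0 };
        err = pthread_create(&tid[t], NULL, update_slice, &sl[t]);
        assert(err == 0);
        (void)err;
    }
    for (int t = 0; t < NR_THREADS; t++) {
        pthread_join(tid[t], NULL);
        total += sl[t].updated;
    }
    return total;
}
```

The point of the sketch is that once the shared helpers take their own
locks, the caller only has to partition the loop; it needs no queues or
helper threads of its own.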

Stefan