Re: Is Git multithreaded ?

On Wed, Jun 12, 2013 at 08:28:52PM +0200, Laurent Alebarde wrote:

> I wonder if Git is multithreaded ?

A few selected operations are multi-threaded if you compile with
thread support (i.e., do not set NO_PTHREADS when you build).

> For example, during a commit, does it process the files one after
> another, or does it use a set of threads, say 10, to process 10 files
> in parallel?

Commit is not multi-threaded, for example.

> In the Git_Guide (http://wiki.sourcemage.org/Git_Guide.html), I can
> read this :
> 
> "To enable auto-detection for number of threads to use (good for
> multi-CPU or multi-core computers) for packing repositories, use:

But object packing (used during fetch/push, and during git-gc) is
multi-threaded (at least the delta compression portion of it is).

> But it does not explain much (to me). In particular, if Git is
> multithreaded and the number of workers can be configured, I
> wonder which operations use it?

There is no master list, and the set of threaded operations changes from
version to version. If you have a clone of the git source code, you can
find the places where threads are used with

  git grep NO_PTHREADS

as every threaded spot also has a single-threaded variant.

The current list is something like:

  - finding delta candidates during pack-objects (gc, server side of
    fetch, client side of push); controlled by pack.threads, which
    defaults to "number of CPUs you have"

  - resolving received objects in index-pack via fetch; controlled by
    pack.threads

  - git grep on a working tree (I do not recall the details, but I think
    grepping a commit actually ends up slower when parallel); I do not
    think there is config to control this

  - when stat()-ing files to refresh the index. This is not about
    parallel CPU performance, but about reducing latency on slow
    filesystems (e.g., NFS) by pipelining requests; controlled by
    core.preloadindex, which defaults to "false"

  - git may fork to perform certain asynchronous operations (e.g.,
    during a fetch, one process runs pack-objects to create the output,
    and the other speaks the git protocol, mostly just passing through
    the output to the client). On systems with threads, some of these
    operations are performed using a thread rather than fork. This is
    not about CPU performance, but about keeping the code simple (and
    cannot be controlled with config).
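
As a minimal sketch of the two config knobs named above (the values
are illustrative assumptions, not recommendations), a repository's
.git/config or your ~/.gitconfig could contain something like:

  [pack]
  	threads = 0
  [core]
  	preloadindex = true

Setting pack.threads to 0 asks git to auto-detect the number of CPUs,
and setting core.preloadindex to true turns on the parallel stat()
calls when refreshing the index.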

I hope that helps.

-Peff