On Fri, Apr 15, 2016 at 2:51 AM, Duy Nguyen <pclouds@xxxxxxxxx> wrote:
> On Fri, Apr 15, 2016 at 12:04:49AM +0200, Christian Couder wrote:
>> On Tue, Apr 12, 2016 at 4:59 PM, Stefan Beller <sbeller@xxxxxxxxxx> wrote:
>> > On Tue, Apr 12, 2016 at 2:42 AM, Duy Nguyen <pclouds@xxxxxxxxx> wrote:
>> >>> On Mon, Apr 11, 2016 at 7:51 AM, Stefan Beller <sbeller@xxxxxxxxxx> wrote:
>> >>>> Hi Greg,
>> >>>>
>> >>>> Thanks for your talk at the Git Merge 2016!
>> >>
>> >> Huh? It already happened?? Any interesting summary to share with us?
>>
>> There is a draft of an article about the first part of the Contributor
>> Summit in the draft of the next Git Rev News edition:
>>
>> https://github.com/git/git.github.io/blob/master/rev_news/drafts/edition-14.md
>
> Thanks. I read the sentence "This made people mention potential
> problems with parallelizing git checkout" and wondered what these
> problems were. And because it's easier to think while you test
> something, to flesh out your thoughts, I wrote the below patch, which
> does parallel checkout with multiple worker processes. I wonder if the
> same set of problems apply to it.

I mentioned it because Jonathan had brought it up a while ago.
(So I was only passing on hearsay.)

>
> The idea is simple, you offload some work to process workers. In this
> patch, only entry.c:write_entry() is moved to workers. We still do
> directory creation and all sort of checks and stat refresh in the main
> process. Some more work may be moved away, for example, the entire
> builtin/checkout.c:checkout_merged().
>
> Multi process is less efficient than multi thread model. But I doubt
> we could make object db access thread-safe soon. The last discussion
> was 2 years ago [1] and nothing much has happened.
>
> Numbers are encouraging though. On linux-2.6 repo running on linux and
> ext4 filesystem, checkout_paths() would dominate "git checkout :/".
> Unmodified git takes about 31s.

Please also benchmark "make build" or another read-heavy operation on
top of these two different checkouts. IIRC that was the problem: the
checkout itself improved, but due to the resulting file ordering on the
file system, the operation afterwards slowed down so much that it
became a net negative.

>
>
> 16:26:00.114029 builtin/checkout.c:1299 performance: 31.184973659 s: checkout_paths
> 16:26:00.114225 trace.c:420 performance: 31.256412935 s: git command: 'git' 'checkout' '.'
>
> When doing write_entry() on 8 processes, it takes 22s (shortened by ~30%)
>
> 16:27:39.973730 trace.c:420 performance: 5.610255442 s: git command: 'git' 'checkout-index' '--worker'
> 16:27:40.956812 trace.c:420 performance: 6.595082013 s: git command: 'git' 'checkout-index' '--worker'
> 16:27:41.397621 trace.c:420 performance: 7.032024175 s: git command: 'git' 'checkout-index' '--worker'
> 16:27:47.453999 trace.c:420 performance: 13.078537207 s: git command: 'git' 'checkout-index' '--worker'
> 16:27:48.986433 trace.c:420 performance: 14.612951643 s: git command: 'git' 'checkout-index' '--worker'
> 16:27:53.149378 trace.c:420 performance: 18.781762536 s: git command: 'git' 'checkout-index' '--worker'
> 16:27:54.884044 trace.c:420 performance: 20.514473730 s: git command: 'git' 'checkout-index' '--worker'
> 16:27:55.319990 trace.c:420 performance: 20.948326263 s: git command: 'git' 'checkout-index' '--worker'
> 16:27:55.863211 builtin/checkout.c:1299 performance: 22.723118420 s: checkout_paths
> 16:27:55.863398 trace.c:420 performance: 22.854547640 s: git command: 'git' 'checkout' '--parallel' '.'
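
The spread between your fastest and slowest worker above (~5.6s vs.
~21s) is quite large; I guess that is the "naive work distribution" you
mention below. If I read setup_workers() in the patch right, each worker
gets one contiguous chunk of the path-sorted array, so a single worker
can end up with a whole subtree of big files. Purely as an untested
illustration (reusing the structs from your patch, not a polished
suggestion), dealing the items out round-robin instead of in contiguous
slices might even that out a bit:

static void distribute_items_round_robin(struct parallel_checkout *pc)
{
	int i;

	/* walk backwards so each per-worker list stays in path order */
	for (i = pc->nr_items - 1; i >= 0; i--) {
		struct checkout_worker *worker = &pc->workers[i % pc->nr_workers];
		struct checkout_item *ci = &pc->items[i];

		ci->next = worker->to_send;
		worker->to_send = ci;
	}
	for (i = 0; i < pc->nr_workers; i++)
		pc->workers[i].to_complete = pc->workers[i].to_send;
}

Completions are still consumed per worker in the order they were sent,
so receive_from_worker() should not need to change.
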
>
> I suspect on nfs or windows, the gain may be higher due to IO blocking
> the main process more.
>
> Note that this for-fun patch is not optmized at all (and definitely
> not portable). I could have sent a group of paths to the worker in a
> single system call instead of one per call. The trace above also shows
> unbalance issues with workers, where some workers exit early because
> of my naive work distribution. Numbers could get a bit better.
>
> [1] http://thread.gmane.org/gmane.comp.version-control.git/241965/focus=242020

Would it make sense to use the parallel processing infrastructure from
run-command.h instead of doing all the setup and teardown yourself?
(As you call it a for-fun patch, I assume the answer is: writing code
is more fun than using other people's code. ;)
A very rough sketch of what I mean is at the bottom of this mail.

>
> -- 8< --
> diff --git a/builtin/checkout-index.c b/builtin/checkout-index.c
> index 92c6967..7163216 100644
> --- a/builtin/checkout-index.c
> +++ b/builtin/checkout-index.c
> @@ -9,6 +9,7 @@
>  #include "quote.h"
>  #include "cache-tree.h"
>  #include "parse-options.h"
> +#include "entry.h"
>
>  #define CHECKOUT_ALL 4
>  static int nul_term_line;
> @@ -179,6 +180,9 @@ int cmd_checkout_index(int argc, const char **argv, const char *prefix)
>  		OPT_END()
>  	};
>
> +	if (argc == 2 && !strcmp(argv[1], "--worker"))
> +		return parallel_checkout_worker();
> +
>  	if (argc == 2 && !strcmp(argv[1], "-h"))
>  		usage_with_options(builtin_checkout_index_usage,
>  				   builtin_checkout_index_options);
> diff --git a/builtin/checkout.c b/builtin/checkout.c
> index efcbd8f..51caad2 100644
> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c
> @@ -20,6 +20,7 @@
>  #include "resolve-undo.h"
>  #include "submodule-config.h"
>  #include "submodule.h"
> +#include "entry.h"
>
>  static const char * const checkout_usage[] = {
>  	N_("git checkout [<options>] <branch>"),
> @@ -236,7 +237,8 @@ static int checkout_merged(int pos, struct checkout *state)
>  }
>
>  static int checkout_paths(const struct checkout_opts *opts,
> -			  const char *revision)
> +			  const char *revision,
> +			  int parallel)
>  {
>  	int pos;
>  	struct checkout state;
> @@ -357,6 +359,8 @@ static int checkout_paths(const struct checkout_opts *opts,
>  	state.force = 1;
>  	state.refresh_cache = 1;
>  	state.istate = &the_index;
> +	if (parallel)
> +		start_parallel_checkout(&state);
>  	for (pos = 0; pos < active_nr; pos++) {
>  		struct cache_entry *ce = active_cache[pos];
>  		if (ce->ce_flags & CE_MATCHED) {
> @@ -367,11 +371,18 @@ static int checkout_paths(const struct checkout_opts *opts,
>  			if (opts->writeout_stage)
>  				errs |= checkout_stage(opts->writeout_stage, ce, pos, &state);
>  			else if (opts->merge)
> +				/*
> +				 * XXX: in parallel mode, we may want
> +				 * to let worker perform the merging
> +				 * instead and send SHA-1 result back
> +				 */
>  				errs |= checkout_merged(pos, &state);
>  			pos = skip_same_name(ce, pos) - 1;
>  		}
>  	}
>
> +	errs |= run_parallel_checkout();
> +
>  	if (write_locked_index(&the_index, lock_file, COMMIT_LOCK))
>  		die(_("unable to write new index file"));
>
> @@ -1132,6 +1143,7 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
>  	struct branch_info new;
>  	char *conflict_style = NULL;
>  	int dwim_new_local_branch = 1;
> +	int parallel = 0;
>  	struct option options[] = {
>  		OPT__QUIET(&opts.quiet, N_("suppress progress reporting")),
>  		OPT_STRING('b', NULL, &opts.new_branch, N_("branch"),
> @@ -1159,6 +1171,8 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
>  			 N_("second guess 'git checkout <no-such-branch>'")),
>  		OPT_BOOL(0, "ignore-other-worktrees", &opts.ignore_other_worktrees,
>  			 N_("do not check if another worktree is holding the given ref")),
> +		OPT_BOOL(0, "parallel", &parallel,
> +			 N_("parallel checkout")),
>  		OPT_BOOL(0, "progress", &opts.show_progress, N_("force progress reporting")),
>  		OPT_END(),
>  	};
> @@ -1279,8 +1293,12 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
>  		strbuf_release(&buf);
>  	}
>
> -	if (opts.patch_mode || opts.pathspec.nr)
> -		return checkout_paths(&opts, new.name);
> +	if (opts.patch_mode || opts.pathspec.nr) {
> +		uint64_t start = getnanotime();
> +		int ret = checkout_paths(&opts, new.name, parallel);
> +		trace_performance_since(start, "checkout_paths");
> +		return ret;
> +	}
>  	else
>  		return checkout_branch(&opts, &new);
>  }
> diff --git a/entry.c b/entry.c
> index a410957..5e0eb1c 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -3,6 +3,36 @@
>  #include "dir.h"
>  #include "streaming.h"
>
> +#include <sys/epoll.h>
> +#include "pkt-line.h"
> +#include "argv-array.h"
> +#include "run-command.h"
> +
> +struct checkout_item {
> +	struct cache_entry *ce;
> +	struct checkout_item *next;
> +};
> +
> +struct checkout_worker {
> +	struct child_process cp;
> +	struct checkout_item *to_complete;
> +	struct checkout_item *to_send;
> +};
> +
> +struct parallel_checkout {
> +	struct checkout state;
> +	struct checkout_worker *workers;
> +	struct checkout_item *items;
> +	int nr_items, alloc_items;
> +	int nr_workers;
> +};
> +
> +static struct parallel_checkout *parallel_checkout;
> +
> +static int queue_checkout(struct parallel_checkout *,
> +			  const struct checkout *,
> +			  struct cache_entry *);
> +
>  static void create_directories(const char *path, int path_len,
>  			       const struct checkout *state)
>  {
> @@ -290,5 +320,299 @@ int checkout_entry(struct cache_entry *ce,
>  		return 0;
>
>  	create_directories(path.buf, path.len, state);
> +
> +	if (!queue_checkout(parallel_checkout, state, ce))
> +		/*
> +		 * write_entry() will be done by parallel_checkout_worker() in
> +		 * a separate process
> +		 */
> +		return 0;
> +
>  	return write_entry(ce, path.buf, state, 0);
>  }
> +
> +int start_parallel_checkout(const struct checkout *state)
> +{
> +	if (parallel_checkout)
> +		die("BUG: parallel checkout already initiated");
> +	if (0 && state->force)
> +		die("BUG: not support --force yet");
> +	parallel_checkout = xmalloc(sizeof(*parallel_checkout));
> +	memset(parallel_checkout, 0, sizeof(*parallel_checkout));
> +	memcpy(&parallel_checkout->state, state, sizeof(*state));
> +
> +	return 0;
> +}
> +
> +static int queue_checkout(struct parallel_checkout *pc,
> +			  const struct checkout *state,
> +			  struct cache_entry *ce)
> +{
> +	struct checkout_item *ci;
> +
> +	if (!pc ||
> +	    !S_ISREG(ce->ce_mode) ||
> +	    memcmp(&pc->state, state, sizeof(*state)))
> +		return -1;
> +
> +	ALLOC_GROW(pc->items, pc->nr_items + 1, pc->alloc_items);
> +	ci = pc->items + pc->nr_items++;
> +	ci->ce = ce;
> +	return 0;
> +}
> +
> +static int item_cmp(const void *a_, const void *b_)
> +{
> +	const struct checkout_item *a = a_;
> +	const struct checkout_item *b = b_;
> +	return strcmp(a->ce->name, b->ce->name);
> +}
> +
> +static int setup_workers(struct parallel_checkout *pc, int epoll_fd)
> +{
> +	int from, nr_per_worker, i;
> +
> +	pc->workers = xmalloc(sizeof(*pc->workers) * pc->nr_workers);
> +	memset(pc->workers, 0, sizeof(*pc->workers) * pc->nr_workers);
> +
> +	nr_per_worker = pc->nr_items / pc->nr_workers;
> +	from = 0;
> +
> +	for (i = 0; i < pc->nr_workers; i++) {
> +		struct checkout_worker *worker = pc->workers + i;
> +		struct child_process *cp = &worker->cp;
> +		struct checkout_item *item;
> +		struct epoll_event ev;
> +		int to;
> +
> +		to = from + nr_per_worker;
> +		if (i == pc->nr_workers - 1)
> +			to = pc->nr_items;
> +		item = NULL;
> +		while (from < to) {
> +			pc->items[from].next = item;
> +			item = pc->items + from;
> +			from++;
> +		}
> +		worker->to_send = item;
> +		worker->to_complete = item;
> +
> +		cp->git_cmd = 1;
> +		cp->in = -1;
> +		cp->out = -1;
> +		argv_array_push(&cp->args, "checkout-index");
> +		argv_array_push(&cp->args, "--worker");
> +		if (start_command(cp))
> +			die(_("failed to run checkout worker"));
> +
> +		ev.events = EPOLLOUT | EPOLLERR | EPOLLHUP;
> +		ev.data.u32 = i * 2;
> +		if (epoll_ctl(epoll_fd, EPOLL_CTL_ADD, cp->in, &ev) == -1)
> +			die_errno("epoll_ctl");
> +
> +		ev.events = EPOLLIN | EPOLLERR | EPOLLHUP;
> +		ev.data.u32 = i * 2 + 1;
> +		if (epoll_ctl(epoll_fd, EPOLL_CTL_ADD, cp->out, &ev) == -1)
> +			die_errno("epoll_ctl");
> +	}
> +	return 0;
> +}
> +
> +static int send_to_worker(struct checkout_worker *worker, int epoll_fd)
> +{
> +	if (!worker->to_send) {
> +		struct epoll_event ev;
> +
> +		packet_flush(worker->cp.in);
> +
> +		epoll_ctl(epoll_fd, EPOLL_CTL_DEL, worker->cp.in, &ev);
> +		close(worker->cp.in);
> +		worker->cp.in = -1;
> +		return 0;
> +	}
> +
> +	/*
> +	 * XXX: put the fd in non-blocking mode and send as many files
> +	 * as possible in one go.
> +	 */
> +	packet_write(worker->cp.in, "%s %s",
> +		     sha1_to_hex(worker->to_send->ce->sha1),
> +		     worker->to_send->ce->name);
> +	worker->to_send = worker->to_send->next;
> +	return 0;
> +}
> +
> +int parallel_checkout_worker(void)
> +{
> +	struct checkout state;
> +	struct cache_entry *ce = NULL;
> +
> +	memset(&state, 0, sizeof(state));
> +	/* FIXME: pass 'force' over */
> +	for (;;) {
> +		int len;
> +		unsigned char sha1[20];
> +		char *line = packet_read_line(0, &len);
> +
> +		if (!line)
> +			return 0;
> +
> +		if (len < 40)
> +			return 1;
> +		if (get_sha1_hex(line, sha1))
> +			return 1;
> +		line += 40;
> +		len -= 40;
> +		if (*line != ' ')
> +			return 1;
> +		line++;
> +		len--;
> +		if (!ce || ce_namelen(ce) < len) {
> +			free(ce);
> +			ce = xcalloc(1, cache_entry_size(len));
> +			ce->ce_mode = S_IFREG | ce_permissions(0644);
> +		}
> +		ce->ce_namelen = len;
> +		hashcpy(ce->sha1, sha1);
> +		memcpy(ce->name, line, len + 1);
> +
> +		if (write_entry(ce, ce->name, &state, 0))
> +			return 1;
> +		/*
> +		 * XXX process in batch and send bigger number of
> +		 * checked out entries back
> +		 */
> +		packet_write(1, "1");
> +	}
> +}
> +
> +static int receive_from_worker(struct checkout_worker *worker,
> +			       int refresh_cache)
> +{
> +	int len, val;
> +	char *line;
> +
> +	line = packet_read_line(worker->cp.out, &len);
> +	val = atoi(line);
> +	if (val <= 0)
> +		die("BUG: invalid value");
> +	while (val && worker->to_complete &&
> +	       worker->to_complete != worker->to_send) {
> +		if (refresh_cache) {
> +			struct stat st;
> +			struct cache_entry *ce = worker->to_complete->ce;
> +
> +			lstat(ce->name, &st);
> +			fill_stat_cache_info(ce, &st);
> +			ce->ce_flags |= CE_UPDATE_IN_BASE;
> +		}
> +		worker->to_complete = worker->to_complete->next;
> +		val--;
> +	}
> +	if (val)
> +		die("BUG: invalid value");
> +	return 0;
> +}
> +
> +static int finish_worker(struct checkout_worker *worker, int epoll_fd)
> +{
> +	struct epoll_event ev;
> +	char buf[1];
> +	int ret;
> +
> +	assert(worker->to_send == NULL);
> +	assert(worker->to_complete == NULL);
> +
> +	ret = xread(worker->cp.out, buf, sizeof(buf));
> +	if (ret != 0)
> +		die("BUG: expect eof");
> +	epoll_ctl(epoll_fd, EPOLL_CTL_DEL, worker->cp.out, &ev);
> +	close(worker->cp.out);
> +	worker->cp.out = -1;
> +	if (finish_command(&worker->cp))
> +		die("worker had a problem");
> +	return 0;
> +}
> +
> +static int really_finished(struct parallel_checkout *pc)
> +{
> +	int i;
> +
> +	for (i = 0; i < pc->nr_workers; i++)
> +		if (pc->workers[i].to_complete)
> +			return 0;
> +	return 1;
> +}
> +
> +/* XXX progress support for unpack-trees */
> +int run_parallel_checkout(void)
> +{
> +	struct parallel_checkout *pc = parallel_checkout;
> +	int ret, i;
> +	int epoll_fd;
> +	struct epoll_event *events;
> +
> +	if (!pc || !pc->nr_items) {
> +		free(pc);
> +		parallel_checkout = NULL;
> +		return 0;
> +	}
> +
> +	qsort(pc->items, pc->nr_items, sizeof(*pc->items), item_cmp);
> +	pc->nr_workers = 8;
> +	epoll_fd = epoll_create(pc->nr_workers * 2);
> +	if (epoll_fd == -1)
> +		die_errno("epoll_create");
> +	ret = setup_workers(pc, epoll_fd);
> +	events = xmalloc(sizeof(*events) * pc->nr_workers * 2);
> +
> +	ret = 0;
> +	while (!ret) {
> +		int maybe_all_done = 0, nr;
> +
> +		nr = epoll_wait(epoll_fd, events, pc->nr_workers * 2, -1);
> +		if (nr == -1 && errno == EINTR)
> +			continue;
> +		if (nr == -1) {
> +			ret = nr;
> +			break;
> +		}
> +		for (i = 0; i < nr; i++) {
> +			int is_in = events[i].data.u32 & 1;
> +			int worker_id = events[i].data.u32 / 2;
> +			struct checkout_worker *worker = pc->workers + worker_id;
> +
> +			if (!is_in && (events[i].events & EPOLLOUT))
> +				ret = send_to_worker(worker, epoll_fd);
> +			else if (events[i].events & EPOLLIN) {
> +				if (worker->to_complete) {
> +					int refresh = pc->state.refresh_cache;
> +					ret = receive_from_worker(worker, refresh);
> +					pc->state.istate->cache_changed |= CE_ENTRY_CHANGED;
> +				} else {
> +					ret = finish_worker(worker, epoll_fd);
> +					maybe_all_done = 1;
> +				}
> +			} else if (events[i].events & (EPOLLERR | EPOLLHUP)) {
> +				if (is_in && !worker->to_complete) {
> +					ret = finish_worker(worker, epoll_fd);
> +					maybe_all_done = 1;
> +				} else
> +					ret = -1;
> +			} else
> +				die("BUG: what??");
> +			if (ret)
> +				break;
> +		}
> +
> +		if (maybe_all_done && really_finished(pc))
> +			break;
> +	}
> +
> +	close(epoll_fd);
> +	free(pc->workers);
> +	free(events);
> +	free(pc);
> +	parallel_checkout = NULL;
> +	return ret;
> +}
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 9f55cc2..433c54e 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -220,6 +220,7 @@ static int check_updates(struct unpack_trees_options *o)
>  	remove_marked_cache_entries(&o->result);
>  	remove_scheduled_dirs();
>
> +	/* start_parallel_checkout() */
>  	for (i = 0; i < index->cache_nr; i++) {
>  		struct cache_entry *ce = index->cache[i];
>
> @@ -234,6 +235,7 @@ static int check_updates(struct unpack_trees_options *o)
>  			}
>  		}
>  	}
> +	/* run_parallel_checkout() */
>  	stop_progress(&progress);
>  	if (o->update)
>  		git_attr_set_direction(GIT_ATTR_CHECKIN, NULL);
> -- 8< --
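
Here is the very rough sketch I meant above regarding run-command.h.
It is untested and written from memory, so the callback signatures may
be slightly off, and "--worker-slice" plus struct worker_driver are
made up for illustration: the idea would be that each worker is told
which range of the sorted items to check out and looks the entries up
in the index itself, instead of having the paths streamed over its
stdin. The per-entry result reporting and the stat refresh in the main
process would need a different channel anyway, because
run_processes_parallel() buffers the children's output; what you would
get for free is the spawning, waiting and error handling:

#include "cache.h"
#include "run-command.h"
#include "argv-array.h"

/*
 * Hypothetical driver for the checkout workers on top of
 * run_processes_parallel().  "struct parallel_checkout" is the one from
 * your patch; "--worker-slice" and struct worker_driver do not exist,
 * they only illustrate how tasks could be handed out.
 */
struct worker_driver {
	struct parallel_checkout *pc;
	int next_slice;		/* index of the next worker to start */
	int slice_size;		/* number of items per worker */
	int errs;
};

static int next_worker(struct child_process *cp, struct strbuf *err,
		       void *pp_cb, void **pp_task_cb)
{
	struct worker_driver *wd = pp_cb;
	int from = wd->next_slice * wd->slice_size;

	if (from >= wd->pc->nr_items)
		return 0;	/* no more tasks to hand out */

	cp->git_cmd = 1;
	argv_array_push(&cp->args, "checkout-index");
	/* made-up option; the worker would clamp the range itself */
	argv_array_pushf(&cp->args, "--worker-slice=%d,%d",
			 from, wd->slice_size);
	wd->next_slice++;
	*pp_task_cb = NULL;	/* nothing per-task to remember */
	return 1;		/* one more task scheduled */
}

static int worker_finished(int result, struct strbuf *err,
			   void *pp_cb, void *pp_task_cb)
{
	struct worker_driver *wd = pp_cb;

	if (result)
		wd->errs = 1;	/* remember the failure ... */
	return 0;		/* ... but let the other workers finish */
}

static int run_workers(struct parallel_checkout *pc, int nr_workers)
{
	struct worker_driver wd;

	memset(&wd, 0, sizeof(wd));
	wd.pc = pc;
	wd.slice_size = (pc->nr_items + nr_workers - 1) / nr_workers;

	run_processes_parallel(nr_workers, next_worker,
			       NULL,	/* default start_failure handler */
			       worker_finished, &wd);
	return wd.errs;
}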