Hi, Emily

On Wed, Dec 16, 2020 at 7:31 PM Emily Shaffer <emilyshaffer@xxxxxxxxxx> wrote:
>
> On Wed, Nov 04, 2020 at 05:33:10PM -0300, Matheus Tavares wrote:
> >
> > Use multiple worker processes to distribute the queued entries and call
> > write_checkout_item() in parallel for them. The items are distributed
> > uniformly in contiguous chunks. This minimizes the chances of two
> > workers writing to the same directory simultaneously, which could
> > affect performance due to lock contention in the kernel. Work stealing
> > (or any other format of re-distribution) is not implemented yet.
>
> Only having done a quick skim, is there a reason that you are doing the
> workqueue handling from scratch rather than using
> run-command.h:run_processes_parallel()? The implementation you use and
> the implementation in run-command.c seem really similar to me.

TBH, I wasn't aware of run_processes_parallel(). Thanks for bringing it
to my attention! :)

Hmm, I still have to look further into it, but I think parallel checkout
wouldn't be compatible with the run_processes_parallel() framework, in
its current form.

The first obstacle is that, IIUC, the framework only allows callers to
pass information to the child processes through argv. I saw that you
already have a patch lifting this limitation [1], but the feed_pipe
callback requires a strbuf, whereas parallel checkout communicates
through pkt-lines. This is important because, otherwise, the workers
wouldn't know where each parallel_checkout_item ends. (The size of each
item is variable and, since we send binary data, it might contain
anything, even bytes normally used as delimiters, such as '\0' or '\n'.)

Another difficulty is that the framework combines the child processes'
stdout and stderr, dumping both to the foreground process' stderr.
Parallel checkout expects workers to print errors to stderr, but it
needs the stdout of each worker to get the results back.
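(For context: pkt-line solves the delimiting problem by prefixing each
packet with a 4-hex-digit length that counts the prefix itself, so a
reader always knows where a binary record ends. Here is a minimal,
simplified sketch of that idea; pkt_encode()/pkt_decode() are made-up
names for illustration, not the actual API from pkt-line.c:)

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Simplified pkt-line-style framing: each packet is "XXXX" (4 hex
 * digits giving the total packet length, prefix included) followed by
 * the raw payload. Because the length is explicit, the payload may
 * freely contain '\0', '\n', or any other byte.
 */
static size_t pkt_encode(const void *payload, size_t len, char *out)
{
	size_t total = len + 4;
	char prefix[5];

	snprintf(prefix, sizeof(prefix), "%04zx", total);
	memcpy(out, prefix, 4);        /* length prefix */
	memcpy(out + 4, payload, len); /* raw binary payload */
	return total;
}

/*
 * Decode one packet from 'in': set *payload to the payload start and
 * return the payload length.
 */
static size_t pkt_decode(const char *in, const char **payload)
{
	char hex[5] = { in[0], in[1], in[2], in[3], '\0' };
	size_t total = strtoul(hex, NULL, 16);

	*payload = in + 4;
	return total - 4;
}
```

So a 4-byte payload like {'a', '\0', '\n', 'b'} is framed as "0008"
followed by those exact bytes, and round-trips intact even though it
contains bytes that would break '\0'- or '\n'-delimited parsing.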
This is used both to get stat() data from the workers and to advance
the checkout progress bar. I see that you've also sent a patch adding
more flexibility in this area [2], but I'm not sure if parallel checkout
could use it without pkt-lines and without separating stdout from
stderr (although, of course, the latter could be implemented in the
framework as an optional feature).

Also, we might want to later improve the main-process<=>worker protocol
by adding work stealing. This might help with workload balancing while
still minimizing the chances of having multiple workers writing to the
same part of the working tree (as would happen with round-robin, for
example). I already have a rough patch for this [3], but it needs to be
timed and tested further. With work stealing, the protocol becomes a
little more complex, so it might not be suitable for the callback-style
interface of run_processes_parallel(). I'm not really sure...

I guess parallel checkout could use run_processes_parallel(), but it
seems to me that it would require a good amount of work, both on the
framework and on parallel checkout itself. I don't know if it would be
worth it.

[1]: https://lore.kernel.org/git/20201205014607.1464119-15-emilyshaffer@xxxxxxxxxx/
[2]: https://lore.kernel.org/git/20201205014607.1464119-17-emilyshaffer@xxxxxxxxxx/
[3]: https://github.com/matheustavares/git/commit/a483df570a3cdd1165354388ea3c9fcc0ec66aaf