Hi, Emily

On Wed, Dec 16, 2020 at 7:31 PM Emily Shaffer <emilyshaffer@xxxxxxxxxx> wrote:
>
> On Wed, Nov 04, 2020 at 05:33:10PM -0300, Matheus Tavares wrote:
> >
> > Use multiple worker processes to distribute the queued entries and call
> > write_checkout_item() in parallel for them. The items are distributed
> > uniformly in contiguous chunks. This minimizes the chances of two
> > workers writing to the same directory simultaneously, which could
> > affect performance due to lock contention in the kernel. Work stealing
> > (or any other format of re-distribution) is not implemented yet.
>
> Only having done a quick skim, is there a reason that you are doing the
> workqueue handling from scratch rather than using
> run-command.h:run_processes_parallel()? The implementation you use and
> the implementation in run-command.c seem really similar to me.

TBH, I wasn't aware of run_processes_parallel(). Thanks for bringing it
to my attention! :)

Hmm, I still have to look further into it, but I think parallel checkout
wouldn't be compatible with the run_processes_parallel() framework, in
its current form.

The first obstacle is that, IIUC, the framework only allows callers to
pass information to the child processes through argv. I saw that you
already have a patch lifting this limitation [1], but the feed_pipe
callback requires a strbuf, whereas parallel checkout communicates
through pkt-lines. This is important because, otherwise, the workers
wouldn't know where each parallel_checkout_item ends. (The size of each
item is variable and, since we send binary data, it might contain
anything, even bytes normally used as delimiters, such as '\0' or '\n'.)

Another difficulty is that the framework combines the child processes'
stdout and stderr, dumping both to the foreground process' stderr.
Parallel checkout expects workers to print errors to stderr, but it
needs the stdout of each worker to get the results back.
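(For context: pkt-line solves the delimiting problem by prefixing each
packet with a 4-hex-digit length that counts the prefix itself, so a
reader always knows where a binary record ends. Here is a minimal,
simplified sketch of that idea; pkt_encode()/pkt_decode() are made-up
names for illustration, not the actual API from pkt-line.c:)

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Simplified pkt-line-style framing: each packet is "XXXX" (4 hex
 * digits giving the total packet length, prefix included) followed by
 * the raw payload. Because the length is explicit, the payload may
 * freely contain '\0', '\n', or any other byte.
 */
static size_t pkt_encode(const void *payload, size_t len, char *out)
{
	size_t total = len + 4;
	char prefix[5];

	snprintf(prefix, sizeof(prefix), "%04zx", total);
	memcpy(out, prefix, 4);        /* length prefix */
	memcpy(out + 4, payload, len); /* raw binary payload */
	return total;
}

/*
 * Decode one packet from 'in': set *payload to the payload start and
 * return the payload length.
 */
static size_t pkt_decode(const char *in, const char **payload)
{
	char hex[5] = { in[0], in[1], in[2], in[3], '\0' };
	size_t total = strtoul(hex, NULL, 16);

	*payload = in + 4;
	return total - 4;
}
```

So a 4-byte payload like {'a', '\0', '\n', 'b'} is framed as "0008"
followed by those exact bytes, and round-trips intact even though it
contains bytes that would break '\0'- or '\n'-delimited parsing.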
This is used both to get stat() data from the workers and to advance
the checkout progress bar. I see that you've also sent a patch adding
more flexibility in this area [2], but I'm not sure if parallel checkout
could use it without pkt-lines and without separating stdout from
stderr (although, of course, the latter could be implemented in the
framework as an optional feature).

Also, we might want to later improve the main-process<=>worker protocol
by adding work stealing. This might help with workload balancing while
still minimizing the chances of having multiple workers writing to the
same part of the working tree (as would happen with round-robin, for
example). I already have a rough patch for this [3], but it needs to be
timed and tested further. With work stealing, the protocol becomes a
little more complex, so it might not be suitable for the callback-style
interface of run_processes_parallel(). I'm not really sure...

I guess parallel checkout could use run_processes_parallel(), but it
seems to me that it would require a good amount of work, both on the
framework and on parallel checkout itself. I don't know if it would be
worth it.

[1]: https://lore.kernel.org/git/20201205014607.1464119-15-emilyshaffer@xxxxxxxxxx/
[2]: https://lore.kernel.org/git/20201205014607.1464119-17-emilyshaffer@xxxxxxxxxx/
[3]: https://github.com/matheustavares/git/commit/a483df570a3cdd1165354388ea3c9fcc0ec66aaf