Stefan Beller <sbeller@xxxxxxxxxx> writes:

> I agree the commit 7087c5f3 (SQUASH??? on top of run-command: add an
> asynchronous parallel child processor) makes sense; I arrived at the same
> patch after adding in the feedback.

I dunno.  As I said, while a small part of that commit needs to be
squashed into [6/14] (i.e. the "failed to start" bugfix and the
redefinition of the return value of start_one()), the remainder of the
commit is primarily there to illustrate possible future enhancements
that the basic structure of your code must support, so that we can get
the fundamentals right.

Each of the actual "possible future enhancements" may or may not be a
good idea, and there is nothing that backs any of them up.  In such a
vacuum, I'd prefer to leave them simple and avoid starting performance
tuning prematurely.

One thing that is sort-of backed up already, by your "cap to 4"
experiment, is that some sort of slow-start ramping up is far better
than letting a thundering herd stampede, so I am OK if we kept that
SPAWN_CAP part of the commit.  But even then, we do not know if tying
that cap to online_cpus() is a good idea.  Neither of us has a good
argument backed by data on it.

It is tempting to imagine that, when you have N cores on an otherwise
idle box, the setting of SPAWN_CAP shouldn't make much difference to
how well the first child process makes its initial progress, as long
as it does not exceed the N-1 idle cores you have at hand.  But that
assumes the task is CPU bound and you have infinite memory bandwidth.
Once the task needs a lot of disk bandwidth to make its initial
progress, which certainly is the case for fetch, the first child,
spawned together with (SPAWN_CAP-1) other processes, would be
competing with them for that shared resource, and having more
online_cpus() would not help you.  If we are not doing an analysis
that takes such factors into account (and it is way too premature for
us to be tuning), even "online_cpus() - 1" is unnecessarily more
complex than a hardcoded small number (say "2", or even "1").

The same thing can be said for the output_timeout selection.  "Do not
get stuck for too long until we have fully ramped up; do not spin too
frequently when there is no more room for a new child" was something I
came up with out of thin air as an example of something we might want
to do, and I did write such code in that commit, but primarily so that
you can clearly see that a better design would be to let the caller,
i.e. the scheduling loop, specify output_timeout to buffer_stderr(),
keeping the latter a "dumb" helper that a smarter caller controls (as
opposed to hiding such logic inside buffer_stderr() and having a
"dumb" driver call it).

The actual output_timeout computation logic is not well thought out;
it may even turn out that we are better off lengthening the timeout
before we have fully ramped up, to encourage the first process to
produce some output before the later rounds give other new processes a
chance to be spawned.

So for that change, while I think adding that parameter to
buffer_stderr() is something we want to keep, I'd prefer to keep the
caller simpler by always passing a hardcoded 100 in the initial
version, before we start tuning.  And I do not think we want to start
tuning before building a solid foundation to tune on.

In short, if I were amending that SQUASH??? commit, I'd probably be
making it do less, not more, than what it does, something along the
lines of the attached.
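(Before the patch itself, to make "slow-start" concrete: purely as an
illustration, a ramp-up could eventually take a shape like the sketch
below, where each scheduling round is allowed to spawn a geometrically
growing number of new children.  The helper name and the doubling
policy are made up for this message; the patch at the end deliberately
does nothing of the sort and hardcodes a small constant instead.

/*
 * Illustration only, not a proposal: grow the per-round spawn
 * allowance geometrically, so the first rounds cannot stampede a
 * shared resource such as the disk, while later rounds still fill
 * the available slots quickly.
 */
static int next_spawn_cap(int spawn_cap, int max_processes)
{
	if (spawn_cap < max_processes)
		spawn_cap *= 2;	/* ramp up gradually */
	return spawn_cap;
}

The scheduling loop would reset its per-round counter from what this
returns at the top of each round.  But again, nothing we have measured
says this beats a hardcoded 2.)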
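And to make the "dumb helper, smart caller" split concrete as well,
here is roughly the shape the helper takes once the caller owns the
timeout.  This is a sketch, not the code from your series; the struct
and its fields are stand-ins for the real parallel_processes:

#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Stand-in for the real struct; only what the sketch needs. */
struct pp_sketch {
	int max_processes;
	struct pollfd *pfd;	/* one entry per child's stderr pipe */
};

static void buffer_stderr(struct pp_sketch *pp, int output_timeout)
{
	char buf[1024];
	int i;

	/*
	 * Wait up to output_timeout milliseconds, however long the
	 * caller chose, for any child to produce output.  No ramp-up
	 * state, no policy; all of that stays in the scheduling loop.
	 */
	while (poll(pp->pfd, pp->max_processes, output_timeout) < 0) {
		if (errno == EINTR)
			continue;	/* interrupted by a signal; retry */
		return;			/* the real helper would die_errno() */
	}

	/* Drain whatever became readable. */
	for (i = 0; i < pp->max_processes; i++) {
		if (pp->pfd[i].fd >= 0 &&
		    (pp->pfd[i].revents & (POLLIN | POLLHUP))) {
			ssize_t n = read(pp->pfd[i].fd, buf, sizeof(buf));
			(void) n;	/* the real code appends to a strbuf */
		}
	}
}

Because the helper stays this dumb, the scheduling loop can stay
trivial in the initial version, which is exactly what the attached
patch does by always passing 100: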
 run-command.c | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/run-command.c b/run-command.c
index b6d8b39..829b6fe 100644
--- a/run-command.c
+++ b/run-command.c
@@ -1108,20 +1108,19 @@ static void pp_collect_finished(struct parallel_processes *pp)
 	}
 }
 
-#define SPAWN_CAP (pp.max_processes + 1) /* spawn as many as possible */
-
 int run_processes_parallel(int n, void *data,
 			   get_next_task_fn get_next_task,
 			   start_failure_fn start_failure,
 			   return_value_fn return_value)
 {
 	struct parallel_processes pp;
 
-	pp_init(&pp, n, data, get_next_task, start_failure, return_value);
+	pp_init(&pp, n, data, get_next_task, start_failure, return_value);
 	while (1) {
-		int no_more_task, cnt, output_timeout;
+		int no_more_task, cnt, output_timeout = 100;
+		int spawn_cap = 2;
 
-		for (cnt = SPAWN_CAP, no_more_task = 0;
+		for (cnt = spawn_cap, no_more_task = 0;
 		     cnt && pp.nr_processes < pp.max_processes;
 		     cnt--) {
 			if (!pp_start_one(&pp)) {
@@ -1132,12 +1131,6 @@ int run_processes_parallel(int n, void *data,
 		if (no_more_task && !pp.nr_processes)
 			break;
 
-		if (!cnt)
-			output_timeout = 50;
-		else if (pp.nr_processes < pp.max_processes)
-			output_timeout = 100;
-		else
-			output_timeout = 1000;
 		pp_buffer_stderr(&pp, output_timeout);
 		pp_output(&pp);