Am 05.11.2015 um 19:17 schrieb Stefan Beller: > strbuf_read_once can also operate on blocking file descriptors if we are > sure they are ready. The poll (2) command however makes sure this is the > case. > > Reading the manual for poll (2), there may be spurious returns indicating > readiness but that is for network sockets only. Pipes should be unaffected. > By having this patch, we rely on the correctness of poll to return > only pipes ready to read. > > This fixes compilation in Windows. It certainly does (but I haven't tested, yet). But parallel processes will not work because we do not have a sufficiently complete waitpid emulation, yet. (waitpid(-1, ...) is not implemented.) However, I think that the infrastructure can be simplified even further to a level that we do not need additional emulation on Windows. First let me say that I find it very questionable that the callbacks receive a struct child_process. This is an implementation detail. It is also an implementation detail that stderr of the children is read and buffered, and that the child's stdout is redirected to stderr. It should not be the task of the get_next_task callback to set the members no_stdin, stdout_to_stderr, and err of struct child_process. If you move that initialization to pp_start_one, you will notice rather sooner than later that the readable end of the file descriptor is never closed! Which makes me think: Other users of start_command/finish_command work such that they 1. request a pipe by setting .out = -1 2. start_command 3. read from .out until EOF 4. close .out 5. wait for the process with finish_command But the parallel_process infrastructure does not follow this pattern. It 1. requests a pipe by setting .err = -1 2. start_command 3. read from .err 4. wait for the process with waitpid (and forgets to close .err). EOF is not in the picture (but that is not essential). I suggest to change this such that we read from the children until EOF, mark them to be at their end of life, and then wait for them using finish_command (assuming that a process that closes stdout and stderr will die very soon if it is not already dead). Here is a prototype patch. Feel free to pick it up. It marks a process whose EOF we have found by setting .err to -1. It's probably better to extend the meaning of the in_use indicator for this purpose. This seems to work on Linux with test-run-command with sub-processes that produce 100k output each: ./test-run-command run-command-parallel 5 sh -c "printf \"%0100000d\n\" 999" although error handling would require some polishing according to t0061-run-command. diff --git a/run-command.c b/run-command.c index 51d078c..3e42299 100644 --- a/run-command.c +++ b/run-command.c @@ -977,7 +977,7 @@ static struct parallel_processes *pp_init(int n, for (i = 0; i < n; i++) { strbuf_init(&pp->children[i].err, 0); child_process_init(&pp->children[i].process); - pp->pfd[i].events = POLLIN; + pp->pfd[i].events = POLLIN|POLLHUP; pp->pfd[i].fd = -1; } sigchain_push_common(handle_children_on_signal); @@ -1061,11 +1061,17 @@ static void pp_buffer_stderr(struct parallel_processes *pp, int output_timeout) /* Buffer output from all pipes. */ for (i = 0; i < pp->max_processes; i++) { if (pp->children[i].in_use && - pp->pfd[i].revents & POLLIN) - if (strbuf_read_once(&pp->children[i].err, - pp->children[i].process.err, 0) < 0) + pp->pfd[i].revents & (POLLIN|POLLHUP)) { + int n = strbuf_read_once(&pp->children[i].err, + pp->children[i].process.err, 0); + if (n == 0) { + close(pp->children[i].process.err); + pp->children[i].process.err = -1; + } else if (n < 0) { if (errno != EAGAIN) die_errno("read"); + } + } } } @@ -1082,59 +1088,20 @@ static void pp_output(struct parallel_processes *pp) static int pp_collect_finished(struct parallel_processes *pp) { int i = 0; - pid_t pid; - int wait_status, code; + int code; int n = pp->max_processes; int result = 0; while (pp->nr_processes > 0) { - pid = waitpid(-1, &wait_status, WNOHANG); - if (pid == 0) - break; - - if (pid < 0) - die_errno("wait"); - for (i = 0; i < pp->max_processes; i++) if (pp->children[i].in_use && - pid == pp->children[i].process.pid) + pp->children[i].process.err == -1) break; + if (i == pp->max_processes) - die("BUG: found a child process we were not aware of"); - - if (strbuf_read(&pp->children[i].err, - pp->children[i].process.err, 0) < 0) - die_errno("strbuf_read"); - - if (WIFSIGNALED(wait_status)) { - code = WTERMSIG(wait_status); - if (!pp->shutdown && - code != SIGINT && code != SIGQUIT) - strbuf_addf(&pp->children[i].err, - "%s died of signal %d", - pp->children[i].process.argv[0], - code); - /* - * This return value is chosen so that code & 0xff - * mimics the exit code that a POSIX shell would report for - * a program that died from this signal. - */ - code += 128; - } else if (WIFEXITED(wait_status)) { - code = WEXITSTATUS(wait_status); - /* - * Convert special exit code when execvp failed. - */ - if (code == 127) { - code = -1; - errno = ENOENT; - } - } else { - strbuf_addf(&pp->children[i].err, - "waitpid is confused (%s)", - pp->children[i].process.argv[0]); - code = -1; - } + break; + + code = finish_command(&pp->children[i].process); if (pp->task_finished(code, &pp->children[i].process, &pp->children[i].err, pp->data, @@ -1144,7 +1111,6 @@ static int pp_collect_finished(struct parallel_processes *pp) pp->nr_processes--; pp->children[i].in_use = 0; pp->pfd[i].fd = -1; - child_process_clear(&pp->children[i].process); child_process_init(&pp->children[i].process); if (i != pp->output_owner) { -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html