On Mon, Oct 24 2022, Calvin Wan wrote:

>> I may just be missing something, but doesn't "struct child_process"
>> already have e.g. "no_stderr", "no_stdout" etc. that we can use?
>> I.e. isn't this thing equivalent to running:
>>
>> 	your-command >/dev/null 2>/dev/null
>>
>> Which is what the non-parallel API already supports.
>>
>> Now, IIRC if you just set that in the "get_next_task" callback it won't
>> work in the parallel API, or you'll block waiting for I/O that'll never
>> come or whatever.
>>
>> But that'll be because the parallel interface currently only supports a
>> subset of the full "child_process" combination of options, and maybe it
>> doesn't grok this.
>>
>> But if that's the case we should just extend the API to support
>> "no_stdout", "no_stderr" etc., no?
>>
>> I.e. hypothetically the parallel one could support 100% of the "struct
>> child_process" combination of options, we just haven't bothered yet.
>>
>> But I don't see why the parallel API should grow options that we already
>> have in "struct child_process"; instead we should set them there, and it
>> should gradually learn to deal with them.
>>
>> I think it's also fine to have some basic sanity checks there, e.g. I
>> could see how for something like this we don't want to support piping
>> only some children to /dev/null but not others, and that it should be
>> all or nothing (maybe it makes state management when we loop over them
>> easier).
>>
>> Or again, maybe I'm missing something...
>
> Shouldn't the options that are set in "child_process" be abstracted away
> from "parallel_processes"?

In general yes, and no :)

Our main interface should probably be "just set these in the 'struct
child_process' we hand you", but the parallel API might want to assert
certain things about those settings, as some of them may conflict with
its assumptions.

> Setting "no_stdout", "no_stderr", etc. in a "child_process" shouldn't
> imply that we still pass the stdout and stderr to "parallel_processes"
> and then we send the output to "/dev/null".

Sure, but if they're not producing any output because it's being piped
to /dev/null, how worthwhile is it to optimize that?

We still can optimize it, but I still think the interface should just be
the equivalent of:

	parallel -k -j100% 'sleep 0.0$RANDOM && echo {} >/dev/null' ::: {1..100}

Whereas what you seem to be trying to implement is the equivalent of:

	parallel -u -j100% 'sleep 0.0$RANDOM && echo {}' ::: {1..100} >/dev/null

Except as an option to the parallel API, but the end result seems to be
equivalent.

> That being said, I can understand the aversion to adding an option like
> this that doesn't also add support for stdout and stderr. I can remove
> this patch and instead reset the buffer inside of pipe_output and
> task_finished in a later patch

I'm not necessarily opposed to it, just puzzled about it; maybe I don't
have the full picture.

In general I highly recommend looking at whatever GNU parallel is doing,
and seeing if new features in run-command.[ch] can map to that mental
model. Our API is basically a small subset of its featureset, and I've
found it useful both to steal ideas from there and to test assumptions.

E.g. "ungroup" is just a straight rip-off of the "--ungroup" option;
it's also helped to think about combining various options we don't have
yet (but might want).

In that case the supervisor API/parallel(1) needs to do something
special, but for "I don't want output" it seems best to just do that at
the worker level, i.e. the equivalent of piping to /dev/null.

Having a bias towards that approach also makes it easier to convert
things to running in parallel: you just (mostly) keep your current
"struct child_process", and don't need to find the equivalents in the
parallel API.