"Orgad Shaneh via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > From: Orgad Shaneh <orgads@xxxxxxxxx> > > Let hooks receive user input if applicable. > > Closing stdin originates in f5bbc3225 (Port git commit to C, > 2007). Looks like the original shell implementation did have > stdin open. Not clear why the author chose to close it on > the C port (maybe copy&paste). Please drop unsubstanciated guess in the parentheses. If anything, we've learned during the discussion thread that it is a bad idea to leave the standard input open when spawning hooks in general, and to me it looks like a lot more plausible reason why we tightened the interface when it was made from "only to run hooks for 'git commit'" to an interface that more widely usable to run any hook. If we are not going to record that finding in the log message to help people to find out what we knew at the time when the commit was created, then we shouldn't mislead them with "maybe copy&paste" that is not backed by anything other than a hunch. > Allow stdin only for commit-related hooks. Some of the other > hooks pass their own input to the hook, so don't change them. Before this paragraph that gives orders to the code to "be like so", the log message needs to explain why it is a good idea to make such a change. Which hook benefits by being able to read the standard input? Describe what becomes possible in terms of end-user visible effects (i.e. "now reading standard input becomes possible for pre-commit hook" is *not* an answer. What new things a pre-commit hook that now can read from the standard input do for the end user?) to justify why such a change is a good thing to have, before this paragraph to justify why leaving the standard input open for hooks run by "git commit" is a good idea and is a safe thing to do. Note that even "git commit" may compete for its standard input with hooks. "git commit -F - <message" currently may read the message to EOF before doing anything interesting like spawning a hook, but it is not implausible that the reading of the message may want to happen much later in a future codebase, at which point the hook may end up stealing the beginning of the message by reading from the standard input. So ideally, if we can find a way to selectively close the standard input for the hooks if "git commit" itself uses the standard input, that would be better than unconditionally leaving it open. Let's reorder the patch hunks to see the bottom layer first, as the callers are mostly the same. > diff --git a/run-command.h b/run-command.h > index 6472b38bde..e6a850c6fe 100644 > --- a/run-command.h > +++ b/run-command.h > @@ -201,11 +201,16 @@ int run_command(struct child_process *); > */ > const char *find_hook(const char *name); > > +#define RUN_HOOK_ALLOW_STDIN 1 > + > /** > * Run a hook. > - * The first argument is a pathname to an index file, or NULL > - * if the hook uses the default index file or no index is needed. > - * The second argument is the name of the hook. > + * The first argument is an array of environment variables, or NULL > + * if the hook uses the default environment and doesn't require > + * additional variables. > + * The second argument is zero or RUN_HOOK_ALLOW_STDIN, which enables > + * stdin for the child process (the default is no_stdin). > + * The third argument is the name of the hook. > * The further arguments correspond to the hook arguments. > * The last argument has to be NULL to terminate the arguments list. > * If the hook does not exist or is not executable, the return > @@ -215,8 +220,8 @@ const char *find_hook(const char *name); > * On execution, .stdout_to_stderr and .no_stdin will be set. > */ > LAST_ARG_MUST_BE_NULL > -int run_hook_le(const char *const *env, const char *name, ...); > -int run_hook_ve(const char *const *env, const char *name, va_list args); > +int run_hook_le(const char *const *env, int opt, const char *name, ...); > +int run_hook_ve(const char *const *env, int opt, const char *name, va_list args); Is this new parameter meant to be used as an enum? When the run_hook interface gets extended the next time and we want a new option, is the option expected to be mutually incompatible with allow-stdin? I suspect that it would make this a more useful API if this new parameter is not used as an enum but as a collection of flag bits. If so, a few things must change in the above: - The description of the second parameter in the comment shouldn't say "zero or RUN_HOOK_ALLOW_STDIN"; it should rather say "an OR'ed collection of feature bits like RUN_HOOK_ALLOW_STDIN defined above" - The second parameter should be 'unsigned flags', not 'int opt'. It is my understanding that "git commit" only needs run_hook_ve() to drive its hook scripts. Isn't it premature to touch run_hook_le(), in which nobody wants to leave the standard input open while running hooks? It _might_ be a better idea to allow users of _le() to do the same eventually, but then perhaps it is a good idea to do so in a separate step at the end, as "only to be complete" patch. That is, the structure of the topic ought to be something like: - [PATCH 1/2] add the "unsigned flags" word to _ve(), assign the RUN_HOOK_ALLOW_STDIN bit, and update commit.c::run_commit_hook() to pass RUN_HOOK_ALLOW_STDIN to it. - [PATCH 2/2] after surveying the options "git commit" takes, find out the condition where "git commit" itself would want to consume the standard input (e.g. "commit -F -", there may be others), and tell run_commit_hook() *not* to pass RUN_HOOK_ALLOW_STDIN when we use the standard input ourselves (i.e. forbid hooks to read from it). - [PATCH 3/2] add the same "unsigned flags" word to _le(), and teach all callers to pass 0, as a "just for completeness" step. Personally, I think we should stop at [2/2], and do not do [3/2], as there is no real demonstrated use of the standard input for hooks. Especially because users of the _le() interface includes programs like receive-pack whose standard input should not be molested, I'd feel safer not to see [3/2] done at all (for that matter, I'm not happy with [1/2] unless it comes with [2/2], either). > diff --git a/run-command.c b/run-command.c > index 2ee59acdc8..21b1f0a5e9 100644 > --- a/run-command.c > +++ b/run-command.c > @@ -1343,7 +1343,7 @@ const char *find_hook(const char *name) > return path.buf; > } > > -int run_hook_ve(const char *const *env, const char *name, va_list args) > +int run_hook_ve(const char *const *env, int opt, const char *name, va_list args) > { > struct child_process hook = CHILD_PROCESS_INIT; > const char *p; > @@ -1356,20 +1356,21 @@ int run_hook_ve(const char *const *env, const char *name, va_list args) > while ((p = va_arg(args, const char *))) > strvec_push(&hook.args, p); > hook.env = env; > - hook.no_stdin = 1; > + if (!(opt & RUN_HOOK_ALLOW_STDIN)) > + hook.no_stdin = 1; OK, so you are using the parameter as a flag word after all. Then "int opt" should definitely be "unsigned flags". And these two lines would be more readable, when written like so: hook.no_stdin = !(flags & RUN_HOOK_ALLOW_STDIN); I would think. > hook.stdout_to_stderr = 1; > hook.trace2_hook_name = name; > > return run_command(&hook); > } > > -int run_hook_le(const char *const *env, const char *name, ...) > +int run_hook_le(const char *const *env, int opt, const char *name, ...) > { > va_list args; > int ret; > > va_start(args, name); > - ret = run_hook_ve(env, name, args); > + ret = run_hook_ve(env, opt, name, args); I'd rather not to see the function signature of _le() changed in patches [1/2] and [2/2]; instead we can just pass hardcoded 0 from here to the underlying _ve(). Now, what is left is individual commands that use the run_hook interface. > diff --git a/commit.c b/commit.c > index fe1fa3dc41..775019ec9d 100644 > --- a/commit.c > +++ b/commit.c > @@ -1646,7 +1646,7 @@ int run_commit_hook(int editor_is_used, const char *index_file, > strvec_push(&hook_env, "GIT_EDITOR=:"); > > va_start(args, name); > - ret = run_hook_ve(hook_env.v, name, args); > + ret = run_hook_ve(hook_env.v, RUN_HOOK_ALLOW_STDIN, name, args); > va_end(args); > strvec_clear(&hook_env); > This is good for [1/2]. We should avoid "git commit" from competing with hooks for its standard input by conditionally passing ALLOW_STDIN from here---only when the program itself does not use the standard input in [2/2]. > diff --git a/builtin/am.c b/builtin/am.c "git am <mbox" reads from the mailbox. "git am -i" interacts with the end user via its stdin/stdout. There may be other situations where the hooks should not touch the standard input. > diff --git a/builtin/checkout.c b/builtin/checkout.c I do not offhand think of a reason why "git checkout" would compete with its hooks for the standard input. If we were to allow hooks to read from the standard input, that should come as an independent patch for each program after patch [3/2], I think. The ones I don't mention below should never leave the standard input open for hooks. > diff --git a/builtin/clone.c b/builtin/clone.c > diff --git a/builtin/gc.c b/builtin/gc.c > diff --git a/builtin/merge.c b/builtin/merge.c > diff --git a/builtin/rebase.c b/builtin/rebase.c > diff --git a/reset.c b/reset.c Ditto. > diff --git a/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh b/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh > index b3485450a2..e915ffe546 100755 > --- a/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh > +++ b/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh > @@ -7,6 +7,7 @@ test_description='pre-commit and pre-merge-commit hooks' > HOOKDIR="$(git rev-parse --git-dir)/hooks" > PRECOMMIT="$HOOKDIR/pre-commit" > PREMERGE="$HOOKDIR/pre-merge-commit" > +POSTCOMMIT="$HOOKDIR/post-commit" > > # Prepare sample scripts that write their $0 to actual_hooks > test_expect_success 'sample script setup' ' > @@ -28,11 +29,15 @@ test_expect_success 'sample script setup' ' > echo $0 >>actual_hooks > test $GIT_PREFIX = "success/" > EOF > - write_script "$HOOKDIR/check-author.sample" <<-\EOF > + write_script "$HOOKDIR/check-author.sample" <<-\EOF && > echo $0 >>actual_hooks > test "$GIT_AUTHOR_NAME" = "New Author" && > test "$GIT_AUTHOR_EMAIL" = "newauthor@xxxxxxxxxxx" > EOF > + write_script "$HOOKDIR/user-input.sample" <<-\EOF > + ! read -r line || echo "$line" > hook_input > + exit 0 Style (Documentation/CodingGuidelines) - Redirection operators should be written with space before, but no space after them. In other words, write 'echo test >"$file"' instead of 'echo test> $file' or 'echo test > $file'. So, when our "read" immediately hits EOF (or I/O error, but let's not worry about that case for now), we leave hook_input file alone, but otherwise we write the single line we read to hook_input, and regardless of an error, we report success to the invoking "git commit". Which makes sense. We report success even when we had trouble writing into hook_input file, though. Perhaps you should lose the "exit 0" at the end? > + EOF > ' > > test_expect_success 'root commit' ' > @@ -278,4 +283,34 @@ test_expect_success 'check the author in hook' ' > test_cmp expected_hooks actual_hooks > ' > > +test_expect_success 'with user input' ' > + test_when_finished "rm -f \"$PRECOMMIT\" user_input hook_input" && > + cp "$HOOKDIR/user-input.sample" "$PRECOMMIT" && > + echo "user input" > user_input && Style (I won't repeat from here on). > + echo "more" >>file && > + git add file && > + git commit -m "more" < user_input && > + test_cmp user_input hook_input > +' This is probably a good place to also test git commit -F - <user_input and see what happens. Thanks.