On Thu, Nov 19, 2020 at 9:16 PM Junio C Hamano <gitster@xxxxxxxxx> wrote: > > "Orgad Shaneh via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > > > From: Orgad Shaneh <orgads@xxxxxxxxx> > > > > Let hooks receive user input if applicable. > > > > Closing stdin originates in f5bbc3225 (Port git commit to C, > > 2007). Looks like the original shell implementation did have > > stdin open. Not clear why the author chose to close it on > > the C port (maybe copy&paste). > > Please drop unsubstanciated guess in the parentheses. If anything, > we've learned during the discussion thread that it is a bad idea to > leave the standard input open when spawning hooks in general, and to > me it looks like a lot more plausible reason why we tightened the > interface when it was made from "only to run hooks for 'git commit'" > to an interface that more widely usable to run any hook. If we are > not going to record that finding in the log message to help people > to find out what we knew at the time when the commit was created, > then we shouldn't mislead them with "maybe copy&paste" that is not > backed by anything other than a hunch. Done > > Allow stdin only for commit-related hooks. Some of the other > > hooks pass their own input to the hook, so don't change them. > > Before this paragraph that gives orders to the code to "be like so", > the log message needs to explain why it is a good idea to make such > a change. Which hook benefits by being able to read the standard > input? Describe what becomes possible in terms of end-user visible > effects (i.e. "now reading standard input becomes possible for > pre-commit hook" is *not* an answer. What new things a pre-commit > hook that now can read from the standard input do for the end user?) > to justify why such a change is a good thing to have, before this > paragraph to justify why leaving the standard input open for hooks > run by "git commit" is a good idea and is a safe thing to do. Added justification. > Note that even "git commit" may compete for its standard input with > hooks. "git commit -F - <message" currently may read the message to > EOF before doing anything interesting like spawning a hook, but it > is not implausible that the reading of the message may want to > happen much later in a future codebase, at which point the hook may > end up stealing the beginning of the message by reading from the > standard input. So ideally, if we can find a way to selectively > close the standard input for the hooks if "git commit" itself uses > the standard input, that would be better than unconditionally > leaving it open. Good catch! I found that pre-commit is called before reading stdin, so -F - is broken if pre-commit reads the input. Not sure what's the best way to solve this. Should I pass the flag to run_commit_hook everywhere? Or maybe add a new opt-out flag for run_commit_hook, and pass 0 on most calls? > Let's reorder the patch hunks to see the bottom layer first, as the > callers are mostly the same. > > > diff --git a/run-command.h b/run-command.h > > index 6472b38bde..e6a850c6fe 100644 > > --- a/run-command.h > > +++ b/run-command.h > > @@ -201,11 +201,16 @@ int run_command(struct child_process *); > > */ > > const char *find_hook(const char *name); > > > > +#define RUN_HOOK_ALLOW_STDIN 1 > > + > > /** > > * Run a hook. > > - * The first argument is a pathname to an index file, or NULL > > - * if the hook uses the default index file or no index is needed. > > - * The second argument is the name of the hook. > > + * The first argument is an array of environment variables, or NULL > > + * if the hook uses the default environment and doesn't require > > + * additional variables. > > + * The second argument is zero or RUN_HOOK_ALLOW_STDIN, which enables > > + * stdin for the child process (the default is no_stdin). > > + * The third argument is the name of the hook. > > * The further arguments correspond to the hook arguments. > > * The last argument has to be NULL to terminate the arguments list. > > * If the hook does not exist or is not executable, the return > > @@ -215,8 +220,8 @@ const char *find_hook(const char *name); > > * On execution, .stdout_to_stderr and .no_stdin will be set. > > */ > > LAST_ARG_MUST_BE_NULL > > -int run_hook_le(const char *const *env, const char *name, ...); > > -int run_hook_ve(const char *const *env, const char *name, va_list args); > > +int run_hook_le(const char *const *env, int opt, const char *name, ...); > > +int run_hook_ve(const char *const *env, int opt, const char *name, va_list args); > > Is this new parameter meant to be used as an enum? When the > run_hook interface gets extended the next time and we want a new > option, is the option expected to be mutually incompatible with > allow-stdin? You found it :) > I suspect that it would make this a more useful API if this new > parameter is not used as an enum but as a collection of flag bits. > > If so, a few things must change in the above: > > - The description of the second parameter in the comment shouldn't > say "zero or RUN_HOOK_ALLOW_STDIN"; it should rather say "an > OR'ed collection of feature bits like RUN_HOOK_ALLOW_STDIN > defined above" Done, thanks. > - The second parameter should be 'unsigned flags', not 'int opt'. I copied from run_command_v_opt*, which have int opt for flags. Changed anyway. > It is my understanding that "git commit" only needs run_hook_ve() to > drive its hook scripts. Isn't it premature to touch run_hook_le(), > in which nobody wants to leave the standard input open while running > hooks? It _might_ be a better idea to allow users of _le() to do > the same eventually, but then perhaps it is a good idea to do so in > a separate step at the end, as "only to be complete" patch. That > is, the structure of the topic ought to be something like: > > - [PATCH 1/2] add the "unsigned flags" word to _ve(), assign the > RUN_HOOK_ALLOW_STDIN bit, and update commit.c::run_commit_hook() > to pass RUN_HOOK_ALLOW_STDIN to it. > > - [PATCH 2/2] after surveying the options "git commit" takes, find > out the condition where "git commit" itself would want to consume > the standard input (e.g. "commit -F -", there may be others), and > tell run_commit_hook() *not* to pass RUN_HOOK_ALLOW_STDIN when we > use the standard input ourselves (i.e. forbid hooks to read from > it). Accepted. Waiting for your feedback to implement this part. > - [PATCH 3/2] add the same "unsigned flags" word to _le(), and > teach all callers to pass 0, as a "just for completeness" step. > > Personally, I think we should stop at [2/2], and do not do [3/2], as > there is no real demonstrated use of the standard input for hooks. > Especially because users of the _le() interface includes programs > like receive-pack whose standard input should not be molested, I'd > feel safer not to see [3/2] done at all (for that matter, I'm not > happy with [1/2] unless it comes with [2/2], either). Agreed. > > + if (!(opt & RUN_HOOK_ALLOW_STDIN)) > > + hook.no_stdin = 1; > > OK, so you are using the parameter as a flag word after all. Then > "int opt" should definitely be "unsigned flags". And these two > lines would be more readable, when written like so: > > hook.no_stdin = !(flags & RUN_HOOK_ALLOW_STDIN); > > I would think. Done. > > hook.stdout_to_stderr = 1; > > hook.trace2_hook_name = name; > > > > return run_command(&hook); > > } > > > > -int run_hook_le(const char *const *env, const char *name, ...) > > +int run_hook_le(const char *const *env, int opt, const char *name, ...) > > { > > va_list args; > > int ret; > > > > va_start(args, name); > > - ret = run_hook_ve(env, name, args); > > + ret = run_hook_ve(env, opt, name, args); > > I'd rather not to see the function signature of _le() changed in > patches [1/2] and [2/2]; instead we can just pass hardcoded 0 from > here to the underlying _ve(). > > Now, what is left is individual commands that use the run_hook > interface. > > > diff --git a/commit.c b/commit.c > > index fe1fa3dc41..775019ec9d 100644 > > --- a/commit.c > > +++ b/commit.c > > @@ -1646,7 +1646,7 @@ int run_commit_hook(int editor_is_used, const char *index_file, > > strvec_push(&hook_env, "GIT_EDITOR=:"); > > > > va_start(args, name); > > - ret = run_hook_ve(hook_env.v, name, args); > > + ret = run_hook_ve(hook_env.v, RUN_HOOK_ALLOW_STDIN, name, args); > > va_end(args); > > strvec_clear(&hook_env); > > > > This is good for [1/2]. We should avoid "git commit" from competing > with hooks for its standard input by conditionally passing > ALLOW_STDIN from here---only when the program itself does not use > the standard input in [2/2]. > > > diff --git a/builtin/am.c b/builtin/am.c > > "git am <mbox" reads from the mailbox. "git am -i" interacts with > the end user via its stdin/stdout. There may be other situations > where the hooks should not touch the standard input. > > > diff --git a/builtin/checkout.c b/builtin/checkout.c > > I do not offhand think of a reason why "git checkout" would compete > with its hooks for the standard input. If we were to allow hooks to > read from the standard input, that should come as an independent > patch for each program after patch [3/2], I think. The ones I don't > mention below should never leave the standard input open for hooks. > > > diff --git a/builtin/clone.c b/builtin/clone.c > > diff --git a/builtin/gc.c b/builtin/gc.c > > diff --git a/builtin/merge.c b/builtin/merge.c > > diff --git a/builtin/rebase.c b/builtin/rebase.c > > diff --git a/reset.c b/reset.c > > Ditto. > > > > diff --git a/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh b/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh > > index b3485450a2..e915ffe546 100755 > > --- a/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh > > +++ b/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh > > @@ -7,6 +7,7 @@ test_description='pre-commit and pre-merge-commit hooks' > > HOOKDIR="$(git rev-parse --git-dir)/hooks" > > PRECOMMIT="$HOOKDIR/pre-commit" > > PREMERGE="$HOOKDIR/pre-merge-commit" > > +POSTCOMMIT="$HOOKDIR/post-commit" > > > > # Prepare sample scripts that write their $0 to actual_hooks > > test_expect_success 'sample script setup' ' > > @@ -28,11 +29,15 @@ test_expect_success 'sample script setup' ' > > echo $0 >>actual_hooks > > test $GIT_PREFIX = "success/" > > EOF > > - write_script "$HOOKDIR/check-author.sample" <<-\EOF > > + write_script "$HOOKDIR/check-author.sample" <<-\EOF && > > echo $0 >>actual_hooks > > test "$GIT_AUTHOR_NAME" = "New Author" && > > test "$GIT_AUTHOR_EMAIL" = "newauthor@xxxxxxxxxxx" > > EOF > > + write_script "$HOOKDIR/user-input.sample" <<-\EOF > > + ! read -r line || echo "$line" > hook_input > > + exit 0 > > Style (Documentation/CodingGuidelines) Fixed. > - Redirection operators should be written with space before, but no > space after them. In other words, write 'echo test >"$file"' > instead of 'echo test> $file' or 'echo test > $file'. > > So, when our "read" immediately hits EOF (or I/O error, but let's > not worry about that case for now), we leave hook_input file alone, > but otherwise we write the single line we read to hook_input, and > regardless of an error, we report success to the invoking "git > commit". Which makes sense. > > We report success even when we had trouble writing into hook_input > file, though. Perhaps you should lose the "exit 0" at the end? Removed. > > + EOF > > ' > > > > test_expect_success 'root commit' ' > > @@ -278,4 +283,34 @@ test_expect_success 'check the author in hook' ' > > test_cmp expected_hooks actual_hooks > > ' > > > > +test_expect_success 'with user input' ' > > + test_when_finished "rm -f \"$PRECOMMIT\" user_input hook_input" && > > + cp "$HOOKDIR/user-input.sample" "$PRECOMMIT" && > > + echo "user input" > user_input && > > Style (I won't repeat from here on). Fixed. > > + echo "more" >>file && > > + git add file && > > + git commit -m "more" < user_input && > > + test_cmp user_input hook_input > > +' > > This is probably a good place to also test > > git commit -F - <user_input > > and see what happens. Added a test (currently failing). > Thanks. Thank you! Your feedback is thorough and helpful. - Orgad