On Fri, Aug 07, 2020 at 03:58:37AM -0400, Jeff King wrote: > On Fri, Aug 07, 2020 at 09:05:58AM +0200, Patrick Steinhardt wrote: > > > In order to not repeatedly search for the reference-transaction hook in > > case it's getting called multiple times, we use a caching mechanism to > > only call `find_hook()` once. What was missed though is that the return > > value of `find_hook()` actually comes from a static strbuf, which means > > it will get overwritten when calling `find_hook()` again. As a result, > > we may call the wrong hook with parameters of the reference-transaction > > hook. > > > > This scenario was spotted in the wild when executing a git-push(1) with > > multiple references, where there are interleaving calls to both the > > update and the reference-transaction hook. While initial calls to the > > reference-transaction hook work as expected, it will stop working after > > the next invocation of the update hook. The result is that we now start > > calling the update hook with parameters and stdin of the > > reference-transaction hook. > > That makes sense. I think of push as a single transaction, but that's > only if the caller sends the "atomic" capability. Otherwise get one per > ref. > > > diff --git a/refs.c b/refs.c > > index 2dd851fe81..17e515b288 100644 > > --- a/refs.c > > +++ b/refs.c > > @@ -2044,7 +2044,7 @@ static int run_transaction_hook(struct ref_transaction *transaction, > > if (hook == &hook_not_found) > > return ret; > > if (!hook) > > - hook = find_hook("reference-transaction"); > > + hook = xstrdup_or_null(find_hook("reference-transaction")); > > if (!hook) { > > hook = &hook_not_found; > > return ret; > > The fix here looks obviously correct, though I have to wonder if the > caching is even worth it. It's saving us an access() call, but we're > about to exec a whole sub-process. > > It's perhaps more justifiable when there isn't a hook (we're still just > paying that one access(), but it's proportionally bigger). I kind of > doubt it's measurable, though, since a ref write requires a bunch of > syscalls anyway. Yeah, this really was done to not have to pay a performance penalty if updating thousands of refs if no reference-transaction hook exists. E.g. if doing a non-atomic push of n reference, we'd have n calls to access(3P). See [1] for reference. I've just did another quick benchmark without the cache, and it still consistently shows a non-zero performance hit without it: Test pks-reftx-hook-interleaving no-cache -------------------------------------------------------------------------------- 1400.2: update-ref 2.82(2.13+0.81) 2.86(2.19+0.78) +1.4% 1400.3: update-ref --stdin 0.22(0.07+0.15) 0.22(0.07+0.15) +0.0% Patrick [1]: https://public-inbox.org/git/20200603165142.GA24049@syl.local/
Attachment:
signature.asc
Description: PGP signature