Re: [PATCH] add post-fetch hook

Junio C Hamano <gitster@xxxxxxxxx> · Sat, 24 Dec 2011 19:13:37 -0800

Joey Hess <joey@xxxxxxxxxxx> writes:

> The post-fetch hook is fed on its stdin all refs that were newly fetched.
> It is not allowed to abort the fetch (or pull), but can modify what
> was fetched or take other actions.
>
> One example use of this hook is to automatically merge certain remote
> branches into a local branch. Another is to update a local cache
> (such as a search index) with the fetched refs. No other hook is run
> near fetch time, except for post-merge, which doesn't always run after a
> fetch, which is why this additional hook is useful.
>
> Signed-off-by: Joey Hess <joey@xxxxxxxxxxx>

A typical "'git pull' invokes 'git fetch' and then lets 'git merge' (or
'git rebase') integrate the work on the current branch with what was
fetched" sequence goes like this:

 - 'git fetch':
  . grabs the necessary objects from the remote;
  . decides what remote tracking branches are updated to point
    at what objects, and what updates are to be denied;
  . updates remote tracking branches accordingly; and
  . writes $GIT_DIR/FETCH_HEAD to communicate what has been fetched
    and what is to be merged.

 - 'git merge':
  . reads $GIT_DIR/FETCH_HEAD to learn which commits are to be merged; and
  . merges those commits into the current branch.
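
To make the hand-off through $GIT_DIR/FETCH_HEAD concrete, here is a rough
sketch in Python of how the file can be read. It is an illustration only,
not code from git or from the patch: the names FetchHeadEntry,
parse_fetch_head and commits_to_merge are made up, and it assumes each line
holds three tab-separated fields (object name, an optional "not-for-merge"
marker, and a description).

```python
from typing import List, NamedTuple


class FetchHeadEntry(NamedTuple):
    object_name: str   # commit the fetched ref points at
    for_merge: bool    # False when the line carries "not-for-merge"
    description: str   # e.g. "branch 'master' of <url>"


def parse_fetch_head(text: str) -> List[FetchHeadEntry]:
    """Parse the contents of FETCH_HEAD (assumed tab-separated) into entries."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Assumed layout: "<object name> TAB <marker> TAB <description>";
        # the marker field is empty for refs that are to be merged.
        fields = line.split("\t", 2)
        fields += [""] * (3 - len(fields))  # tolerate short lines
        object_name, marker, description = fields
        entries.append(
            FetchHeadEntry(object_name, marker != "not-for-merge", description)
        )
    return entries


def commits_to_merge(text: str) -> List[str]:
    """Return the object names a subsequent merge would pick up."""
    return [e.object_name for e in parse_fetch_head(text) if e.for_merge]
```

A hook that tweaks the result of the fetch would have to keep this file and
the remote tracking branches telling the same story.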

In general, we do not add hooks on the client side that could easily be
implemented by wrapping the client side commands (i.e. you could implement
"git myfetch" that runs "git fetch" followed by whatever script that mucks
with the result of the fetch). Even so, I can see that it would be useful
to have a hook that can tweak the result of the fetch run inside "git
pull", because you cannot tell "git pull" to run "git myfetch" instead of
"git fetch".
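
For illustration, the "git myfetch" wrapper approach could look roughly like
the Python sketch below. Everything in it is an assumption made for the
example: post_fetch_step is a hypothetical placeholder for whatever the
script should do, and skipping it after a failed fetch is one possible
choice, not a rule. Git runs an executable named git-myfetch found on $PATH
when the user types "git myfetch".

```python
import subprocess
import sys
from typing import Callable, Sequence


def post_fetch_step() -> int:
    """Hypothetical placeholder for the work to do after a fetch."""
    return 0


def myfetch(args: Sequence[str],
            run: Callable[[Sequence[str]], int] = subprocess.call,
            post: Callable[[], int] = post_fetch_step) -> int:
    """Run "git fetch" with the given arguments, then the post step."""
    status = run(["git", "fetch", *args])
    if status != 0:
        # Assumption: a failed fetch leaves nothing worth post-processing.
        return status
    return post()


# When installed as the git-myfetch executable, the entry point would be:
#     sys.exit(myfetch(sys.argv[1:]))
```

This works when the user runs "git myfetch" directly, but "git pull" still
calls plain "git fetch" and never reaches the wrapper.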

Because the sequence of "git fetch" followed by "git merge", both commands
issued by the end user, should be equivalent to "git pull" from an end
user's point of view, the hook must be called from near the end of "git
fetch" if we were to have such a hook that tweaks the result of the fetch
inside "pull". IOW, the implementation, even though logically it belongs
to "pull", has to be inside "fetch", not "pull".

In that sense, I am not fundamentally opposed to the idea of adding a
post-fetch hook that allows tweaking of the result.

*HOWEVER*

If we _were_ to sanction the use of the hook to tweak the result, I do not
want to see it implemented as an ad-hoc hack that tells the hook writers
that it is _entirely_ their responsibility to update the remote tracking
branches from what was fetched, and also update $GIT_DIR/FETCH_HEAD to
maintain consistency between these two places.

A very cursory look at the patch tells me that there are a few problems
with it.  It does not seem to affect what will go to $GIT_DIR/FETCH_HEAD
at all, and hence the hook has no way to affect the result of a fetch
that does not store what it fetched in any of our remote tracking
branches.

> The #1 point of confusion for git-annex users is the need to run
> "git annex merge" after fetching. That does a union merge of newly
> fetched remote git-annex branches into the local git-annex branch.

That use case sounds like "git fetch" is being called as a first-class UI,
which is covered by the "git myfetch" (you can call it "git annex fetch")
wrapper approach, the canonical example of a hook that we explicitly do
not want to add. It also does not seem to call for mucking with the result
of the fetch at all.

Perhaps the two concepts should be separated into different hooks?
