Re: Get the commits to be pushed accurately in pre-push hook

On 2025-02-13 at 02:05:47, Jayce Cao wrote:
> My goal is to check the commits to be pushed in pre-push hook to see
> if they contain sensitive data or not.
> I am assuming that commits which already exist in the remote repo do
> not need to be checked.

You will probably want to read
https://git-scm.com/docs/gitfaq#restrict-with-hooks.  It's very easy to
bypass the `pre-push` hook locally by using `--no-verify`, and there is
no way to detect that, so if you want an effective control, you'll want
a different approach.  Note also that I don't believe libgit2 or other
library-based Git engines invoke hooks at all, which is another reason
to adopt a different approach.

> So I read the Git doc and pre-push.sample file, I know that if we push
> to a new branch that the remote does not have,
> $remote_oid will be zero, so we need to examine all commits in this
> branch. We can run `git rev-list $local_oid` to
> get all commits to be examined.
> 
> But consider this case, if I'm developing a huge project which has
> millions of commits.
> I create a new branch (we call it feat/awesome-feat) based on the
> master branch on my local repo, and create three commits.
> Then I run the `git push --set-upstream origin feat/awesome-feat`
> command to push the three commits to the remote.
> But when the pre-push hook is called, `git rev-list $local_oid` will
> print millions of commits. The commits except the new three
> already exist in the remote repo. And the `git push` command will send
> data only in the new commits to the remote, instead of all
> history commits.
> 
> So in the pre-push hook we have no idea which commits will actually be
> sent to the remote when pushing to a new branch that the remote
> doesn't have. I found a workaround:
> * Run `git ls-remote -q -h` command to get the commits the remote has.
> * Run `git rev-list $local_oid ^$haves` command to get the commits to
> be pushed.($haves are the commits obtained from the previous step).
> 
> But this workaround seems to be stupid when the remote has many
> branches. I wonder if there is any better way to get the commits
> to be pushed accurately in the pre-push hook.

Git LFS has an optimization where it uses `git rev-list --not
--remotes=origin` (or whatever the remote is).  This excludes objects
reachable from remote-tracking refs for the origin in question.
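
A minimal sketch of how a hook could apply this, assuming the push goes
to a named remote that has remote-tracking refs.  The helper names
`zero_oid`, `new_commits`, and `check_push` are hypothetical and are not
taken from Git LFS; the zero-OID computation follows the same idea as
pre-push.sample.

```shell
#!/bin/sh
# Sketch only: zero_oid, new_commits and check_push are hypothetical helpers.

# Print an all-zero object ID with the repository's hash length.
zero_oid() {
    git hash-object --stdin </dev/null | sed 's/[0-9a-f]/0/g'
}

# $1 = remote name, $2 = local OID, $3 = remote OID (from the hook's stdin).
# Print commits reachable from $2 but not from refs/remotes/$1/*.
new_commits() {
    if [ "$3" != "$(zero_oid)" ] &&
       git cat-file -e "$3^{commit}" 2>/dev/null; then
        # Existing ref whose old tip we have locally: exclude it as well.
        git rev-list "$2" --not "$3" --remotes="$1"
    else
        # New ref (or unknown old tip): rely on remote-tracking refs only.
        git rev-list "$2" --not --remotes="$1"
    fi
}

# Read the "<local ref> <local oid> <remote ref> <remote oid>" lines that
# Git feeds to pre-push on stdin and print each candidate commit.
check_push() {
    zero=$(zero_oid)
    while read -r local_ref local_oid remote_ref remote_oid; do
        # A zero local OID means the ref is being deleted: nothing to scan.
        [ "$local_oid" = "$zero" ] && continue
        new_commits "$1" "$local_oid" "$remote_oid"
    done
}

# In .git/hooks/pre-push, the script would end with: check_push "$1"
```

The result is only as fresh as the remote-tracking refs, so commits the
remote gained since the last fetch may be listed again.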

However, this has some limitations.  For instance, if the remote is
specified as a URL and not a remote name, then there will never be any
remote-tracking branches, and this optimization cannot be used.
Notably, I believe EGit (and maybe JGit) _always_ specify the remote as
a URL and never as a remote name, so this will not work there.

You may wish to inspect that project's source code for more details.
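
For the URL case, the `git ls-remote` workaround quoted above is one
possible fallback.  A rough sketch, assuming the remote is reachable
from inside the hook and that one extra round trip per push is
acceptable; `new_commits_via_ls_remote` is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch only: new_commits_via_ls_remote is a hypothetical helper.

# $1 = remote name or URL, $2 = local OID being pushed.
# Ask the remote for its branch tips and exclude every tip that also
# exists in the local object store.
new_commits_via_ls_remote() {
    git ls-remote -q -h "$1" |
    while read -r oid ref; do
        # Skip tips we do not have locally; rev-list would reject them.
        if git cat-file -e "$oid^{commit}" 2>/dev/null; then
            printf '^%s\n' "$oid"
        fi
    done |
    git rev-list --stdin "$2"
}
```

With many remote branches this feeds one exclusion per branch tip to
`git rev-list`, which is the cost the original question was worried
about.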

I am not aware of a better way to do this, but as I mentioned above, you
may not want to do this at all.
-- 
brian m. carlson (they/them or he/him)
Toronto, Ontario, CA
