git bundle vs git rev-list


 



Hello all –



I am working on a wrapper around git bundle to synchronize git repos
via sneakernet from network ‘a’ to network ‘b’ on a fairly frequent
basis (daily to weekly).  Network ‘b’ has a gatekeeper who is
persnickety about what content might end up on his network. The
gatekeeper wants to know about the content being transferred.



I’ve come up with a scheme to list the final form of all files
included in the bundle in whole or in part; see the pseudocode below:



# BEGIN PSEUDOCODE

# Create the bundle
git bundle create out.bundle --all "--since=<last_bundle_date>"

# Get the list of commits
included_commits = git rev-list --all "--since=<last_bundle_date>"

# For each commit, get the immediate parent(s), and find blobs in its
# tree that are not in its parents' trees
foreach commit in included_commits:
    # Get all blobs in this commit's tree, map blob to file name
    CommitBlobsMapToFilename = Process(git ls-tree -r commit)

    # Now find the parent commit(s); the first field of the output is
    # the commit itself, so drop it
    ParentCommits = DropFirst(git rev-list --parents -n 1 commit)

    foreach parent in ParentCommits:
        # Get all blobs in the parent's tree
        ParentBlobsMapToFilename = Process(git ls-tree -r parent)

        # Find blobs in this commit's tree that are not in the
        # parent's tree
        NewBlobs = setdiff(CommitBlobsMapToFilename, ParentBlobsMapToFilename)

        # Write each new blob's contents to a unique filename
        foreach blob in NewBlobs:
            filename = CommitBlobsMapToFilename(blob)
            filename = makeUnique(filename)
            git show blob > filename

# END PSEUDOCODE
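
The blob-mapping and set-difference steps of the pseudocode above could be sketched in Python roughly as follows. This is only an illustration, not a definitive implementation: the function names (parse_ls_tree, new_blobs, ls_tree) are made up for this example, and it assumes git is on the PATH, the current directory is the repository, and file names are valid UTF-8. It uses `git ls-tree -r -z` so that unusual file names are not quoted.

```python
import subprocess


def parse_ls_tree(output):
    """Map blob id -> path from `git ls-tree -r -z` output.

    Each NUL-terminated record looks like "<mode> <type> <id>\t<path>".
    If the same blob appears under several paths, the first one wins.
    """
    blobs = {}
    for record in output.split("\0"):
        if not record:
            continue
        meta, path = record.split("\t", 1)
        _mode, obj_type, obj_id = meta.split(" ")
        if obj_type == "blob":  # skip submodule entries (type "commit")
            blobs.setdefault(obj_id, path)
    return blobs


def new_blobs(commit_blobs, parent_blobs):
    """Blobs present in the commit's tree but absent from the parent's."""
    return {b: p for b, p in commit_blobs.items() if b not in parent_blobs}


def ls_tree(rev):
    """Run git ls-tree for one revision (hypothetical helper)."""
    out = subprocess.run(
        ["git", "ls-tree", "-r", "-z", rev],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_ls_tree(out)
```

For a commit and one of its parents, `new_blobs(ls_tree(commit), ls_tree(parent))` would then give the blob-to-filename map that the inner loop writes out.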


This scheme has worked well, but the approach is predicated on the
assumption that

git bundle create --all --since=<last_bundle>

uses the same commits that are returned by

git rev-list --all --since=<last_bundle>

However, I’ve noticed a scenario where that is not the case.  I create
a bundle using --since=yesterday in a repo that has had no activity
within the past few days.  As expected, 'git rev-list --all
--since=yesterday' returns 0 commits.  However, the command 'git
bundle create --all --since=yesterday' creates a bundle containing the
full history.

Tags seem to be the culprit, but I don’t know why. I do notice in the
output of git bundle that it mentions “skipping ref …” and “skipping
tag …”, and sure enough all branches and most tags are shown as being
skipped.  However there are a few tags that are missing from that
list.

If I use --branches rather than --all as the limiter, then all is
well.  In that case, git rev-list still returns 0 commits, and git
bundle reports that it is refusing to make an empty bundle, as
expected.

So after all that, I have two questions:

1. Any thoughts on why a tag would be included by 'git bundle', when
'git rev-list' with the same arguments returns empty?

2. Is there a way to list commits contained in the bundle file itself?
 This seems like it would be more robust than trying to re-create the
commit list via 'git rev-list'.
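
One possible direction for question 2, sketched under the assumption that the bundle uses the v2 file layout (a "# v2 git bundle" signature line, then "-<id> <comment>" prerequisite lines, then "<id> <refname>" lines, then an empty line before the pack data): read the prerequisites and ref tips from the header and hand them to git rev-list in the source repository. The function names below are hypothetical, and this is an illustration rather than a tested answer to why the tags behave differently.

```python
def parse_bundle_header(data):
    """Return (prerequisites, refs) from the leading bytes of a bundle.

    Assumes the v2 layout described in the lead-in; `data` must contain
    at least the whole header, up to the first empty line.
    """
    header, _sep, _pack = data.partition(b"\n\n")
    lines = header.decode("utf-8", "replace").split("\n")
    if lines[0] != "# v2 git bundle":
        raise ValueError("not a v2 bundle: %r" % lines[0])
    prereqs, refs = [], {}
    for line in lines[1:]:
        if line.startswith("-"):
            # Prerequisite: "-<id> <comment>"; keep only the id
            prereqs.append(line[1:].split(" ", 1)[0])
        else:
            # Ref tip: "<id> <refname>"
            obj_id, refname = line.split(" ", 1)
            refs[refname] = obj_id
    return prereqs, refs


def rev_list_args(prereqs, refs):
    """Build a rev-list command for everything reachable from the ref
    tips but not from the prerequisites."""
    tips = sorted(set(refs.values()))
    return ["git", "rev-list"] + tips + ["--not"] + prereqs
```

Running the resulting command in the repository that produced the bundle should list the commits the bundle carries, independent of how --since was interpreted when it was created. 'git bundle list-heads <file>' prints the ref lines directly, which is a quick cross-check on the parsed refs.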

Thanks,

Jesse