Re: Fetching everything in another bare repo

On Wed, Mar 08, 2023 at 05:39:07PM -0500, Paul Smith wrote:

> I have a tool that wants to preserve every commit and never garbage
> collect (there are references that need to be maintained to older
> commits/branches that have been deleted).  This tool keeps its own bare
> clone, and disables all GC and maintenance on it.

OK. It's not clear to me if this archive repo retains the old
references, or if it simply has a bunch of unreachable objects.
That distinction will matter below.

> Unfortunately a month or so ago, by accident someone re-cloned the
> primary copy of the repo that everyone else uses as this bare clone,
> which lost the old history.

Oops. I take it from this that the repository _doesn't_ have all of the
references.  It just has unreachable objects.

Which makes sense. Git cannot store "foo/bar" if "foo" still exists, so
you'd eventually hit such a problem if you tried to keep all of the old
references.
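
If you want to see that restriction in action, here is a minimal sketch
in a throwaway repository. The temporary path, the demo identity and the
branch names are all made up for illustration:

```shell
# Throwaway bare repository; nothing here touches a real repo.
repo=$(mktemp -d)
git init -q --bare "$repo"

# Build a root commit on the empty tree, without needing a work tree.
tree=$(git -C "$repo" mktree </dev/null)
commit=$(git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
	commit-tree -m init "$tree")

git -C "$repo" update-ref refs/heads/foo "$commit"

# "refs/heads/foo" exists, so "refs/heads/foo/bar" cannot be created.
if git -C "$repo" update-ref refs/heads/foo/bar "$commit" 2>/dev/null; then
	echo "created foo/bar"
else
	echo "refused foo/bar"
fi
```

The second update-ref should be refused, leaving only refs/heads/foo.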

> So now what I want to do is fetch the old data into the current bare
> clone (since the old clone doesn't have the newest stuff).  And, I need
> to be sure that all commits are pulled, and kept, and nothing is
> cleaned up.  I would also like any deleted branches to re-appear, but I
> don't want to change the location of any existing branches in the new
> repo.
> 
> Is it sufficient to run something like this:
> 
>   git fetch --no-auto-maintenance --no-auto-gc <path-to-old-clone>

That wouldn't grab the unreachable objects from the old clone, though
(again, assuming it has some that you care about). A fetch only
transfers objects that are reachable from the refs it fetches.

I think you probably want to treat the objects and references
separately. It's safe to just copy all of the objects and packfiles from
the old clone into the new one. You'll have duplicates, but you should
be able to de-dup and get a single packfile with:

  git repack -ad --keep-unreachable
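
For the copying step, something like the sketch below might do.
copy_objects is a hypothetical helper, not a git command. It assumes
both arguments are paths to bare repositories, and it only copies loose
objects plus .pack/.idx files, since git can regenerate the other
auxiliary files:

```shell
# copy_objects OLD NEW: copy object files that NEW doesn't already have.
copy_objects() {
	old=$1 new=$2
	# List loose objects (objects/xx/...) and packfiles with their
	# indexes, relative to OLD's object directory.
	(cd "$old/objects" && find . -type f \( \
		-path './[0-9a-f][0-9a-f]/*' -o \
		-path './pack/pack-*.pack' -o \
		-path './pack/pack-*.idx' \)) |
	while IFS= read -r f; do
		# Never overwrite a file NEW already has.
		[ -e "$new/objects/$f" ] && continue
		mkdir -p "$new/objects/$(dirname "$f")"
		cp -p "$old/objects/$f" "$new/objects/$f"
	done
}

# usage:
#   copy_objects /path/to/old-repo /path/to/new-repo
#   git -C /path/to/new-repo repack -ad --keep-unreachable
```

I'd try it on a copy of the new repository first.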

And then you can do any ref updates in the new repository (since it now
has all objects from both). You might want something like:

  # get the list of refs in both repositories
  git -C old-repo for-each-ref --format='%(refname)' >old
  git -C new-repo for-each-ref --format='%(refname)' >new

  # now find the refs that are only in the old one; for-each-ref
  # output is sorted by byte value, so we can just use comm (in the
  # C locale, so that comm agrees on the ordering)
  LC_ALL=C comm -23 old new >missing-refs

  # now generate and apply commands to update those refs. You could
  # probably also use fetch here, but this is faster and we know we have
  # all of the objects.
  xargs git -C old-repo \
    for-each-ref --format='create %(refname) %(objectname)' \
    <missing-refs |
  git -C new-repo update-ref --stdin

(caveat executor; I just typed this into my email and didn't test it, so
there may be typos or small issues).
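
One property worth knowing about the "create" lines: update-ref refuses
to create a ref that already exists, and a --stdin batch is applied
all-or-nothing. So existing branches in the new repo can't be moved by
this, but a single conflicting name will reject the whole batch. A small
sketch in a throwaway repository (the ref names and identity are made up
for the demo):

```shell
repo=$(mktemp -d)
git init -q --bare "$repo"

# Two commits on the empty tree: c1, and c2 on top of it.
tree=$(git -C "$repo" mktree </dev/null)
c1=$(git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
	commit-tree -m one "$tree")
c2=$(git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
	commit-tree -m two -p "$c1" "$tree")

git -C "$repo" update-ref refs/heads/existing "$c2"

# One batch: a genuinely new ref, plus a "create" for a ref that
# already exists. The second line makes the whole batch fail.
printf 'create %s %s\n' \
	refs/heads/restored "$c1" \
	refs/heads/existing "$c1" |
git -C "$repo" update-ref --stdin 2>/dev/null || echo "batch rejected"

# refs/heads/existing still points at c2; refs/heads/restored is absent.
git -C "$repo" for-each-ref --format='%(refname) %(objectname)'
```

If the batch is rejected, remove the offending names from missing-refs
and run it again.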

-Peff


