RFC: Allow missing objects during packing

So I have this horribly bizarre case that I just feel dirty telling
you about.  But I have to deal with it.  So here goes...

I want to publish a fork of an open source project, but I'm not
allowed to publish the base history of the project.  For political
reasons I can only publish patches/deltas, even though the project
is licensed under the LGPL.  Don't ask.  Seriously.  In 6 months
everyone will probably forget about this and I can just publish
the whole thing.  But not right now.

With this silly little patch to builtin-pack-objects I can make
it work by publishing what amounts to a shallow clone.  The fork
was created by doing something like this:

	$ git clone --bare git://upstream.org/proj.git upstream.git
	$ cd fork
	$ git remote add -f upstream ../upstream.git
	$ echo ../upstream.git/objects >.git/objects/info/alternates
	$ git repack -a -d -f -l

	# now fork/.git has only its "delta"
	# add the base trees, but not blobs
	#
	$ jgit --git-dir=../upstream.git tree-copy upstream/master .

	# remove the shared odb
	#
	$ rm .git/objects/info/alternates

Users who want to clone from fork.git have to do this:

	$ git clone -o upstream git://upstream.org/proj.git proj
	$ cd proj
	$ git remote add -f origin git://forky.org/fork.git
	remote: Counting objects: 5, done.
	remote: error: unable to find d30dcb7f07b6bf1a6cc5da4f3c1d0e6fc690dc45
	remote: git: unable to get type of object d30dcb7f07b6bf1a6cc5da4f3c1d0e6fc690dc45: No such file or directory
	remote: Compressing objects: 100% (3/3), done.
	remote: Total 3 (delta 1), reused 0 (delta 0)
	Unpacking objects: 100% (3/3), done.

Basically this little patch just makes pack-objects issue the
warning, but then skip over the object and move on.
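To make the "warn and move on" behavior concrete, here is a minimal
standalone sketch.  It is not the pack-objects code: the struct and
function names are made up for illustration, and the only thing taken
from the patch is the idea that a negative type means "object not
found" and is reported instead of being fatal.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for an object queued for packing. */
struct sketch_object {
	const char *hex;  /* object name, as hex */
	int type;         /* negative if the object could not be found */
};

/*
 * Check every queued object.  A missing object is reported on stderr
 * and skipped instead of aborting the whole run.  Returns the number
 * of objects that were missing.
 */
size_t sketch_check_objects(const struct sketch_object *objs, size_t n)
{
	size_t missing = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (objs[i].type < 0) {
			fprintf(stderr,
				"warning: unable to get type of object %s\n",
				objs[i].hex);
			missing++;
			continue;  /* with die() the run would stop here */
		}
		/* a real implementation would go on to size up the object */
	}
	return missing;
}
```

The caller can still look at the return value if it wants to know
whether anything was skipped.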

Later, during deltification, the missing base object still carries
the negative type that sha1_object_info() returned.  That type never
matches the type of any other object, so the missing object is never
considered as a delta base, and no error is caused by it not being
present.  We just cannot generate a delta against it during the
fetch, and the client winds up getting a slightly larger download.
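Roughly, the effect on delta selection looks like the sketch below.
This is a simplified model, not the real delta search: the enum, the
struct, and both functions are hypothetical names, and the delta size
is passed in by the caller instead of being computed.  It only shows
why a negative type drops out of consideration and what that costs.

```c
/* Hypothetical object types; the negative value marks "not found". */
enum sketch_type {
	SKETCH_MISSING = -1,
	SKETCH_COMMIT = 1,
	SKETCH_TREE = 2,
	SKETCH_BLOB = 3
};

struct sketch_entry {
	int type;            /* one of enum sketch_type */
	unsigned long size;  /* size of the object when sent whole */
};

/* A base is only a candidate when its type matches the target's. */
int sketch_can_delta(const struct sketch_entry *target,
		     const struct sketch_entry *base)
{
	return target->type == base->type;
}

/*
 * Bytes sent for target: the delta if the base is usable and the
 * delta is smaller, otherwise the whole object.
 */
unsigned long sketch_bytes_sent(const struct sketch_entry *target,
				const struct sketch_entry *base,
				unsigned long delta_size)
{
	if (sketch_can_delta(target, base) && delta_size < target->size)
		return delta_size;
	return target->size;
}
```

With a missing base the second function always falls through to the
full object size, which is the "little larger download" case.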

Thoughts?  Should I work this up into a real patch and try to get
it into the tree?  It seems pretty harmless to allow an object we
aren't going to transmit but that we want to use as a delta base
in a thin pack to be missing.  At worst we just get a little bit
more data transfer.

--8<--
diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 2dadec1..0f29f14 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -1097,7 +1097,7 @@ static void check_object(struct object_entry *entry)
 
 	entry->type = sha1_object_info(entry->idx.sha1, &entry->size);
 	if (entry->type < 0)
-		die("unable to get type of object %s",
+		warn("unable to get type of object %s",
 		    sha1_to_hex(entry->idx.sha1));
 }


And this is the Java program to build up the base tree, as otherwise
the thin pack creation code crashes out trying to build a list of
what the client already has/knows.  Since this is only the base
commit and its trees (no blobs, and no history beyond the base
commit) I should be able to get away with publishing it as part of
the fork.

--8<--
import org.kohsuke.args4j.Argument;
import org.spearce.jgit.lib.Constants;
import org.spearce.jgit.lib.ObjectId;
import org.spearce.jgit.lib.PackWriter;
import org.spearce.jgit.lib.Repository;
import org.spearce.jgit.lib.TextProgressMonitor;
import org.spearce.jgit.pgm.Command;
import org.spearce.jgit.pgm.TextBuiltin;
import org.spearce.jgit.revwalk.RevCommit;
import org.spearce.jgit.revwalk.RevObject;
import org.spearce.jgit.treewalk.TreeWalk;

import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.util.LinkedList;
import java.util.List;

@Command(name = "tree-copy", usage = "Copy a base commit and its trees")
public class TreeCopy extends TextBuiltin {
  @Argument(index = 0, required = true, metaVar = "commit", usage = "base commit")
  private RevCommit baseCommit;

  @Argument(index = 1, required = true, metaVar = "dir", usage = "destination repository")
  private File destGitDir;

  @Override
  protected void run() throws Exception {
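    // Objects to copy: the base commit, its root tree, and every subtree.
    final List<RevObject> toCopy = new LinkedList<RevObject>();

    toCopy.add(baseCommit);
    toCopy.add(baseCommit.getTree());

    // Walk the base commit's tree, descending into each subtree.
    final TreeWalk tw = new TreeWalk(db);
    tw.reset();
    tw.addTree(baseCommit.getTree());

    while (tw.next()) {
      switch (tw.getFileMode(0).getObjectType()) {
        case Constants.OBJ_TREE:
          // Record the subtree, then descend into it.
          toCopy.add(argWalk.lookupTree(tw.getObjectId(0)));
          tw.enterSubtree();
          break;
        default:
          // Blobs (and anything else) are deliberately left out.
          break;
      }
    }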

    final Repository destdb = new Repository(destGitDir);
    final PackWriter packer = new PackWriter(db, new TextProgressMonitor());
    packer.preparePack(toCopy.iterator());
    final String packName = "pack-" + packer.computeName();
    final File packDir = new File(destdb.getDirectory(), "objects/pack");
    final File packPath = new File(packDir, packName + ".pack");
    final File idxPath = new File(packDir, packName + ".idx");

    if (packPath.exists() && idxPath.exists())
      throw die(packPath.getName() + " already exists.");

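    {
      // Write the pack data; close the stream even if writing fails.
      final BufferedOutputStream os =
          new BufferedOutputStream(new FileOutputStream(packPath));
      try {
        packer.writePack(os);
      } finally {
        os.close();
      }
      packPath.setReadOnly();
    }
    {
      // Write the matching index; close the stream even if writing fails.
      final BufferedOutputStream os =
          new BufferedOutputStream(new FileOutputStream(idxPath));
      try {
        packer.writeIndex(os);
      } finally {
        os.close();
      }
      idxPath.setReadOnly();
    }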
  }
}
 
-- 
Shawn.