Re: [PATCH 11/19] pack-objects: use bitmaps when packing objects

On Fri, Oct 25, 2013 at 02:14:11PM +0000, Shawn O. Pearce wrote:

> On Thu, Oct 24, 2013 at 6:04 PM, Jeff King <peff@xxxxxxxx> wrote:
> > For bitmaps to be used, the following must be true:
> >
> >   1. We must be packing to stdout (as a normal `pack-objects` from
> >      `upload-pack` would do).
> >
> >   2. There must be a .bitmap index containing at least one of the
> >      "have" objects that the client is asking for.
> 
> The client must explicitly "have" a commit that has a bitmap? In JGit
> we allow the client to have anything, and walk backwards using
> traditional graph traversal until a bitmap is found.

If the bitmaps contain the full set of reachable objects and the client
does not have any "haves" that are bitmapped, then we know that either:

  1. Their "haves" are not reachable from the "wants"

     or

  2. Their "wants" are not bitmapped, and so the slice of "haves..wants"
     has no bitmaps

Since (1) is relatively rare, I think we are using this as a proxy for
(2), so that we can do a regular walk rather than looking around for
bitmaps that don't exist. But I may be misremembering the reasoning.
Vicent?
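
To make the check being discussed concrete, here is a minimal sketch of
it, not the code from the patch. The names `obj_id`, `is_bitmapped` and
`should_use_bitmaps` are hypothetical, a plain array stands in for the
.bitmap index, and the case of a client with no "haves" at all is not
modeled:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an object id; real git uses a hash. */
typedef unsigned int obj_id;

/* Return true if `id` is one of the commits that has a bitmap. */
static bool is_bitmapped(obj_id id, const obj_id *bitmapped, size_t nr_bitmapped)
{
	for (size_t i = 0; i < nr_bitmapped; i++)
		if (bitmapped[i] == id)
			return true;
	return false;
}

/*
 * Mirror the two conditions quoted above: we must be packing to
 * stdout, and at least one "have" must itself be bitmapped. If no
 * "have" is bitmapped, the caller falls back to a regular walk.
 */
static bool should_use_bitmaps(bool pack_to_stdout,
			       const obj_id *haves, size_t nr_haves,
			       const obj_id *bitmapped, size_t nr_bitmapped)
{
	assert(haves || nr_haves == 0);

	if (!pack_to_stdout)
		return false;
	for (size_t i = 0; i < nr_haves; i++)
		if (is_bitmapped(haves[i], bitmapped, nr_bitmapped))
			return true;
	return false;
}
```

The JGit behavior Shawn describes would differ in the last step:
instead of returning false, it would keep walking back from the
"haves" until it reaches a bitmapped commit.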

> > @@ -704,6 +759,18 @@ static void write_pack_file(void)
> >                 offset = write_pack_header(f, nr_remaining);
> >                 if (!offset)
> >                         die_errno("unable to write pack header");
> > +
> > +               if (reuse_packfile) {
> > +                       off_t packfile_size;
> > +                       assert(pack_to_stdout);
> > +
> > +                       packfile_size = write_reused_pack(f);
> > +                       if (!packfile_size)
> > +                               die_errno("failed to re-use existing pack");
> > +
> > +                       offset += packfile_size;
> > +               }
> > +
> >                 nr_written = 0;
> >                 for (; i < to_pack.nr_objects; i++) {
> >                         struct object_entry *e = write_order[i];
> 
> Can reuse_packfile be true at the same time as to_pack.nr_objects > 0?

Yes, if there are new, non-bitmapped objects to send on top of the
reused packfile.

> In JGit we write the to_pack list first, then the reuse pack. Our
> rationale was the to_pack list is recent objects that are newer and
> would appear first in a traditional traversal, so they should go at
> the front of the stream. This does mean if they delta compress against
> an object in that reuse_packfile slice they have to use REF_DELTA
> instead of OFS_DELTA.

That's a good point. In our case I think we do not delta against the
reused packfile objects at all, as we simply send out the whole slice of
packfile without making an entry for each object.
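
For reference, the ordering constraint you describe can be sketched as
below. An offset delta stores the distance back to its base, so the
base has to sit earlier in the same stream; otherwise the delta has to
name its base by object id. The enum and function names here are
hypothetical and are not the identifiers used in git or JGit:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the two delta encodings. */
enum delta_kind { DELTA_OFS, DELTA_REF };

/*
 * An offset delta is only possible when the base was already written,
 * i.e. it sits at a smaller offset than the delta. Otherwise the delta
 * must refer to its base by object id.
 */
static enum delta_kind pick_delta_kind(uint64_t delta_offset, uint64_t base_offset)
{
	return base_offset < delta_offset ? DELTA_OFS : DELTA_REF;
}

/* Backward distance that an offset delta would record for its base. */
static uint64_t ofs_delta_distance(uint64_t delta_offset, uint64_t base_offset)
{
	assert(base_offset < delta_offset);
	return delta_offset - base_offset;
}
```

With the to_pack objects written first, any base inside the reused
slice lands at a larger offset than its delta, so pick_delta_kind()
returns DELTA_REF, which matches the trade-off you mention.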

> Is this series running on github.com/torvalds/linux? Last Saturday I
> ran a live demo clone comparing github.com/torvalds/linux to a JGit
> bitmap clone and some guy heckled me because GitHub was only a few
> seconds slower. :-)

It is. Use kernel.org if you want to make fun of someone. :)

-Peff