Re: [PATCH] Prevent megablobs from gunking up git packs

Johannes Schindelin <Johannes.Schindelin@xxxxxx> wrote:
> On Thu, 24 May 2007, david@xxxxxxx wrote:
> > On Thu, 24 May 2007, Shawn O. Pearce wrote:
> > 
> > > Now #3 is actually really important here.  Don't forget that we
> > > *just* disabled the fancy "new loose object format".  It doesn't
> > > exist.  We can read the packfile-like loose objects, but we cannot
> > > > write them anymore.  So let's say we explode a megablob into a loose
> > > > object, and it's 800 MiB by itself.  Now we have to send that object
> > > to a client.  Yes, that's right, we must *RECOMPRESS* 800 MiB for
> > > no reason.  Not the best choice.  Maybe we shouldn't have deleted
> > > that packfile formatted loose object writer...
> > 
> > when did the object store get changed so that loose objects aren't
> > compressed?
> 
> That never happened. But we had a different file format for loose objects, 
> which was meant to make it easier to copy as-is into a pack. That file 
> format went away, since it was not as useful as we hoped.

That "different file format" thing was added exactly for this type
of problem.  Someone added a bunch of large blobs to their repository
and then spent a lot of time decompressing and recompressing them
during their next repack.

The reason the recompression must happen is that the deflate stream
in a standard (aka legacy) loose object contains both the Git object
header and the raw data; in a packfile the Git object header is
stored outside the deflate stream.  The "different file format"
used the packfile format, allowing us to store the Git object header
outside the deflate stream.  That meant we could just copy the raw
bytes as-is from the loose object into the packfile.
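
To make the difference concrete, here is a rough Python sketch of the
two layouts.  It is not Git's code; the function names (legacy_loose,
pack_entry_header, loose_to_pack_entry) are made up for illustration,
and it only handles blobs.  It assumes the usual "blob <size>\0"
loose header and the type/size varint that precedes each packfile
entry.

```python
import zlib

OBJ_BLOB = 3  # type code used for blobs in a packfile entry header


def legacy_loose(data: bytes) -> bytes:
    """Legacy loose object: the header is deflated together with the data."""
    header = b"blob %d\x00" % len(data)
    return zlib.compress(header + data)


def pack_entry_header(obj_type: int, size: int) -> bytes:
    """Type/size header that sits in front of the deflate stream in a pack.

    First byte: continuation bit, 3 bits of type, low 4 bits of size.
    Following bytes: 7 more bits of size each, least significant first.
    """
    byte = (obj_type << 4) | (size & 0x0F)
    size >>= 4
    out = bytearray()
    while size:
        out.append(byte | 0x80)  # more size bytes follow
        byte = size & 0x7F
        size >>= 7
    out.append(byte)
    return bytes(out)


def loose_to_pack_entry(loose: bytes) -> bytes:
    """Turn a legacy loose blob into a pack-style entry.

    The header lives inside the deflate stream, so the whole object has
    to be inflated, split, and deflated again.  This is the expensive
    step for an 800 MiB blob.
    """
    raw = zlib.decompress(loose)
    header, _, body = raw.partition(b"\x00")
    kind, size = header.split(b" ")
    if kind != b"blob" or int(size) != len(body):
        raise ValueError("not a well-formed loose blob")
    return pack_entry_header(OBJ_BLOB, len(body)) + zlib.compress(body)
```

With the packfile-style loose format, the bytes on disk would already
have the shape that loose_to_pack_entry() returns, so the copy into a
pack would need no zlib calls at all.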

So we still store loose objects compressed, it's just that we can
no longer create loose objects that can be copied directly into
a packfile without recompression.  And that is sort of Dana's
problem here.  OK, not entirely, but whatever.

-- 
Shawn.