Re: If you would write git from scratch now, what would you change?

On Nov 26, 2007 12:17 PM, Jakub Narebski <jnareb@xxxxxxxxx> wrote:
> On Mon, 26 Nov 2007, Dana How wrote:
> > On Nov 25, 2007 1:48 PM, Jakub Narebski <jnareb@xxxxxxxxx> wrote:
> >
> > > If you would write git from scratch now, from the beginning, without
> > > concerns for backwards compatibility, what would you change, or what
> > > would you want to have changed?
> >
> > Currently data can be quickly copied from pack to pack,
> > but data cannot be quickly copied blob->pack or pack->blob
> > (there was an alternate blob format that supported this,
> >  but it was deprecated).  Using the pack format for blobs
> > would fix this.  It would also mean blobs wouldn't need to
> > be uncompressed to get the blob type or size I believe.
>
> Could you do some benchmark for repository with your large objects
> as loose objects created with and without core.legacyHeaders (created
> with git pre 1.5.3), and as single blob packs, perhaps kept, with
> _undocumented_ (except for RelNotes) gitattribute delta unset for
> those files?
First of all, this is a very reasonable request, and it is what I should
be doing.  Unfortunately, I only have the cycles at the moment to point
out this issue, which looks like a real problem from my perspective.

Currently, a user who wants to publish some (large) files does the following:
  git add    (calls deflate)
  git commit
  git push   (builds a pack to stdout, calling inflate and deflate on each blob)

So if the blob and pack formats were more similar (a different blob format,
big blobs stored as singleton packs, etc.), the zlib calls in git push would
go away.  The deflate call could be sped up by using compression level 1,
but it would still take time.
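
To make the round trip concrete, here is a rough Python sketch of the two
layouts as I understand them.  The function names are made up for
illustration and this is not how git itself is structured; real packing
also deals with deltas, checksums and the pack index, all ignored here.
The legacy loose object deflates the "blob <size>\0" header together with
the content, while a pack entry keeps type and size in a small
uncompressed header in front of the deflated content.

```python
import zlib

BLOB_TYPE = 3  # object type number used for blobs in pack entry headers


def loose_blob(content: bytes, level: int = -1) -> bytes:
    """Legacy loose object: deflate("blob <size>\\0" + content)."""
    return zlib.compress(b"blob %d\0" % len(content) + content, level)


def pack_header(obj_type: int, size: int) -> bytes:
    """Uncompressed pack entry header: 3 type bits plus a variable-length size."""
    byte = (obj_type << 4) | (size & 0x0F)  # low 4 bits of the size
    size >>= 4
    out = bytearray()
    while size:
        out.append(byte | 0x80)  # high bit set: more size bytes follow
        byte = size & 0x7F       # next 7 bits of the size
        size >>= 7
    out.append(byte)
    return bytes(out)


def loose_to_pack_entry(loose: bytes) -> bytes:
    """Roughly what a push has to do today: inflate, then deflate again."""
    raw = zlib.decompress(loose)                   # inflate the whole object
    header, _, content = raw.partition(b"\0")      # type/size only known now
    size = int(header.split()[1])
    return pack_header(BLOB_TYPE, size) + zlib.compress(content)  # deflate


def packlike_blob(content: bytes, level: int = -1) -> bytes:
    """Hypothetical pack-style loose blob: already a valid pack entry."""
    return pack_header(BLOB_TYPE, len(content)) + zlib.compress(content, level)
```

With the second layout, the bytes written at "git add" time could be copied
into the pack unchanged, and the type and size could be read from the first
few bytes without inflating anything.  Passing level=1 to either writer
trades compression ratio for speed, but the work is still proportional to
the blob size.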

Another "solution" is to make each workgroup member's .git/objects
a symlink to a tree with a lot of sticky bits and do some scripting.
(This means "git push" doesn't push any data and only updates refs
 in .git/refs/heads on the server.)
I'm not entirely enthusiastic about this, and when I mentioned it a while
ago it did cause some retching...

> From Documentation/RelNotes-1.5.3:
>
>   - We used to have core.legacyheaders configuration, when
>     set to false, allowed git to write loose objects in a format
>     that mimicks the format used by objects stored in packs.  It
>     turns out that this was not so useful.  Although we will
>     continue to read objects written in that format, we do not
>     honor that configuration anymore and create loose objects in
>     the legacy/traditional format.
>
>   - "pack-objects" honors "delta" attribute set in
>     .gitattributes.  It does not attempt to deltify blobs that
>     come from paths with delta attribute set to false.
>
>   - diff-delta code that is used for packing has been improved
>     to work better on big files.
>
> The last part is thanks to your comments, complaints and efforts, Dana.
Yes, there have been some very useful improvements recently.

However, I didn't actually push for the first "-" item you list;
I was pushing for the "mimic" option even then,
but an argument was presented to me against it,
to which I had no counter-argument until I understood git better later.

Thanks,
-- 
Dana L. How  danahow@xxxxxxxxx  +1 650 804 5991 cell