Re: git-fetching from a big repository is slow

On Dec 14, 2006, at 14:46, Shawn Pearce wrote:
> And yet I get good delta compression on a number of ZIP formatted
> files which don't get good additional zlib compression (<3%).
> Doing the above would cause those packfiles to explode to about
> 10x their current size.

Yes, that's because in a zip archive each file is compressed
independently. Similar things might happen when checking in
uncompressed tar files containing JPGs. The question is whether
you prefer bad time usage or bad space usage when handling large
binary blobs. Maybe we should use a faster, less precise
algorithm instead of giving up.

Still, I think doing anything based on filename is a mistake.
If we want to have a heuristic to prevent spending too much time
on deltifying large compressed files, the heuristic should be
based on content, not filename.
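
For example, a content-based check could simply try deflating a
small sample of the blob and see how much it shrinks; if it barely
shrinks, the data is probably already compressed and not worth much
deltification effort. A rough sketch (the function name, sample size
and 3% threshold are made up for illustration; it only uses plain
zlib):

  #include <zlib.h>

  /* Return 1 if a small sample of the blob barely compresses,
   * i.e. the content looks like it is already compressed. */
  static int looks_already_compressed(const unsigned char *buf,
                                      unsigned long len)
  {
          unsigned char out[8192];
          uLongf out_len = sizeof(out);
          uLong sample = len < 4096 ? len : 4096;

          if (sample < 512)
                  return 0;       /* too small to judge */
          if (compress2(out, &out_len, buf, sample, 1) != Z_OK)
                  return 0;
          /* less than ~3% savings: treat as already compressed */
          return out_len * 100 >= sample * 97;
  }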

Maybe we could use some "magic", as in the file(1) command,
to let git say a bit more about the content of blobs.
This could be used both for ordering files during deltification
and to determine whether to try deltification at all.
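
Something in the spirit of file(1)'s magic database, but much
smaller, might already be enough for that. A rough sketch (the
table, names and categories are made up for illustration; a real
magic database is far richer):

  #include <string.h>

  enum blob_kind { BLOB_UNKNOWN, BLOB_COMPRESSED, BLOB_IMAGE };

  /* Classify a blob by its first few bytes, file(1)-style. */
  static enum blob_kind sniff_blob(const unsigned char *buf,
                                   unsigned long len)
  {
          static const struct {
                  const char *magic;
                  unsigned int len;
                  enum blob_kind kind;
          } table[] = {
                  { "\x1f\x8b",          2, BLOB_COMPRESSED },  /* gzip */
                  { "PK\x03\x04",        4, BLOB_COMPRESSED },  /* zip  */
                  { "BZh",               3, BLOB_COMPRESSED },  /* bzip2 */
                  { "\xff\xd8\xff",      3, BLOB_IMAGE },       /* JPEG */
                  { "\x89PNG\r\n\x1a\n", 8, BLOB_IMAGE },       /* PNG  */
          };
          unsigned int i;

          for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
                  if (len >= table[i].len &&
                      !memcmp(buf, table[i].magic, table[i].len))
                          return table[i].kind;
          return BLOB_UNKNOWN;
  }

The result could feed into both the delta sort order and the
decision to skip deltification entirely.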

  -Geert

