Re: Creating objects manually and repack

On 8/4/06, A Large Angry SCM <gitzilla@xxxxxxxxx> wrote:
> Jon Smirl wrote:
> > On 8/4/06, Linus Torvalds <torvalds@xxxxxxxx> wrote:
> >> I'd suggest against it, but you can (and should) just repack often enough
> >> that you shouldn't ever have gigabytes of objects "in flight". I'd have
> >> expected that with a repack every few ten thousand files, and most files
> >> being on the order of a few kB, you'd have been more than ok, but
> >> especially if you have large files, you may want to make things "every
> >> <n> bytes" rather than "every <n> files".
> >
> > How about forking off a pack-objects and handing it one file name at a
> > time over a pipe. When I hand it the next file name I delete the first
> > file. Does pack-objects make multiple passes over the files? This
> > model would let me hand it all 1M files.
> >
>
> Why don't you just write the pack file directly? Pack files without
> deltas have a very simple structure, and git-index-pack will create a
> pack index file for the pack file you give it.
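
To illustrate the "very simple structure" mentioned above, here is a minimal
sketch of writing an undeltified pack containing only blobs. It assumes the
version 2 pack layout: a 12-byte header ("PACK", version, object count), then
one type/size header plus zlib-deflated data per object, then a SHA-1 of
everything before it. The names build_pack and entry_header are hypothetical
helpers made up for this example, not part of git.

```python
import hashlib
import struct
import zlib

OBJ_BLOB = 3  # pack type code for blobs (commit=1, tree=2, tag=4)


def entry_header(obj_type, size):
    """Encode the per-object header: 3-bit type plus variable-length size."""
    # First byte carries the type in bits 6-4 and the low 4 bits of the size.
    byte = (obj_type << 4) | (size & 0x0F)
    size >>= 4
    out = bytearray()
    # Remaining size bits follow 7 at a time; the high bit marks "more bytes".
    while size:
        out.append(byte | 0x80)
        byte = size & 0x7F
        size >>= 7
    out.append(byte)
    return bytes(out)


def build_pack(blobs):
    """Return the bytes of an undeltified pack holding the given blob contents."""
    body = bytearray()
    body += b"PACK" + struct.pack(">II", 2, len(blobs))  # signature, version, count
    for data in blobs:
        body += entry_header(OBJ_BLOB, len(data))  # size is the uncompressed length
        body += zlib.compress(data)
    body += hashlib.sha1(body).digest()  # trailer: checksum of all preceding bytes
    return bytes(body)
```

The result would be written to a .pack file and handed to git-index-pack to
generate the index. This sketch builds the whole pack in memory for clarity;
an import of the size discussed here would presumably stream entries to disk
and update the checksum incrementally instead.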

That is under consideration, but the undeltified pack is about 12GB and
it takes forever (about a day) to deltify it. I'm not yet convinced
that an undeltified pack is any faster than just having the objects in
the directories.

The same data in a deltified pack is 700MB. That is a tremendous
difference in the amount of IO needed. The strategy has to be to avoid
IO; nothing I am doing is ever CPU bound.
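
For reference, the suggestion quoted earlier in the thread, repacking every
<n> bytes rather than every <n> files, could be sketched as a simple byte
counter. This is only an illustration: RepackTrigger is a hypothetical name,
and the threshold is a tuning value the thread does not specify.

```python
class RepackTrigger:
    """Signal a repack once enough bytes of new objects have accumulated."""

    def __init__(self, threshold_bytes):
        self.threshold_bytes = threshold_bytes  # assumed tuning knob
        self.pending_bytes = 0

    def add(self, object_size):
        """Record one newly written object; return True when a repack is due."""
        self.pending_bytes += object_size
        if self.pending_bytes >= self.threshold_bytes:
            self.pending_bytes = 0  # the caller is expected to repack now
            return True
        return False
```

The import loop would call add() with the size of each object it writes and
run a repack whenever it returns True, so a few very large files trigger a
repack as readily as many small ones.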

--
Jon Smirl
jonsmirl@xxxxxxxxx