Re: Creating objects manually and repack

Jon Smirl <jonsmirl@xxxxxxxxx> wrote:
> How about adding a flag to repack that simply says delete the objects
> when done with them? I'd still create all of the objects on disk.
> Repack would assume that they have at least been sorted by filename.
> So repack could read in object names until it sees a change in the
> file name, sort them by size, deltify, write out the pack and then
> delete the objects from that batch. Then repeat this process for the
> next file name on stdin.
> 
> I'm making two assumptions: first, that blocks from a deleted file
> don't get written to disk; and second, that by deleting the file the
> file system will reuse the same blocks over and over. If those
> assumptions are close to being true, the cache shouldn't thrash. They
> don't have to be totally true; close is good enough.
> 
> Of course, eliminating the files altogether will be even faster.
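(A minimal Python sketch of the batching Jon proposes above, purely for illustration: object names arrive sorted by file name, each run of identical file names forms a batch that is size-sorted, deltified, packed, and then deleted. The names read_size, deltify_batch, and delete_objects are hypothetical stand-ins, not real git internals.)

```python
from itertools import groupby

def repack_batches(entries, read_size, deltify_batch, delete_objects):
    """entries: iterable of (object_name, file_name) pairs,
    already sorted by file_name, as repack would assume."""
    packed = []
    for _, group in groupby(entries, key=lambda e: e[1]):
        batch = [name for name, _ in group]
        # sort this file's objects by size before delta compression
        batch.sort(key=read_size)
        packed.extend(deltify_batch(batch))
        # free the loose objects so the same disk blocks get reused
        delete_objects(batch)
    return packed
```

Because each batch is deleted before the next file name is read, the working set stays small and, per Jon's assumption, the page cache shouldn't thrash.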

See the email I just sent you.  The only file being written is the
pack file that's being generated.  No temporary files, no temporary
inodes, no temporary blocks.  Only two passes over the data: one to
write it out and a second to generate the SHA-1.  I do two passes
rather than keeping it all in memory, to prevent the program from
blowing up on extremely large inputs.
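(The two-pass idea can be sketched as follows, assuming the pack data arrives as a stream of chunks; write_then_hash is an illustrative name, not a git function: pass one streams the data straight to disk, pass two re-reads the finished file in bounded pieces to compute its SHA-1, so nothing large is ever held in memory.)

```python
import hashlib

def write_then_hash(chunks, path):
    # pass 1: write every chunk out to disk as it is produced
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
    # pass 2: re-read the file in bounded pieces to build the SHA-1
    h = hashlib.sha1()
    with open(path, "rb") as f:
        while True:
            piece = f.read(65536)
            if not piece:
                break
            h.update(piece)
    return h.hexdigest()
```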

It may be possible to tweak git-pack-objects to get what you propose
above, but to be honest I think the git-fast-import I just sent
was easier, especially as it avoids the temporary loose object stage.

-- 
Shawn.
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
