On Wed, 13 Dec 2006, David Tweed wrote:
How big is the "metadata" or "bookkeeping data" in git related to a commit? (Eg, "around x bytes per changed file"
or "around x bytes per file being tracked (whether changed in the commit or not)" )
[I'm trying to get a feel for how much overhead would come, if I switched to git, from having a cron job automatically take
a snapshot every hour (if anything has changed), plus manual snapshots at points where I want to feel "safeguarded".
I'm currently using my own simple, hacked-together system for combined versioning/backups that does
this. Measured with naive tools that don't account for space wasted by disk block size effects, the data being
tracked is currently just under 9 months of activity on 2016 filenames, with
17457599 bytes of data (ie, compressed versions of their contents at various times) and 7838546 bytes
of "metadata", ie, about 30 percent of the stored data is metadata. This is in a format using 6 bytes to associate a single blob of
contents to a filename (whether changed since last snapshot or not).]
If nothing has changed, it will take only the space of the commit object, since the tree
will remain the same (and you should be able to script detection of this case
to skip the commit and make it zero overhead).
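
As a rough sketch of that detection (not anything git ships, just one way to script it), the cron job could ask `git status --porcelain` whether anything is listed and commit only when it is. The function names `has_changes` and `snapshot` and the commit message format are made up for illustration:

```python
import subprocess
from datetime import datetime, timezone


def has_changes(porcelain_output: str) -> bool:
    """True if `git status --porcelain` listed at least one path."""
    return any(line.strip() for line in porcelain_output.splitlines())


def snapshot(repo_dir: str) -> bool:
    """Commit everything in repo_dir if anything changed.

    Returns True if a commit was made, False if the tree was clean.
    """
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
    if not has_changes(status):
        return False  # nothing changed: no new commit object is written
    # Stage additions, modifications and deletions, then record the snapshot.
    subprocess.run(["git", "add", "-A"], cwd=repo_dir, check=True)
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%SZ")
    subprocess.run(
        ["git", "commit", "-q", "-m", "snapshot " + stamp],
        cwd=repo_dir, check=True,
    )
    return True
```

An hourly cron entry would then just call `snapshot()` on the working directory; hours with no changes add nothing to the repository.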
If something has changed, you will have the new tree and the changed objects.
In a tree, each entry is ~28 bytes (IIRC from what Linus mentioned in the last
week or two).
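
For a back-of-the-envelope feel, the ~28 bytes matches the fixed part of a tree entry for a regular file: a 6-character mode, a space, a NUL and a 20-byte binary SHA-1. The file name comes on top of that, and the whole tree is compressed afterwards. The sketch below just does that arithmetic; the file names (12 characters each, all in one flat directory) are an assumption picked for illustration, not your actual layout:

```python
SHA1_BYTES = 20  # raw (binary) SHA-1 stored in each tree entry


def tree_entry_size(name: str, mode: str = "100644") -> int:
    """Uncompressed bytes of one tree entry: mode, space, name, NUL, SHA-1."""
    return len(mode) + 1 + len(name.encode("utf-8")) + 1 + SHA1_BYTES


def flat_tree_size(names) -> int:
    """Uncompressed payload of one tree listing all names as regular files."""
    return sum(tree_entry_size(n) for n in names)


# Hypothetical layout: 2016 files named file0000.txt .. file2015.txt.
names = [f"file{i:04d}.txt" for i in range(2016)]
print(tree_entry_size(names[0]))  # 40 bytes per entry with a 12-character name
print(flat_tree_size(names))      # 80640 bytes for the whole tree, before zlib
```

With subdirectories the cost per snapshot is lower than this flat case, since only the trees on the path to a changed file get rewritten and unchanged subtrees are shared between commits.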
A loose object is compressed, and if you repack, it will be stored as a delta against prior
versions for even more space savings.
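
To make the loose-object part concrete, here is a minimal sketch of how a blob is stored loose: a short `blob <size>` header plus a NUL, then the contents, named by the SHA-1 of all of that and written zlib-compressed. The helper name `loose_blob` is made up, and this only covers the loose case; the delta compression done by a repack is not shown:

```python
import hashlib
import zlib


def loose_blob(data: bytes):
    """Return (object id, zlib-compressed bytes) for a blob holding data."""
    # The header records the object type and the uncompressed content length.
    store = f"blob {len(data)}\0".encode("ascii") + data
    # The object id is the SHA-1 of header + contents; the file on disk
    # holds the zlib-compressed form of the same bytes.
    return hashlib.sha1(store).hexdigest(), zlib.compress(store)


oid, packed = loose_blob(b"")
print(oid)  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, the empty blob
```

Since identical contents always hash to the same name, a file that did not change between snapshots costs no new blob at all.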
Look at the size of the mozilla tree and the kernel tree and you will see that,
when packed, git is about as efficient as any other option you have (and more
efficient than most).
David Lang