Re: Problem with large files on different OSes

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Thu, 28 May 2009 12:49:12 -0700 (PDT)

On Thu, 28 May 2009, Jeff King wrote:
> 
> > So my "fixed chunk" approach would be nice in that if you have this kind 
> > of "chunkblob" entry, in the tree (and index) it would literally be one 
> > entry, and look like that:
> > 
> >    100644 chunkblob <sha1>
> 
> But if I am understanding you correctly, you _are_ proposing to munge
> the git data structure here. Which means that pre-chunkblob trees will
> point to the raw blob, and then post-chunkblob trees will point to the
> chunked representation. And that means not being able to use the sha-1
> to see that they eventually point to the same content.

Yes. If we were to do this, and people have large chunks, then once you
start using the chunkblob (for lack of a better word) model, you'll see
the same content under two different SHA1s. But it's a one-time thing
(and one-way, since once it's a chunkblob, older models can't touch it),
so it can never cause any long-term confusion.
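
To make the "two different SHA1s" point concrete, here is a rough sketch.
The blob hashing is what git does today ("blob <size>\0" followed by the
content). The chunkblob part is purely an assumption for illustration: the
"chunkblob" header, the fixed chunk size, and the payload being a list of
chunk SHA1s are made up here, since no on-disk format has been settled.

```python
import hashlib


def git_object_id(obj_type: str, payload: bytes) -> str:
    """SHA-1 over "<type> <size>\\0<payload>", the way git names objects."""
    header = f"{obj_type} {len(payload)}".encode() + b"\0"
    return hashlib.sha1(header + payload).hexdigest()


def blob_id(data: bytes) -> str:
    """Object name of a plain blob holding `data`."""
    return git_object_id("blob", data)


def chunkblob_id(data: bytes, chunk_size: int = 4) -> str:
    """Object name of a HYPOTHETICAL chunkblob for the same content.

    Assumed format (not an existing git object type): the payload is the
    concatenated 20-byte SHA-1s of fixed-size chunks, each chunk stored
    as an ordinary blob.
    """
    chunk_ids = [
        bytes.fromhex(blob_id(data[i:i + chunk_size]))
        for i in range(0, len(data), chunk_size)
    ]
    return git_object_id("chunkblob", b"".join(chunk_ids))


content = b"the same large file content"
raw_name = blob_id(content)
chunked_name = chunkblob_id(content)
print(raw_name, chunked_name, raw_name == chunked_name)
```

The two names differ even though the bytes you'd check out are identical,
so a pre-chunkblob tree and a post-chunkblob tree can't be matched up by
SHA1 alone. That is the one-time cost described above.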

(We'll end up with something similar if somebody ever breaks SHA-1 enough 
for us to care - the logical way to handle it is likely to just accept the 
SHA512-160 object name "aliases")
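
A sketch of what such an alias could look like, assuming "SHA512-160"
means SHA-512 truncated to its first 160 bits and that the object header
stays the same as today. Both are assumptions made for illustration, and
the function names are invented for this example.

```python
import hashlib


def object_bytes(obj_type: str, payload: bytes) -> bytes:
    """The bytes git hashes for an object: "<type> <size>\\0<payload>"."""
    return f"{obj_type} {len(payload)}".encode() + b"\0" + payload


def sha1_name(obj_type: str, payload: bytes) -> str:
    """Current object name: SHA-1 of the object bytes."""
    return hashlib.sha1(object_bytes(obj_type, payload)).hexdigest()


def sha512_160_name(obj_type: str, payload: bytes) -> str:
    """ASSUMED alias: the first 160 bits (20 bytes) of SHA-512."""
    return hashlib.sha512(object_bytes(obj_type, payload)).digest()[:20].hex()


data = b"hello\n"
old_name = sha1_name("blob", data)
new_name = sha512_160_name("blob", data)
# Same content, two names of the same length; a repository would have to
# accept either one as referring to the same object.
print(old_name, new_name)
```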

			Linus