On Wed, Nov 20, 2013 at 04:33:50PM -0400, Joey Hess wrote:

> I've got a git repository of < 2 mb, where git wants to
> allocate a rather insane amount of memory:
>
> >git fsck
> Checking object directories: 100% (256/256), done.
> fatal: Out of memory, malloc failed (tried to allocate 124865231165 bytes)
>
> > git show 11644b5a075dc1425e01fbba51c045cea2d0c408
> fatal: Out of memory, malloc failed (tried to allocate 124865231165 bytes)
>
> The problem seems to be the attached object file, which has gotten
> corrupted, presumably in the header that git reads to see how large it
> is. Thought I'd report this in case there is some easy way to
> add a sanity check.

Definitely a corrupt object. The start is not a valid zlib header, so we
guess that it is an "experimental loose object". This is a format that
git wrote for a very short period as a performance experiment; it didn't
pan out and we no longer write it. The experimental format contains the
(purported) object size outside of the checksum'd zlib data (whereas the
normal format has a human-readable header that gets zlib'd). Your
corrupted bytes end up specifying a ridiculously large size.

I wonder if it is time to drop reading support for the experimental
objects. The format was never widely used, and was deprecated in v1.5.2
by 726f852 (deprecate the new loose object header format, 2007-05-09).

That would improve the case when the initial bytes of a loose object are
corrupted, because we would complain about the bogus zlib data before
trying to allocate the buffer.

The problem would still remain for packfiles, which use a similar
encoding, but I suspect it is less common there. For a single-byte
corruption, it is unlikely to be right in the length header. But for
absolute junk that is not git data at all, the first bytes are very
likely to be corrupted. In the pack case, we would notice early that it
does not look like a packfile; for the loose object, we have no such
header and proceed with the allocation.

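To make the failure mode concrete, here is a rough Python sketch of the
heuristic described above. It is an illustration, not git's actual C
code: the function names are made up, the zlib test assumes a first byte
of 0x78 plus the RFC 1950 "multiple of 31" rule, and the size decoding
assumes the experimental header uses the same type/size varint as a pack
entry.

```python
from typing import Tuple


def looks_like_zlib(data: bytes) -> bool:
    """Guess whether data starts with a zlib deflate header (hypothetical helper)."""
    if len(data) < 2:
        return False
    # Assumption: deflate with a 32K window, so the first byte is 0x78.
    # RFC 1950 requires the first two bytes, read as a big-endian
    # 16-bit number, to be a multiple of 31.
    word = (data[0] << 8) | data[1]
    return data[0] == 0x78 and word % 31 == 0


def parse_varint_header(data: bytes) -> Tuple[int, int, int]:
    """Decode a pack-style header; returns (type, size, bytes_used)."""
    if not data:
        raise ValueError("empty header")
    c = data[0]
    used = 1
    obj_type = (c >> 4) & 7   # bits 4-6 of the first byte
    size = c & 0x0F           # low 4 bits start the size
    shift = 4
    while c & 0x80:           # high bit set: another size byte follows
        if used >= len(data):
            raise ValueError("truncated header")
        c = data[used]
        used += 1
        size |= (c & 0x7F) << shift
        shift += 7
    return obj_type, size, used


def claimed_size(data: bytes) -> int:
    """Size a reader would try to allocate if it trusted a non-zlib header."""
    if looks_like_zlib(data):
        raise ValueError("normal loose object; size is inside the zlib stream")
    return parse_varint_header(data)[1]
```

Nothing in the second path checks the decoded size against anything, so
a few junk bytes with the high bit set are enough to produce an enormous
allocation request. Dropping the fallback means such input fails the
zlib check instead.
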
As for your specific corruption, I can't make heads or tails of it. It
is not a single-bit error. The first two bytes of a loose object should
always be <0x78, 0x01>, which is the standard zlib deflate header. Your
bytes aren't even close, and decoding the rest with a corrupted zlib
header seems fruitless.

You don't happen to have another copy of the object (or of the data
contained in the object, such as the working tree file), do you? It
might be interesting to see a comparison of the bytes of the correct
data and your corruption.

-Peff