On Tue, Aug 4, 2009 at 1:01 AM, George Spelvin <linux@xxxxxxxxxxx> wrote:
>> Would there happen to be a SHA1 implementation around that can compute
>> the SHA1 without first decompressing the data? Databases gain a lot of
>> speed by using special algorithms that can directly operate on the
>> compressed data.
>
> I can't imagine how. In general, this requires that the compression
> be carefully designed to be compatible with the algorithms, and SHA1
> is specifically designed to depend on every bit of the input in
> an un-analyzable way.

A simple start would be to feed each byte, as it is decompressed, directly
into the SHA-1 code and avoid the intermediate buffer. Removing the buffer
reduces cache pressure.

> Also, git normally avoids hashing objects that it doesn't need
> uncompressed for some other reason. git-fsck is a notable exception,
> but I think the idea of creating special optimized code paths for that
> interferes with its reliability and robustness goals.

Agreed that there is no real need for this, just something to play with
if you are trying for a speed record.

I'd much rather have a solution for the rebase problem where one side of
the diff has moved to a different file and rebase can't figure it out.

--
Jon Smirl
jonsmirl@xxxxxxxxx
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html