On Tue, Mar 22, 2011 at 02:27, madmarcos <fru574@xxxxxxxxxxx> wrote:
> I am learning about the Git packfile and currently trying to reproduce (in
> Java) what I believe to be the SHA1 20-byte checksum for the entire
> packfile. I take the byte array from, and including, the "PACK" 4-byte
> header to the end of the last packaged object's compressed data. Everything
> I have read indicates that the next 20 bytes is the SHA1 checksum for the
> entire packfile.

That is correct.

> The 20-byte checksum that is part of the byte array received from Git is:
> B910248BF9B63AC53595E3835CA57BDAF08DA830
>
> I use the following to calculate my own SHA1 checksum:
> crypt = MessageDigest.getInstance("SHA-1");
> crypt.reset();
> crypt.update(testData);
> byte [] result = crypt.digest();

Looks right, assuming that testData is "PACK..." up to but not including
the last 20 bytes. :-)

> My result ends up as: B910248BF9B63AC53595E3835CA57BDAF08DA813
>
> I am baffled at how only the last byte of my result can be different from
> Git's (if I am using the correct part of the byte stream). If the only
> problem was the range of data passed to digest() then the entire calculated
> checksum would most likely look different.

I've never seen SHA-1 produce a value this close to the expected value
before. My first guess is that it's a problem elsewhere in your code,
like your byte[]->hex formatter, or that the code reading the 20-byte
trailer from Git is reading the wrong thing.

> Note: I use the same code to generate test SHA1 ids for each contained
> object and they match the references in the tree objects. This problem
> currently only involves calculating the checksum over the entire packfile.

But aside from just learning about Git, if you want to work with Git in
Java... use JGit[1]. :-)

[1] http://www.eclipse.org/jgit/

--
Shawn.
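
To rule out the two suspects named above (the hex formatter and the code that reads the trailer), something like the following may help. It is only a sketch: the class and method names (PackTrailerCheck, toHex, computeTrailer, storedTrailer, trailerMatches) are made up for illustration and are not from the original poster's code or from JGit. It assumes the whole packfile, trailer included, is already in a single byte[]. It compares the raw digest bytes first, so the hex formatter is only used for display.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper: all names here are illustrative assumptions.
class PackTrailerCheck {
    static final int TRAILER_LEN = 20; // length of a SHA-1 digest in bytes

    // Format each byte as exactly two uppercase hex digits.
    // Masking with 0xFF avoids sign extension of negative bytes.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // SHA-1 over everything from "PACK" up to, but not including,
    // the last 20 bytes of the buffer.
    static byte[] computeTrailer(byte[] pack) {
        if (pack.length < TRAILER_LEN) {
            throw new IllegalArgumentException("buffer shorter than a SHA-1 trailer");
        }
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            md.update(pack, 0, pack.length - TRAILER_LEN);
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // The 20 bytes that the pack itself carries as its checksum.
    static byte[] storedTrailer(byte[] pack) {
        return Arrays.copyOfRange(pack, pack.length - TRAILER_LEN, pack.length);
    }

    // Compare raw bytes, so a formatting bug cannot affect the result.
    static boolean trailerMatches(byte[] pack) {
        return MessageDigest.isEqual(computeTrailer(pack), storedTrailer(pack));
    }
}
```

If trailerMatches() returns true while the printed hex strings still differ, the formatter is the likely culprit. If it returns false, printing toHex(storedTrailer(pack)) next to the value Git reports should show whether the trailer bytes are being read from the wrong offset.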