I'm not criticizing JGit, guys. It simply doesn't fit our needs. We're
not interested in mapping git commands to Java, and we don't have the
same RAM limitations. I know the JGit team is doing a great job, and we
do not intend to build a library with that level of completeness.

Are you guys JGit contributors? Can you point me to the code that
unpacks git objects? The closest I could find was this class:

https://github.com/eclipse/jgit/blob/master/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/file/UnpackedObject.java

There seem to be a standard and a non-standard format for the packed
object, judging by the comments in this method:

https://github.com/eclipse/jgit/blob/master/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/file/UnpackedObject.java#L272

I suspect that the default inflater class of the Java API expects the
object to be in the standard format.

What does the following comment mean? What is the "Experimental
pack-based" format? Are there any docs on its spec?

  We must determine if the buffer contains the standard
  zlib-deflated stream or the experimental format based
  on the in-pack object format. Compare the header byte
  for each format:

  RFC1950 zlib w/ deflate : 0www1000 : 0 <= www <= 7
  Experimental pack-based : Stttssss : ttt = 1,2,3,4

--
Chico Sokol

On Wed, May 22, 2013 at 2:59 AM, Shawn Pearce <spearce@xxxxxxxxxxx> wrote:
> On Tue, May 21, 2013 at 3:18 PM, Chico Sokol <chico.sokol@xxxxxxxxx> wrote:
>> Ok, we discovered that the commit object actually contains the tree
>> object's sha1, by reading its contents with python zlib library.
>>
>> So the bug must be with our java code (we're building a java lib).
>>
>> Is there any non-standard issue in git's zlib compression?
>> We're decompressing its contents with java default zlib api, so it should
>> work normally, here's our code, that's printing that wrong output:
>>
>> import java.io.File;
>> import java.io.FileInputStream;
>> import java.util.zip.InflaterInputStream;
>> import org.apache.commons.io.IOUtils;
>> ...
>> File obj = new File(".git/objects/25/0f67ef017fcb97b5371a302526872cfcadad21");
>> InflaterInputStream inflaterInputStream = new InflaterInputStream(new
>> FileInputStream(obj));
>> System.out.println(IOUtils.readLines(inflaterInputStream));
> ...
>>>> Currently, we're trying to parse commit objects. After decompressing
>>>> the contents of a commit object file we got the following output:
>>>>
>>>> commit 191
>>>> author Francisco Sokol <chico.sokol@xxxxxxxxx> 1369140112 -0300
>>>> committer Francisco Sokol <chico.sokol@xxxxxxxxx> 1369140112 -0300
>>>>
>>>> first commit
>
> Your code is broken. IOUtils is probably corrupting what you get back.
> After inflating the stream you should see the object type ("commit"),
> space, its length in bytes as a base 10 string, and then a NUL ('\0').
> Following that is the tree line, and parent(s) if any. I wonder if
> IOUtils discarded the remainder of the line after the NUL and did not
> consider the tree line.
>
> And you wonder why JGit code is confusing. We can't rely on "standard
> Java APIs" to do the right thing, because commonly used libraries have
> made assumptions that disagree with the way Git works.
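
For what it's worth, the header layout described in the reply quoted
above (type, space, decimal length, NUL, then the body) can be read
with plain java.util.zip by working on raw bytes instead of lines. This
is only a minimal sketch, assuming the object is stored as a standard
zlib stream; LooseObject and its methods are hypothetical names made up
for illustration, not JGit classes, and the demo body in main is a
placeholder rather than real repository data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical helper (not JGit code): an inflated loose object split at the NUL. */
final class LooseObject {
    final String type;  // e.g. "commit", "tree", "blob"
    final byte[] body;  // every byte after the NUL terminator of the header

    private LooseObject(String type, byte[] body) {
        this.type = type;
        this.body = body;
    }

    /** Inflates a standard zlib stream and parses "<type> <size>\0<body>". */
    static LooseObject read(InputStream compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InflaterInputStream in = new InflaterInputStream(compressed)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        byte[] raw = out.toByteArray();

        // Find the NUL that ends the header; work on bytes, never on text lines.
        int nul = 0;
        while (nul < raw.length && raw[nul] != 0) {
            nul++;
        }
        if (nul == raw.length) {
            throw new IOException("no NUL terminator after object header");
        }

        String header = new String(raw, 0, nul, StandardCharsets.US_ASCII);
        int space = header.indexOf(' ');
        if (space < 0) {
            throw new IOException("malformed object header: " + header);
        }
        String type = header.substring(0, space);
        // A non-numeric size makes parseInt throw NumberFormatException.
        int size = Integer.parseInt(header.substring(space + 1));

        byte[] body = Arrays.copyOfRange(raw, nul + 1, raw.length);
        if (body.length != size) {
            throw new IOException("header says " + size + " bytes, found " + body.length);
        }
        return new LooseObject(type, body);
    }

    /** Demo helper: builds a zlib-compressed object in memory so no repository is needed. */
    static byte[] deflate(String type, byte[] body) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream z = new DeflaterOutputStream(out)) {
            z.write((type + " " + body.length + "\0").getBytes(StandardCharsets.US_ASCII));
            z.write(body);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Placeholder commit body, not taken from a real repository.
        byte[] body = "tree <placeholder-sha1>\n\nfirst commit\n".getBytes(StandardCharsets.UTF_8);
        LooseObject obj = read(new ByteArrayInputStream(deflate("commit", body)));
        System.out.println(obj.type + " " + obj.body.length);
        System.out.print(new String(obj.body, StandardCharsets.UTF_8));
    }
}
```

To try it on a real object, pass a FileInputStream for a file under
.git/objects to LooseObject.read. The body is kept as a byte array on
purpose: tree objects contain raw binary SHA-1s, so decoding everything
as text would not be safe in general.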
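
As for the header-byte comparison in the JGit comment quoted at the top,
here is one way it could be expressed in code. This is a sketch under
assumptions, not JGit's implementation: it combines the bit pattern from
that comment (0www1000, i.e. bit 7 clear and low nibble equal to 8) with
the RFC 1950 rule that the first two bytes, read as a 16-bit big-endian
number, must be a multiple of 31. ZlibHeader and looksLikeZlib are
hypothetical names.

```java
/** Hypothetical helper, not JGit's implementation. */
final class ZlibHeader {
    private ZlibHeader() {
    }

    /**
     * Returns true if the first two bytes look like an RFC 1950 zlib header:
     * the first byte matches 0www1000 (bit 7 clear, compression method 8 = deflate)
     * and the two bytes taken as a big-endian 16-bit value are divisible by 31.
     */
    static boolean looksLikeZlib(byte[] hdr) {
        if (hdr == null || hdr.length < 2) {
            return false;
        }
        int first = hdr[0] & 0xff;
        int second = hdr[1] & 0xff;
        boolean deflateMethod = (first & 0x8f) == 0x08;          // 0www1000
        boolean checkBitsOk = ((first << 8) | second) % 31 == 0; // RFC 1950 FCHECK
        return deflateMethod && checkBitsOk;
    }

    public static void main(String[] args) {
        // 0x78 0x9c is the header zlib commonly emits at its default compression level.
        byte[] typical = {0x78, (byte) 0x9c};
        System.out.println(looksLikeZlib(typical)); // prints "true"
    }
}
```

A buffer whose first byte has bit 7 set (the "S" bit in Stttssss) fails
the first test immediately, so under these assumptions it would be
treated as something other than a plain zlib stream and not handed
straight to Inflater.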