Re: Reading commit objects

I'm not criticizing JGit, guys. It simply doesn't fit our needs.
We're not interested in mapping git commands to Java, and we don't have
the same RAM limitations.

I know the JGit team is doing a great job, and we do not intend to build
a library with such completeness.

Are you guys JGit contributors? Can you point me to the code that
unpacks git objects? The closest I could find was this class:
https://github.com/eclipse/jgit/blob/master/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/file/UnpackedObject.java

There seem to be a standard and a non-standard format of the packed
object, judging by the comments in this method:
https://github.com/eclipse/jgit/blob/master/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/file/UnpackedObject.java#L272

I suspect that the default Inflater class of the Java API expects the
object to be in the standard format.

What does the following comment mean? What is the "Experimental
pack-based" format? Are there any docs on its spec?

We must determine if the buffer contains the standard
zlib-deflated stream or the experimental format based
on the in-pack object format. Compare the header byte
for each format:
RFC1950 zlib w/ deflate : 0www1000 : 0 <= www <= 7
Experimental pack-based : Stttssss : ttt = 1,2,3,4


--
Chico Sokol


On Wed, May 22, 2013 at 2:59 AM, Shawn Pearce <spearce@xxxxxxxxxxx> wrote:
> On Tue, May 21, 2013 at 3:18 PM, Chico Sokol <chico.sokol@xxxxxxxxx> wrote:
>> Ok, we discovered that the commit object actually contains the tree
>> object's sha1, by reading its contents with python zlib library.
>>
>> So the bug must be with our java code (we're building a java lib).
>>
>> Is there any non-standard issue in git's zlib compression? We're
>> decompressing its contents with java default zlib api, so it should
>> work normally, here's our code, that's printing that wrong output:
>>
>> import java.io.File;
>> import java.io.FileInputStream;
>> import java.util.zip.InflaterInputStream;
>> import org.apache.commons.io.IOUtils;
>> ...
>> File obj = new File(".git/objects/25/0f67ef017fcb97b5371a302526872cfcadad21");
>> InflaterInputStream inflaterInputStream = new InflaterInputStream(new
>> FileInputStream(obj));
>> System.out.println(IOUtils.readLines(inflaterInputStream));
> ...
>>>> Currently, we're trying to parse commit objects. After decompressing
>>>> the contents of a commit object file we got the following output:
>>>>
>>>> commit 191
>>>> author Francisco Sokol <chico.sokol@xxxxxxxxx> 1369140112 -0300
>>>> committer Francisco Sokol <chico.sokol@xxxxxxxxx> 1369140112 -0300
>>>>
>>>> first commit
>
> Your code is broken. IOUtils is probably corrupting what you get back.
> After inflating the stream you should see the object type ("commit"),
> space, its length in bytes as a base 10 string, and then a NUL ('\0').
> Following that is the tree line, and parent(s) if any. I wonder if
> IOUtils discarded the remainder of the line after the NUL and did not
> consider the tree line.
>
> And you wonder why JGit code is confusing. We can't rely on "standard
> Java APIs" to do the right thing, because commonly used libraries have
> made assumptions that disagree with the way Git works.
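
The layout Shawn describes above (type, space, decimal length, NUL,
then the body) can be read as raw bytes instead of text lines. The
following is only a sketch under that description: the LooseObject
class and its methods are hypothetical names, and the sample data in
main is invented, not a real commit.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical holder for an inflated loose object.
class LooseObject {
    final String type;
    final byte[] body;

    LooseObject(String type, byte[] body) {
        this.type = type;
        this.body = body;
    }

    // Inflates the whole stream into a byte array, with no text decoding.
    static byte[] inflate(InputStream compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InflaterInputStream in = new InflaterInputStream(compressed)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return out.toByteArray();
    }

    // Splits "<type> <size>\0<body>" at the first NUL byte.
    static LooseObject parse(byte[] raw) {
        int nul = 0;
        while (nul < raw.length && raw[nul] != 0) {
            nul++;
        }
        if (nul == raw.length) {
            throw new IllegalArgumentException("no NUL after header");
        }
        String header = new String(raw, 0, nul, StandardCharsets.US_ASCII);
        int space = header.indexOf(' ');
        if (space < 0) {
            throw new IllegalArgumentException("no space in header");
        }
        String type = header.substring(0, space);
        int size = Integer.parseInt(header.substring(space + 1));
        byte[] body = Arrays.copyOfRange(raw, nul + 1, raw.length);
        if (size != body.length) {
            throw new IllegalArgumentException("size mismatch");
        }
        return new LooseObject(type, body);
    }

    public static void main(String[] args) throws IOException {
        // Invented sample: a 5-byte body, deflated in memory for the demo.
        byte[] raw = "commit 5\0tree\n".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (DeflaterOutputStream d = new DeflaterOutputStream(compressed)) {
            d.write(raw);
        }
        LooseObject obj =
            parse(inflate(new ByteArrayInputStream(compressed.toByteArray())));
        System.out.println(obj.type + " with " + obj.body.length + " body bytes");
    }
}
```

For a real object, a FileInputStream on the file under .git/objects
would replace the in-memory stream. Keeping everything as bytes until
after the NUL avoids relying on how a line-based reader treats '\0'.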