Re: Manually decoding a git object

Philip Oakley <philipoakley@xxxxxxx> writes:

> From: "Thomas Rast" <trast@xxxxxxxxxxx> Sent: Monday, February 20,
> 2012 8:29 AM
>>
>> The SHA1 is over the decompressed object contents.  The file simply
>> holds a zlib-compressed stream of those contents.  (It's pretty much
>> like gzip without the file header.)
>>
>> You can use any bindings to zlib and something that does sha1, e.g. in
>> python:
>>
>>  $ cd g/.git/objects/aa/  # my git.git
>>  $ ls
>>  592bda986a8380b64acd8cbb3d5bdfcbc0834d
>>  6322a757bee31919f54edcc127608a3d724c99
>>  $ python
>>  Python 2.7.2 (default, Aug 19 2011, 20:41:43) [GCC] on linux2
>>  Type "help", "copyright", "credits" or "license" for more information.
>>  >>> import hashlib
>>  >>> hashlib.sha1(open('592bda986a8380b64acd8cbb3d5bdfcbc0834d').read().decode('zlib')).digest().encode('hex')
>>  'aa592bda986a8380b64acd8cbb3d5bdfcbc0834d'
>>
>> Notice that the first byte of the hash goes into the directory name.
>>
>
> At the moment I'm in a Catch-22 situation where I can't make the first
> step of examining the deflated contents, so I can't do all those next
> steps to get the sha1 etc. Have I misunderstood your suggestions?

Huh?  The method I showed does not rely on knowing the SHA1.  The fact
that I used it on a properly filed away (by its SHA1) object file is
immaterial, if perhaps confusing.

I can untangle that python expression for you:

hashlib.sha1(foo).digest()      gives the SHA1 digest of the string foo, as a (binary) string
foo.encode('hex')               turns foo from (binary) string into its hex representation
open('filename').read()         opens the file called filename, and returns its whole contents
foo.decode('zlib')              applies the zlib decompressor to foo, and returns the resulting data

So that trick works for any file[*], and you can then use the resulting
hash to file it back where it needs to go: the first two hex digits name
the directory, and the remaining 38 name the file.
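
The 'zlib' and 'hex' string codecs used above exist only in Python 2.
As a rough sketch of the same steps for Python 3, using the zlib and
hashlib modules directly (the function names object_id and
loose_object_path are made up for illustration, not anything git
provides):

```python
import hashlib
import zlib


def object_id(compressed: bytes) -> str:
    """Return the hex SHA-1 of the zlib-decompressed bytes."""
    # Python 2 spelling: foo.decode('zlib')
    contents = zlib.decompress(compressed)
    # Python 2 spelling: hashlib.sha1(contents).digest().encode('hex')
    return hashlib.sha1(contents).hexdigest()


def loose_object_path(hex_id: str) -> str:
    """Hypothetical helper: where the file goes, relative to .git/objects.

    The first byte of the hash (two hex digits) is the directory name,
    the remaining hex digits are the file name.
    """
    return hex_id[:2] + "/" + hex_id[2:]
```

To use it, read the file in binary mode, e.g.
object_id(open('filename', 'rb').read()), and pass the result to
loose_object_path() to see where the file should live.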


[*] that is sufficiently small for Python to hold it in memory, but git
shares the same problems in that department.
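
If the memory limit ever does matter on the Python side, one possible
way around it is to decompress and hash in chunks.  This is only a
sketch, assuming a file object opened in binary mode; the function name
streaming_object_id and the default chunk size are arbitrary choices:

```python
import hashlib
import zlib


def streaming_object_id(stream, chunk_size=65536):
    """Hex SHA-1 of a zlib stream, read from a binary file object in chunks."""
    decompressor = zlib.decompressobj()
    digest = hashlib.sha1()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Feed each decompressed piece to the hash as it becomes available.
        digest.update(decompressor.decompress(chunk))
    # Pick up anything the decompressor is still buffering.
    digest.update(decompressor.flush())
    return digest.hexdigest()
```

This gives the same digest as the one-shot expression, but only keeps
one compressed chunk and its decompressed output in memory at a time.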

-- 
Thomas Rast
trast@{inf,student}.ethz.ch