Hi,

On Wed, 9 May 2007, Daniel Barkalow wrote:

> On Wed, 9 May 2007, Johannes Schindelin wrote:
>
> > On Tue, 8 May 2007, Karl Hasselström wrote:
> >
> > > On 2007-05-08 23:07:04 +0200, Johannes Schindelin wrote:
> > >
> > > > On Tue, 8 May 2007, Karl Hasselström wrote:
> > > >
> > > > > On 2007-05-08 17:10:47 +0200, Johannes Schindelin wrote:
> > > > >
> > > > > > + char *`, but is actually expected to be a pointer to `unsigned
> > > > > > + char[20]`. This variable will contain the big endian version of the
> > > > > > + 40-character hex string representation of the SHA-1.
> > > > >
> > > > > Either it should be "unsigned char[40]" (or possibly 41 with a
> > > > > terminating \0), or else you shouldn't be talking about
> > > > > hexadecimal since it's just a 20-byte big-endian unsigned integer.
> > > > > (A third possibility is that I'm totally confused.)
> > > >
> > > > It is 40 hex characters, but 20 _bytes_. If you have any ideas how
> > > > to formulate that better than I did...
> > >
> > > I think this is less confusing:
> > >
> > >     This variable will contain the 160-bit SHA-1.
> > >
> > > It avoids talking of hex, since it's not really stored in hex format
> > > any more than any other binary number with a number of bits divisible
> > > by four. And it avoids saying big-endian, which is not relevant anyway
> > > since we don't use hashes as integers.
> >
> > Well, I do not buy into that. First, we _have_ to say that it is
> > big-endian. It was utterly confusing to _me_ that the hash was not
> > little-endian, as I expected on an Intel processor.
>
> SHA-1 is defined as producing an octet sequence, and to have a canonical
> hex digit sequence conversion with the high nibbles first. Internally,
> it is canonically specified using big-endian math, but the same
> algorithm could equally be specified with little-endian math and
> different rules for input and output.
>
> > And I'd rather mention the hex representation (what you see in git-log
> > and git-ls-tree). This helps debugging, believe me.
>
> It's kind of important to distinguish between the hex representation and
> the octet representation, because your code will not work at all if you
> use the wrong one. And "unsigned char *" or "unsigned char[20]" is
> always the octets; the hex is always "char *". Primarily mentioning the
> one that is more intuitive but less frequently used doesn't help with
> understanding the actual code.

That's a really good idea: to point out that "unsigned char *" refers to
the octets, while "char *" refers to the ASCII representation. I will add
this, together with a simple example (the initial commit); see the rough
sketch in the P.S. below.

Ciao,
Dscho
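
P.S.: Here is a first rough sketch of such an example, as a standalone
program rather than one using git's own helpers (hexval() and
hex_to_sha1() are made up for this sketch; in git proper you would call
get_sha1_hex() and sha1_to_hex() instead). The hash used here is
git.git's initial commit:

	#include <stdio.h>

	/*
	 * Value of one hex digit, or -1 if it is not a hex digit.
	 * (git prints hashes in lowercase, so that is all we handle.)
	 */
	static int hexval(char c)
	{
		if (c >= '0' && c <= '9')
			return c - '0';
		if (c >= 'a' && c <= 'f')
			return c - 'a' + 10;
		return -1;
	}

	/*
	 * Parse a 40-character hex string ("char *") into the 20 raw
	 * octets ("unsigned char[20]").  Returns 0 on success, -1 on
	 * malformed input.
	 */
	static int hex_to_sha1(const char *hex, unsigned char *sha1)
	{
		int i;
		for (i = 0; i < 20; i++) {
			int hi = hexval(hex[2 * i]);
			int lo = hexval(hex[2 * i + 1]);
			if (hi < 0 || lo < 0)
				return -1;
			/* high nibble first: the "big-endian" part */
			sha1[i] = (unsigned char)(hi << 4 | lo);
		}
		return 0;
	}

	int main(void)
	{
		/* git.git's initial commit, as you see it in git-log */
		const char *hex = "e83c5163316f89bfbde7d9ab23ca2e25604af290";
		unsigned char sha1[20];
		int i;

		if (hex_to_sha1(hex, sha1))
			return 1;

		/* One octet corresponds to _two_ hex characters: */
		printf("first octet: 0x%02x\n", sha1[0]);

		/* And back again: 20 octets -> 40 hex characters. */
		for (i = 0; i < 20; i++)
			printf("%02x", sha1[i]);
		printf("\n");
		return 0;
	}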