This is the second round of adding a hashing example to user-manual.txt.

---
Changes in v2:
- Do not go into detail about hashing in the history.
- Change code according to coding guidelines.
- Fix a typo (s/asume/assume/) and change the wording of that sentence.
- Write Git instead of `git`.
- To fit the whole document, change sample content to "Hello world", length 12.
- Add verification of hash using `git hash-object`.
- Provide for empty lines around code blocks.
---

Dirk Gouders (1):
  Documentation/user-manual.txt: example for generating object hashes

 Documentation/user-manual.txt | 36 +++++++++++++++++++++++++++++++++--
 1 file changed, 34 insertions(+), 2 deletions(-)

Range-diff against v1:
1:  6995f866e7 ! 1:  568c59d69f Documentation/user-manual.txt: example for generating object hashes
    @@ Metadata
      ## Commit message ##
         Documentation/user-manual.txt: example for generating object hashes
     
    -    If someone spends the time to work through the documentation, the
    -    subject "hashes" can lead to contradictions:
    +    Add a simple example on how object hashes can be generated manually.
     
    -    The README of the initial commit states hashes are generated from
    -    compressed data (which changed very soon), whereas
    -    Documentation/user-manual.txt says they are generated from original
    -    data.
    -
    -    Don't give doubts a chance: clarify this and present a simple example
    -    on how object hashes can be generated manually.
    +    Further, because the document suggests to have a look at the initial
    +    commit, clarify that some details changed since that time.
     
         Signed-off-by: Dirk Gouders <dirk@xxxxxxxxxxx>
     
      ## Documentation/user-manual.txt ##
    -@@ Documentation/user-manual.txt: that is used to name the object is the hash of the original data
    +@@ Documentation/user-manual.txt: that not only specifies their type, but also provides size information
    + about the data in the object. It's worth noting that the SHA-1 hash
    + that is used to name the object is the hash of the original data
      plus this header, so `sha1sum` 'file' does not match the object name
    - for 'file'.
    - 
    -+Starting with the initial commit, hashing was done on the compressed
    -+data and the file README of that commit explicitely states this:
    -+
    -+"The SHA1 hash is always the hash of the _compressed_ object, not the
    -+original one."
    +-for 'file'.
    ++for 'file' (the earliest versions of Git hashed slightly differently
    ++but the conclusion is still the same).
     +
    -+This changed soon after that with commit
    -+d98b46f8d9a3 (Do SHA1 hash _before_ compression.). Unfortunately, the
    -+commit message doesn't provide the detailed reasoning.
    ++The following is a short example that demonstrates how these hashes
    ++can be generated manually:
     +
    -+The following is a short example that demonstrates how hashes can be
    -+generated manually:
    ++Let's assume a small text file with some simple content:
     +
    -+Let's asume a small text file with the content "Hello git.\n"
     +-------------------------------------------------
    -+$ cat > hello.txt <<EOF
    -+Hello git.
    -+EOF
    ++$ echo "Hello world" >hello.txt
     +-------------------------------------------------
     +
    -+We can now manually generate the hash `git` would use for this file:
    ++We can now manually generate the hash Git would use for this file:
     +
     +- The object we want the hash for is of type "blob" and its size is
    -+  11 bytes.
    ++  12 bytes.
     +
     +- Prepend the object header to the file content and feed this to
    -+  sha1sum(1):
    ++  `sha1sum`:
     +
     +-------------------------------------------------
    -+$ printf "blob 11\0" | cat - hello.txt | sha1sum
    -+7217614ba6e5f4e7db2edaa2cdf5fb5ee4358b57 .
    ++$ { printf "blob 12\0"; cat hello.txt; } | sha1sum
    ++802992c4220de19a90767f3000a79a31b98d0df7  -
     +-------------------------------------------------
     +
    ++This manually constructed hash can be verified using `git hash-object`
    ++which of course hides the addition of the header:
    ++
    ++-------------------------------------------------
    ++$ git hash-object hello.txt
    ++802992c4220de19a90767f3000a79a31b98d0df7
    ++-------------------------------------------------
     +
      As a result, the general consistency of an object can always be tested
      independently of the contents or the type of the object: all objects can
    - be validated by verifying that (a) their hashes match the content of the
    +@@ Documentation/user-manual.txt: $ git switch --detach e83c5163
    + ----------------------------------------------------
    + 
    + The initial revision lays the foundation for almost everything Git has
    +-today, but is small enough to read in one sitting.
    ++today (even though details may differ in a few places), but is small
    ++enough to read in one sitting.
    + 
    + Note that terminology has changed since that revision. For example, the
    + README in that revision uses the word "changeset" to describe what we
-- 
2.43.0
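
The example in the patch hard-codes the size (12) in the object header. As a
rough sketch only, assuming a POSIX shell with `wc`, `cut` and `sha1sum`
available, the same steps can be wrapped in a small helper that derives the
size from the file itself. The name `blob_hash` is hypothetical and is not
something Git provides:

```shell
# Hypothetical helper (not part of Git): print the SHA-1 blob object
# name of the file given as $1, built by hand as "blob <size>\0<content>".
blob_hash () {
	# Arithmetic expansion strips the padding some wc implementations emit.
	size=$(( $(wc -c <"$1") )) &&
	{ printf 'blob %d\0' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}
```

With the sample file from the patch, `blob_hash hello.txt` should print the
same value that `git hash-object hello.txt` reports there.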