I'm wondering if anyone knows of software that can dump all of a git repo's metadata, both stored and derived, to a format (SQL, XML, CSV, whatever) that is easy to import into a database or manipulate programmatically.

Background, for the interested:

There is a git repo HAPPY and a separate fork repo with a branch SAD. HAPPY is canonical. Files from HAPPY have been copied over to SAD on an irregular basis, so SAD has a mixture of files that are exactly the same as the version in some commit to HAPPY, and files that have diverged since the initial copy from HAPPY as per the needs of the SAD fork.

The end goal is to get a diff that shows only fork-specific changes: identify the common ancestor of each file, then diff the most recent forked file against it. Or, put another way:

(a) Remove any files from SAD's most recent commit that are exactly the same as the version in any commit to HAPPY.

(b) For each file still in SAD's most recent commit, walk backwards in SAD until a version is found that exists in HAPPY.

For (a), the two git commands below plus a little scripting look like enough:

    # HAPPY: Get all file hashes for a repo
    git verify-pack -v .git/objects/pack/*.idx > HAPPY.hashes
    grep ' blob ' HAPPY.hashes | awk '{print $1}' > HAPPY.blobs

    # SAD: Get hashes and paths from current checkout
    git ls-files --full-name -s | awk '{print $2" "$4}' > SAD.blobs

I haven't looked into (b) as much yet, but at the moment I'm thinking of using git log to get a chronological list of commit hashes, then walking backwards, at each checkout using git ls-files to dump the tree's hashes to a separate file.

-- 
Daniel J Clark - off-list: djc @ first initial last name . us
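
P.S. The "little scripting" for step (a) could look roughly like the sketch below. It assumes HAPPY.blobs holds one blob hash per line and SAD.blobs holds "hash path" lines, as produced by the commands above. The function name fork_specific and the output file name are made up for illustration. One caveat: git verify-pack only lists packed objects, so loose objects in HAPPY may be missed unless the repo is repacked first (for example with git gc).

```shell
#!/bin/sh
# fork_specific (hypothetical helper): print the "hash path" lines from the
# SAD list whose blob hash does not appear anywhere in the HAPPY list.
#   $1 = file with one HAPPY blob hash per line (e.g. HAPPY.blobs)
#   $2 = file with "hash path" lines from SAD   (e.g. SAD.blobs)
fork_specific() {
    happy_blobs=$1
    sad_blobs=$2
    awk -v happy="$happy_blobs" '
        # Load every HAPPY hash into a lookup table before reading SAD.
        BEGIN { while ((getline line < happy) > 0) seen[line] = 1 }
        # Print only the SAD entries whose hash is unknown to HAPPY.
        !($1 in seen)
    ' "$sad_blobs"
}

# Example invocation (assumed file names):
#   fork_specific HAPPY.blobs SAD.blobs > SAD.fork-specific
```

What is left in the output is the set of files that step (b) would still need to trace back through SAD's history. Paths containing whitespace survive here, but they would already have been truncated by the awk '{print $2" "$4}' step that builds SAD.blobs.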