These patches add fixes and features to the "git patch-id" command, mostly discovered through our usage of patch-id in the revup project (https://github.com/Skydio/revup). On top of that I've tried to make general cleanup changes where I can. Summary: 1: Fixed a bug in the combination of --stable with binary files and header-only, and expanded the test to cover both binary and non-binary files. 2: Switch internal usage of patch-id in rebase / cherry-pick to use the stable variant to reduce the number of code paths and improve testing for bugs like above. 3: Fixed bugs with patch-id and binary diffs. Previously patch-id did not behave correctly for binary diffs regardless of whether "--binary" was given to "diff". 4: Fixed bugs with patch-id and mode changes. Previously mode changes were incorrectly excluded from the patch-id. 5: Add a new "--include-whitespace" mode to patch-id that prevents whitespace from being stripped during id calculation. Also add a config option for the same behavior. 6: Remove unused prefix from patch-id logic. 7: Update format-patch doc to specify when patch-ids are going to be equal to those generated by "git patch-id". V1->V2: Fixed comment style V2->V3: The ---/+++ lines no longer get added to the patch-id of binary diffs. Also added patches 3-7 in the series. Signed-off-by: Jerry Zhang jerry@xxxxxxxxxx Jerry Zhang (7): patch-id: fix stable patch id for binary / header-only patch-id: use stable patch-id for rebases builtin: patch-id: fix patch-id with binary diffs patch-id: fix patch-id for mode changes builtin: patch-id: add --include-whitespace as a command mode builtin: patch-id: remove unused diff-tree prefix documentation: format-patch: clarify requirements for patch-ids to match Documentation/git-format-patch.txt | 4 +- Documentation/git-patch-id.txt | 25 +++++-- builtin/log.c | 2 +- builtin/patch-id.c | 114 +++++++++++++++++++++-------- diff.c | 75 +++++++++---------- diff.h | 2 +- patch-ids.c | 10 +-- patch-ids.h | 2 +- t/t3419-rebase-patch-id.sh | 63 +++++++++++++--- t/t4204-patch-id.sh | 95 ++++++++++++++++++++++-- 10 files changed, 291 insertions(+), 101 deletions(-) base-commit: d420dda0576340909c3faff364cfbd1485f70376 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1359%2Fjerry-skydio%2Fjerry%2Frevup%2Fmaster%2Fpatch_ids-v3 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1359/jerry-skydio/jerry/revup/master/patch_ids-v3 Pull-Request: https://github.com/gitgitgadget/git/pull/1359 Range-diff vs v2: 1: 945508df7b6 ! 1: 7d4c2e91ce0 patch-id: fix stable patch id for binary / header-only @@ Commit message populates the object ids, and normal which populates the text diff. All branches will end up flushing the hunk. + Don't populate the ---a/ and +++b/ lines for binary diffs, to correspond + to those lines not being present in the "git diff" text output. + This is necessary because we advertise that the patch-id calculated + internally and used in format-patch is the same that what the + builtin "git patch-id" would produce when piped from a diff. + Update the test to run on both binary and normal files. Signed-off-by: Jerry Zhang <jerry@xxxxxxxxxx> ## diff.c ## @@ diff.c: static int diff_get_patch_id(struct diff_options *options, struct object_id *oid - the_hash_algo->update_fn(&ctx, p->two->path, len2); + if (p->one->mode == 0) { + patch_id_add_string(&ctx, "newfilemode"); + patch_id_add_mode(&ctx, p->two->mode); +- patch_id_add_string(&ctx, "---/dev/null"); +- patch_id_add_string(&ctx, "+++b/"); +- the_hash_algo->update_fn(&ctx, p->two->path, len2); + } else if (p->two->mode == 0) { + patch_id_add_string(&ctx, "deletedfilemode"); + patch_id_add_mode(&ctx, p->one->mode); +- patch_id_add_string(&ctx, "---a/"); +- the_hash_algo->update_fn(&ctx, p->one->path, len1); +- patch_id_add_string(&ctx, "+++/dev/null"); +- } else { +- patch_id_add_string(&ctx, "---a/"); +- the_hash_algo->update_fn(&ctx, p->one->path, len1); +- patch_id_add_string(&ctx, "+++b/"); +- the_hash_algo->update_fn(&ctx, p->two->path, len2); } - if (diff_header_only) @@ diff.c: static int diff_get_patch_id(struct diff_options *options, struct object the_hash_algo->update_fn(&ctx, oid_to_hex(&p->two->oid), the_hash_algo->hexsz); - continue; +- } +- +- xpp.flags = 0; +- xecfg.ctxlen = 3; +- xecfg.flags = XDL_EMIT_NO_HUNK_HDR; +- if (xdi_diff_outf(&mf1, &mf2, NULL, +- patch_id_consume, &data, &xpp, &xecfg)) +- return error("unable to generate patch-id diff for %s", +- p->one->path); + } else { ++ if (p->one->mode == 0) { ++ patch_id_add_string(&ctx, "---/dev/null"); ++ patch_id_add_string(&ctx, "+++b/"); ++ the_hash_algo->update_fn(&ctx, p->two->path, len2); ++ } else if (p->two->mode == 0) { ++ patch_id_add_string(&ctx, "---a/"); ++ the_hash_algo->update_fn(&ctx, p->one->path, len1); ++ patch_id_add_string(&ctx, "+++/dev/null"); ++ } else { ++ patch_id_add_string(&ctx, "---a/"); ++ the_hash_algo->update_fn(&ctx, p->one->path, len1); ++ patch_id_add_string(&ctx, "+++b/"); ++ the_hash_algo->update_fn(&ctx, p->two->path, len2); ++ } + + if (fill_mmfile(options->repo, &mf1, p->one) < 0 || + fill_mmfile(options->repo, &mf2, p->two) < 0) + return error("unable to read files to diff"); @@ diff.c: static int diff_get_patch_id(struct diff_options *options, struct object + patch_id_consume, &data, &xpp, &xecfg)) + return error("unable to generate patch-id diff for %s", + p->one->path); - } -- -- xpp.flags = 0; -- xecfg.ctxlen = 3; -- xecfg.flags = XDL_EMIT_NO_HUNK_HDR; -- if (xdi_diff_outf(&mf1, &mf2, NULL, -- patch_id_consume, &data, &xpp, &xecfg)) -- return error("unable to generate patch-id diff for %s", -- p->one->path); -- ++ } if (stable) flush_one_hunk(oid, &ctx); } ## t/t3419-rebase-patch-id.sh ## @@ t/t3419-rebase-patch-id.sh: test_expect_success 'setup: 500 lines' ' - git cherry-pick main >/dev/null 2>&1 - ' + git add newfile && + git commit -q -m "add small file" && + +- git cherry-pick main >/dev/null 2>&1 +-' ++ git cherry-pick main >/dev/null 2>&1 && -test_expect_success 'setup attributes' ' - echo "file binary" >.gitattributes --' -- ++ git branch -f squashed main && ++ git checkout -q -f squashed && ++ git reset -q --soft HEAD~2 && ++ git commit -q -m squashed + ' + test_expect_success 'detect upstream patch' ' - git checkout -q main && +- git checkout -q main && ++ git checkout -q main^{} && scramble file && + git add file && + git commit -q -m "change big file again" && @@ t/t3419-rebase-patch-id.sh: test_expect_success 'detect upstream patch' ' - git checkout -q other^{} && - git rebase main && - git rev-list main...HEAD~ >revs && -- test_must_be_empty revs -+ test_must_be_empty revs && + test_must_be_empty revs + ' + ++test_expect_success 'detect upstream patch binary' ' + echo "file binary" >.gitattributes && + git checkout -q other^{} && + git rebase main && + git rev-list main...HEAD~ >revs && + test_must_be_empty revs && -+ rm .gitattributes - ' - ++ test_when_finished "rm .gitattributes" ++' ++ test_expect_success 'do not drop patch' ' -@@ t/t3419-rebase-patch-id.sh: test_expect_success 'do not drop patch' ' - git commit -q -m squashed && +- git branch -f squashed main && +- git checkout -q -f squashed && +- git reset -q --soft HEAD~2 && +- git commit -q -m squashed && git checkout -q other^{} && test_must_fail git rebase squashed && - git rebase --quit -+ git rebase --abort && ++ test_when_finished "git rebase --abort" ++' ++ ++test_expect_success 'do not drop patch binary' ' + echo "file binary" >.gitattributes && + git checkout -q other^{} && + test_must_fail git rebase squashed && -+ git rebase --abort && -+ rm .gitattributes ++ test_when_finished "git rebase --abort" && ++ test_when_finished "rm .gitattributes" ' test_done 2: 30ec43cd129 ! 2: 25e28b7dab3 patch-id: use stable patch-id for rebases @@ Commit message patch-id: use stable patch-id for rebases Git doesn't persist patch-ids during the rebase process, so there is - no need to specifically invoke the unstable variant. - - This allows the legacy unstable id logic to be cleaned up. + no need to specifically invoke the unstable variant. Use the stable + logic for all internal patch-id calculations to minimize the number of + code paths and improve test coverage. Signed-off-by: Jerry Zhang <jerry@xxxxxxxxxx> -: ----------- > 3: 21642128927 builtin: patch-id: fix patch-id with binary diffs -: ----------- > 4: 6e07cfd5691 patch-id: fix patch-id for mode changes -: ----------- > 5: bbaa2425ad0 builtin: patch-id: add --include-whitespace as a command mode -: ----------- > 6: a1f6f36d487 builtin: patch-id: remove unused diff-tree prefix -: ----------- > 7: 69440797f30 documentation: format-patch: clarify requirements for patch-ids to match -- gitgitgadget