Hey there.

For some special use case, I wanted to write a parser for the patch format created by git format-patch, in particular one that can separate the headers, the commit message and the actual unified diffs.

Unfortunately, there seems to be only little (written) definition of that format; git-format-patch(1) merely says it's in UNIX mailbox format (which itself is, AFAIK, not really formally defined).

Anyway, it turns out that no escaping is done for the commit message in the patch format, and that this can cause actual breakage with valid commit messages.

Consider the following example:

1. I create a fresh repo, add a test file and use a commit message which contains a From line (even with the "magic" timestamp) and some made-up commit id (0000...):

    ~/test$ git init foo; cd foo
    Initialized empty Git repository in /home/calestyo/test/foo/.git/
    ~/test/foo$ echo a >f; git add f
    ~/test/foo$ git commit -m "msg1

    From 0000000000000000000000000000000061603705 Mon Sep 17 00:00:00 2001
    --
    ---"
    [master (root-commit) c08debc] msg1
     1 file changed, 1 insertion(+)
     create mode 100644 f

2. The format-patch output for that already looks suspicious:
   - The From line is not escaped (as some variants of mbox would do, some properly, some causing corruption by the escaping with > itself).
   - What the format may think of as a separator after the commit message (namely the ---) cannot be used as that either, as a --- in the commit message is again not escaped.

    ~/test/foo$ git format-patch --root; cat 0001-msg1.patch; rm -f 0001-msg1.patch
    0001-msg1.patch
    From c08debcc502c78786ec71d50686ff0445a13b654 Mon Sep 17 00:00:00 2001
    From: Christoph Anton Mitterer <mail@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
    Date: Mon, 4 Nov 2024 19:58:45 +0100
    Subject: [PATCH] msg1

    From 0000000000000000000000000000000061603705 Mon Sep 17 00:00:00 2001
    --
    ---
    ---
     f | 1 +
     1 file changed, 1 insertion(+)
     create mode 100644 f

    diff --git a/f b/f
    new file mode 100644
    index 0000000..7898192
    --- /dev/null
    +++ b/f
    @@ -0,0 +1 @@
    +a
    --
    2.45.2

3. Adding a 2nd commit, this time using the unified diff from the above patch as commit message body(!):

    ~/test/foo$ echo b >>f; git add f
    ~/test/foo$ git commit -m "msg2

    diff --git a/f b/f
    new file mode 100644
    index 0000000..7898192
    --- /dev/null
    +++ b/f
    @@ -0,0 +1 @@
    +a
    --
    2.45.2"
    [master 6bbe38c] msg2
     1 file changed, 1 insertion(+)
    ~/test/foo$ git format-patch --root
    0001-msg1.patch
    0002-msg2.patch

4. To no surprise, git itself of course knows the difference between commit message and actual patch, as shown e.g. by the following, where the commit message is indented (by git):

    $ git log --patch | cat
    commit 6bbe38c33680239ac9767e0e5095f9f32ad41ade
    Author: Christoph Anton Mitterer <mail@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
    Date:   Mon Nov 4 20:00:20 2024 +0100

        msg2

        diff --git a/f b/f
        new file mode 100644
        index 0000000..7898192
        --- /dev/null
        +++ b/f
        @@ -0,0 +1 @@
        +a
        --
        2.45.2

    diff --git a/f b/f
    index 7898192..422c2b7 100644
    --- a/f
    +++ b/f
    @@ -1 +1,2 @@
     a
    +b

    commit c08debcc502c78786ec71d50686ff0445a13b654
    Author: Christoph Anton Mitterer <mail@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
    Date:   Mon Nov 4 19:58:45 2024 +0100

        msg1

        From 0000000000000000000000000000000061603705 Mon Sep 17 00:00:00 2001
        --
        ---

    diff --git a/f b/f
    new file mode 100644
    index 0000000..7898192
    --- /dev/null
    +++ b/f
    @@ -0,0 +1 @@
    +a

5. Next I try whether git am can use the patches created above in a fresh repo:

    ~/test/foo$ cd ..; git init bar; cd bar
    Initialized empty Git repository in /home/calestyo/test/bar/.git/
    ~/test/bar$ git am ../foo/0001-msg1.patch
    Patch is empty.
    hint: When you have resolved this problem, run "git am --continue".
    hint: If you prefer to skip this patch, run "git am --skip" instead.
    hint: To record the empty patch as an empty commit, run "git am --allow-empty".
    hint: To restore the original branch and stop patching, run "git am --abort".
    hint: Disable this message with "git config advice.mergeConflict false"

   That already fails for the first patch, the reason probably being my From 0000... line in the commit message.

6. So trying again with simply that From 000.. line removed:

    ~/test/bar$ sed -i '/^From 00000/d' ../foo/0001-msg1.patch
    ~/test/bar$ git am ../foo/0001-msg1.patch
    fatal: previous rebase directory .git/rebase-apply still exists but mbox given.

   and again on a freshly created repo:

    ~/test/bar$ cd ..; rm -rf bar; git init bar; cd bar
    Initialized empty Git repository in /home/calestyo/test/bar/.git/
    ~/test/bar$ git am ../foo/0001-msg1.patch
    Applying: msg1
    applying to an empty history

   Ah, now it works, so it was indeed the (unusual but still valid) commit message.

7. Now that 0001-msg1.patch is applied, let's try the 2nd patch:

    ~/test/bar$ git am ../foo/0002-msg2.patch
    Applying: msg2
    error: f: already exists in index
    Patch failed at 0001 msg2
    hint: Use 'git am --show-current-patch=diff' to see the failed patch
    hint: When you have resolved this problem, run "git am --continue".
    hint: If you prefer to skip this patch, run "git am --skip" instead.
    hint: To restore the original branch and stop patching, run "git am --abort".
    hint: Disable this message with "git config advice.mergeConflict false"
    ~/test/bar$

   And again, the reason is most likely that git cannot tell that the "first diff" is not actually a diff but part of the commit message.

Btw, a shamelessly off-topic question: Any chance that git format-patch / am will ever support keeping track of the branch/merge history of generated / applied patches? That would be really neat.

Thanks,
Chris.

PS: Not subscribed, so please keep me CCed in case you want me to read any possible replies :-)
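
To make the ambiguity from steps 2 and 5 concrete, here is a minimal Python sketch of the kind of line-based scan a simple parser might start with. It is only an illustration, not how git am or git mailinfo actually work: the name candidate_boundaries and the regular expression are assumptions made for this example, and the sample text is the 0001-msg1.patch output shown in step 2.

```python
import re

# The format-patch output for "msg1", copied from step 2.
PATCH = """\
From c08debcc502c78786ec71d50686ff0445a13b654 Mon Sep 17 00:00:00 2001
From: Christoph Anton Mitterer <mail@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 4 Nov 2024 19:58:45 +0100
Subject: [PATCH] msg1

From 0000000000000000000000000000000061603705 Mon Sep 17 00:00:00 2001
--
---
---
 f | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 f

diff --git a/f b/f
new file mode 100644
index 0000000..7898192
--- /dev/null
+++ b/f
@@ -0,0 +1 @@
+a
--
2.45.2
"""

# Hypothetical pattern for an mbox-style "From " line carrying the
# "magic" timestamp; this is an assumption for the sketch, not git's rule.
MBOX_FROM = re.compile(r"^From [0-9a-f]+ Mon Sep 17 00:00:00 2001$")


def candidate_boundaries(patch_text):
    """Return (from_lines, dash_lines): the 0-based indices of lines that
    look like an mbox "From " separator and of lines that are exactly "---".
    """
    lines = patch_text.splitlines()
    from_lines = [i for i, line in enumerate(lines) if MBOX_FROM.match(line)]
    dash_lines = [i for i, line in enumerate(lines) if line == "---"]
    return from_lines, dash_lines


from_lines, dash_lines = candidate_boundaries(PATCH)
print("possible message starts:", from_lines)
print("possible end-of-message separators:", dash_lines)
```

For this single patch the scan reports two candidate "From " separators and two candidate "---" separators, and in each pair one of the two lines belongs to the commit message. A parser working only from the text has no unambiguous way to pick the right one, which is why some form of escaping would be needed.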