On Wed, Jul 22, 2020 at 9:10 PM brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx> wrote: > [...] > Since we cannot extend the v2 format in a backward-compatible way, let's > add a v3 format, which is identical, except for the addition of > capabilities, which are prefixed by an at sign. We add "object-format" > as the only capability and reject unknown capabilities, since we do not > have a network connection and therefore cannot negotiate with the other > side. > [...] > Signed-off-by: brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx> > --- > diff --git a/Documentation/technical/bundle-format.txt b/Documentation/technical/bundle-format.txt > @@ -7,6 +7,8 @@ The Git bundle format is a format that represents both refs and Git objects. > == Semantics > > -A Git bundle consists of three parts. > +A Git bundle consists of four parts. Rather than having to worry about updating this each time the format changes, perhaps just say: A Git bundle consists of several parts. > diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh > @@ -281,15 +281,16 @@ test_expect_success 'create bundle 1' ' > test_expect_success 'header of bundle looks right' ' > head -n 4 "$D"/bundle1 && > - head -n 1 "$D"/bundle1 | grep "^#" && > - head -n 2 "$D"/bundle1 | grep "^-$OID_REGEX " && > - head -n 3 "$D"/bundle1 | grep "^$OID_REGEX " && > - head -n 4 "$D"/bundle1 | grep "^$" > + sed -n -e 1p "$D"/bundle1 | grep "^#" && > + sed -n -e 2p "$D"/bundle1 | grep "^@object-format=" && > + sed -n -e 3p "$D"/bundle1 | grep "^-$OID_REGEX " && > + sed -n -e 4p "$D"/bundle1 | grep "^$OID_REGEX " && > + sed -n -e 5p "$D"/bundle1 | grep "^$" > ' I wonder if it would be simpler and cleaner to do something like this instead: cat >expect <<\-EOF && # v3 git bundle @object-format=sha256 -[OID] message [OID] refs/head/master EOF head -n 5 "$D"/bundle1 | sanitize_oid >actual && test_cmp expect actual where sanitize_oid() is a function which converts a hex OID into the literal string "[OID]" (or whatever). I believe I've seen you employ such sanitization functions already in these series in cases when you want to verify that an OID is present in some output but don't care about the actual value. Or perhaps this approach is overkill? Reading further in this patch, I see that you actually do employ this technique in a new test you add to t5607, though you don't bother with OID sanitation in that test. > diff --git a/t/t5607-clone-bundle.sh b/t/t5607-clone-bundle.sh > @@ -94,4 +98,29 @@ test_expect_success 'fetch SHA-1 from bundle' ' > +test_expect_success 'git bundle v3 has expected contents' ' > + git branch side HEAD && > + git bundle create --version=3 bundle HEAD^..side && > + head -n2 bundle >actual && > + cat >expect <<-EOF && > + # v3 git bundle > + @object-format=$(test_oid algo) > + EOF > + test_cmp actual expect && I think you want to swap the 'actual' and 'expect' arguments. > + git bundle verify bundle > +' > + > +test_expect_success 'git bundle v3 rejects unknown extensions' ' > + head -n2 bundle >new && > + echo "@unknown=silly" >>new && > + sed "1,2d" bundle >>new && > + test_must_fail git bundle verify new 2>output && > + test_i18ngrep "unknown capability .unknown=silly." output > +' I worry about passing binary bundle data through 'sed' like this. Historically, some 'sed' implementations would drop the last part of a file if it didn't end with a newline. Even today, not all 'sed' implementations necessarily pass the binary data through unmolested. For instance, on Mac OS, 'sed' adds a newline at the end of the file if the binary bundle blob didn't end with a newline. Perhaps a more reliable approach would be to use Perl to read the entire bundle in as a blob, use s/// to munge the @object-format= line into the @unknown= line, and write it out.