[PATCH v5 0/2] Add mailmap mechanism in cat-file options

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thanks a lot Christian, Taylor and Junio for review :) I have made the
suggested in this patch series.

= Description

At present, `git-cat-file` command with `--batch-check` and `-s` options
does not complain when `--use-mailmap` option is given. The latter
option is just ignored. Instead, for commit/tag objects, the command
should compute the size of the object after replacing the idents and
report it. So, this patch series makes `-s` and `--batch-check` options
of `git-cat-file` honor mailmap when used with `--use-mailmap` option.

In this patch series we didn't want to change that '%(objectsize)'
always shows the size of the original object even when `--use-mailmap`
is set because first we have the long term plan to unify how the formats
for `git cat-file` and other commands works. And second existing formats
like the "pretty formats" used by `git log` have different options for
fields respecting mailmap or not respecting it (%an is for author name
while %aN for author name respecting mailmap).

I would like to thank my mentors, Christian Couder and John Cai, for all
of their help!
Looking forward to the reviews!

= Patch Organization

- The first patch makes `-s` option to return updated size of the
  <commit/tag> object, when combined with `--use-mailmap` option, after
  replacing the idents using the mailmap mechanism.
- The second patch makes `--batch-check` option to return updated size of
  the <commit/tag> object, when combined with `--use-mailmap` option,
  after replacing the idents using the mailmap mechanism.

= Changes in v5

- The patch series which improves the documentation for `-s`, `--batch`,
  `--batch-check` and `--batch-command` is again part of this patch
  series with patch 3/3 squashed into patches 1/3 and 2/3 as suggested
  by Junio, Taylor and Christian. The doc patch series was perviously
  sent independently for improving the documentation of git cat-file
  options:
  https://lore.kernel.org/git/20221029092513.73982-1-siddharthasthana31@xxxxxxxxx/

- Improve the tests according to run under SHA-256 mode.

Siddharth Asthana (2):
  cat-file: add mailmap support to -s option
  cat-file: add mailmap support to --batch-check option

 Documentation/git-cat-file.txt | 53 ++++++++++++++++++++++---------
 builtin/cat-file.c             | 27 ++++++++++++++++
 t/t4203-mailmap.sh             | 57 ++++++++++++++++++++++++++++++++++
 3 files changed, 123 insertions(+), 14 deletions(-)

Range-diff against v4:
1:  4ae3af37d2 ! 1:  0db3583535 cat-file: add mailmap support to -s option
    @@ Commit message
         Helped-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx>
         Signed-off-by: Siddharth Asthana <siddharthasthana31@xxxxxxxxx>
     
    + ## Documentation/git-cat-file.txt ##
    +@@ Documentation/git-cat-file.txt: OPTIONS
    + 
    + -s::
    + 	Instead of the content, show the object size identified by
    +-	`<object>`.
    ++	`<object>`. If used with `--use-mailmap` option, will show
    ++	the size of updated object after replacing idents using the
    ++	mailmap mechanism.
    + 
    + -e::
    + 	Exit with zero status if `<object>` exists and is a valid
    +
      ## builtin/cat-file.c ##
     @@ builtin/cat-file.c: static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
      
    @@ t/t4203-mailmap.sh: test_expect_success '--mailmap enables mailmap in cat-file f
     +	cat >.mailmap <<-\EOF &&
     +	C O Mitter <committer@xxxxxxxxxxx> Orig <orig@xxxxxxxxxxx>
     +	EOF
    -+	cat >expect <<-\EOF &&
    -+	209
    -+	220
    -+	EOF
    ++	git cat-file commit HEAD | wc -c >expect &&
    ++	git cat-file --use-mailmap commit HEAD | wc -c >>expect &&
     +	git cat-file -s HEAD >actual &&
     +	git cat-file --use-mailmap -s HEAD >>actual &&
     +	test_cmp expect actual
    @@ t/t4203-mailmap.sh: test_expect_success '--mailmap enables mailmap in cat-file f
     +	Orig <orig@xxxxxxxxxxx> C O Mitter <committer@xxxxxxxxxxx>
     +	EOF
     +	git tag -a -m "annotated tag" v3 &&
    -+	cat >expect <<-\EOF &&
    -+	141
    -+	130
    -+	EOF
    ++	git cat-file tag v3 | wc -c >expect &&
    ++	git cat-file --use-mailmap tag v3 | wc -c >>expect &&
     +	git cat-file -s v3 >actual &&
     +	git cat-file --use-mailmap -s v3 >>actual &&
     +	test_cmp expect actual
2:  a692646228 < -:  ---------- cat-file: add mailmap support to --batch-check option
3:  41b4650b24 ! 2:  b8205ede7d doc/cat-file: allow --use-mailmap for --batch options
    @@ Metadata
     Author: Siddharth Asthana <siddharthasthana31@xxxxxxxxx>
     
      ## Commit message ##
    -    doc/cat-file: allow --use-mailmap for --batch options
    +    cat-file: add mailmap support to --batch-check option
    +
    +    Even though the cat-file command with `--batch-check` option does not
    +    complain when `--use-mailmap` option is given, the latter option is
    +    ignored. Compute the size of the object after replacing the idents and
    +    report it instead.
    +
    +    In order to make `--batch-check` option honour the mailmap mechanism we
    +    have to read the contents of the commit/tag object.
    +
    +    There were two ways to do it:
    +
    +    1. Make two calls to `oid_object_info_extended()`. If `--use-mailmap`
    +       option is given, the first call will get us the type of the object
    +       and second call will only be made if the object type is either a
    +       commit or tag to get the contents of the object.
    +
    +    2. Make one call to `oid_object_info_extended()` to get the type of the
    +       object. Then, if the object type is either of commit or tag, make a
    +       call to `read_object_file()` to read the contents of the object.
    +
    +    I benchmarked the following command with both the above approaches and
    +    compared against the current implementation where `--use-mailmap`
    +    option is ignored:
    +
    +    `git cat-file --use-mailmap --batch-all-objects --batch-check --buffer
    +    --unordered`
    +
    +    The results can be summarized as follows:
    +                           Time (mean ± σ)
    +    default               827.7 ms ± 104.8 ms
    +    first approach        6.197 s ± 0.093 s
    +    second approach       1.975 s ± 0.217 s
    +
    +    Since, the second approach is faster than the first one, I implemented
    +    it in this patch.
     
         The command git cat-file can now use the mailmap mechanism to replace
    -    idents with their canonical versions for commit and tag objects. There
    -    are several options like `-s`, `--batch`, `--batch-check` and
    -    `--batch-command` that can be combined with `--use-mailmap`. But the
    -    documentation for `-s`, `--batch`, `--batch-check` and `--batch-command`
    -    doesn't say so. This patch fixes that documentation.
    +    idents with canonical versions for commit and tag objects. There are
    +    several options like `--batch`, `--batch-check` and `--batch-command`
    +    that can be combined with `--use-mailmap`. But the documentation for
    +    `--batch`, `--batch-check` and `--batch-command` doesn't say so. This
    +    patch fixes that documentation.
     
         Mentored-by: Christian Couder <christian.couder@xxxxxxxxx>
         Mentored-by: John Cai <johncai86@xxxxxxxxx>
         Helped-by: Taylor Blau <me@xxxxxxxxxxxx>
    +    Helped-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx>
         Signed-off-by: Siddharth Asthana <siddharthasthana31@xxxxxxxxx>
     
      ## Documentation/git-cat-file.txt ##
     @@ Documentation/git-cat-file.txt: OPTIONS
    - 
    - -s::
    - 	Instead of the content, show the object size identified by
    --	`<object>`.
    -+	`<object>`. If used with `--use-mailmap` option, will show
    -+	the size of updated object after replacing idents using the
    -+	mailmap mechanism.
    - 
    - -e::
    - 	Exit with zero status if `<object>` exists and is a valid
    -@@ Documentation/git-cat-file.txt: OPTIONS
      --batch::
      --batch=<format>::
      	Print object information and contents for each object provided
    @@ Documentation/git-cat-file.txt: OPTIONS
      +
      `--batch-command` recognizes the following commands:
      +
    +
    + ## builtin/cat-file.c ##
    +@@ builtin/cat-file.c: static void batch_object_write(const char *obj_name,
    + 	if (!data->skip_object_info) {
    + 		int ret;
    + 
    ++		if (use_mailmap)
    ++			data->info.typep = &data->type;
    ++
    + 		if (pack)
    + 			ret = packed_object_info(the_repository, pack, offset,
    + 						 &data->info);
    +@@ builtin/cat-file.c: static void batch_object_write(const char *obj_name,
    + 			fflush(stdout);
    + 			return;
    + 		}
    ++
    ++		if (use_mailmap && (data->type == OBJ_COMMIT || data->type == OBJ_TAG)) {
    ++			size_t s = data->size;
    ++			char *buf = NULL;
    ++
    ++			buf = read_object_file(&data->oid, &data->type, &data->size);
    ++			buf = replace_idents_using_mailmap(buf, &s);
    ++			data->size = cast_size_t_to_ulong(s);
    ++
    ++			free(buf);
    ++		}
    + 	}
    + 
    + 	strbuf_reset(scratch);
    +
    + ## t/t4203-mailmap.sh ##
    +@@ t/t4203-mailmap.sh: test_expect_success 'git cat-file -s returns correct size with --use-mailmap for
    + 	test_cmp expect actual
    + '
    + 
    ++test_expect_success 'git cat-file --batch-check returns correct size with --use-mailmap' '
    ++	test_when_finished "rm .mailmap" &&
    ++	cat >.mailmap <<-\EOF &&
    ++	C O Mitter <committer@xxxxxxxxxxx> Orig <orig@xxxxxxxxxxx>
    ++	EOF
    ++	commit_size=`git cat-file commit HEAD | wc -c` &&
    ++	commit_sha=`git log --pretty=format:'%H' -n 1` &&
    ++	echo "$commit_sha commit $commit_size" >expect &&
    ++	commit_size=`git cat-file --use-mailmap commit HEAD | wc -c` &&
    ++	echo "$commit_sha commit $commit_size" >>expect &&
    ++	echo "HEAD" >in &&
    ++	git cat-file --batch-check <in >actual &&
    ++	git cat-file --use-mailmap --batch-check <in >>actual &&
    ++	test_cmp expect actual
    ++'
    ++
    ++test_expect_success 'git cat-file --batch-command returns correct size with --use-mailmap' '
    ++	test_when_finished "rm .mailmap" &&
    ++	cat >.mailmap <<-\EOF &&
    ++	C O Mitter <committer@xxxxxxxxxxx> Orig <orig@xxxxxxxxxxx>
    ++	EOF
    ++	commit_size=`git cat-file commit HEAD | wc -c` &&
    ++	commit_sha=`git log --pretty=format:'%H' -n 1` &&
    ++	echo "$commit_sha commit $commit_size" >expect &&
    ++	commit_size=`git cat-file --use-mailmap commit HEAD | wc -c` &&
    ++	echo "$commit_sha commit $commit_size" >>expect &&
    ++	echo "info HEAD" >in &&
    ++	git cat-file --batch-command <in >actual &&
    ++	git cat-file --use-mailmap --batch-command <in >>actual &&
    ++	test_cmp expect actual
    ++'
    ++
    + test_done
-- 
2.38.1.438.gd2340c8913




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux