[PATCH v2 0/2] Add mailmap mechanism in cat-file options

Thanks a lot Junio for the review :) I have made the suggested changes.

= Description

At present, the `git-cat-file` command does not complain when the
`--use-mailmap` option is given together with `--batch-check` or `-s`;
the option is simply ignored. Instead, for commit and tag objects, the
command should compute the size of the object after replacing the idents
and report that size. This patch series makes the `-s` and
`--batch-check` options of `git-cat-file` honor the mailmap when used
with the `--use-mailmap` option.

In this patch series we did not want to change the fact that
'%(objectsize)' always shows the size of the original object, even when
`--use-mailmap` is set. First, we have a long-term plan to unify how the
formats for `git cat-file` and other commands work. Second, existing
formats like the "pretty formats" used by `git log` have separate
placeholders for fields that respect the mailmap and fields that do not
(%an is the author name, while %aN is the author name respecting the
mailmap).

I would like to thank my mentors, Christian Couder and John Cai, for all
of their help!
Looking forward to the reviews!

= Patch Organization

- The first patch makes the `-s` option return the updated size of the
  <commit/tag> object, when combined with the `--use-mailmap` option,
  after replacing the idents using the mailmap mechanism.
- The second patch makes the `--batch-check` option return the updated
  size of the <commit/tag> object, when combined with the
  `--use-mailmap` option, after replacing the idents using the mailmap
  mechanism.

= Changes in v2:

- The commit messages of both patches have been improved.
- In the second patch, we were populating the `contentp` field of the
  `object_info` structure whenever `--batch-check` was combined with
  `--use-mailmap`. This made us read the contents of tree and blob
  objects as well, which hurt performance. We should only read the
  contents of commit and tag objects, and the second patch has been
  updated to do just that.

Siddharth Asthana (2):
  cat-file: add mailmap support to -s option
  cat-file: add mailmap support to --batch-check option

 Documentation/git-cat-file.txt |  6 +++++-
 builtin/cat-file.c             | 27 +++++++++++++++++++++++++++
 t/t4203-mailmap.sh             | 32 ++++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+), 1 deletion(-)

Range-diff against v1:
1:  513ad3b5f7 < -:  ---------- doc/cat-file: allow --use-mailmap for --batch options
2:  6f3dcce9e3 ! 1:  60cf7bc28c cat-file: add mailmap support to -s option
    @@ Metadata
      ## Commit message ##
         cat-file: add mailmap support to -s option
     
    -    Using `git cat-file --use-mailmap` with `-s` option, like the following is
    -    allowed:
    +    Even though the cat-file command with `-s` option does not complain when
    +    `--use-mailmap` option is given, the latter option is ignored. Compute
    +    the size of the object after replacing the idents and report it instead.
     
    -     git cat-file --use-mailmap -s <commit/tag object sha>
    +    In order to make `-s` option honour the mailmap mechanism we have to
    +    read the contents of the commit/tag object. Make use of the call to
    +    `oid_object_info_extended()` to get the contents of the object and store
    +    in `buf`. `buf` is later freed in the function.
     
    -    The current implementation will return the same object size irrespective
    -    of the mailmap option, which is not as useful as it could be. When we
    -    use the mailmap mechanism to replace the idents, the size of the object
    -    can change and `-s` option would be more useful if it shows the size of
    -    the changed object. This patch implements that.
    -
    -    Mentored-by: Christian Couder's avatarChristian Couder <christian.couder@xxxxxxxxx>
    -    Mentored-by: John Cai's avatarJohn Cai <johncai86@xxxxxxxxx>
    +    Mentored-by: Christian Couder <christian.couder@xxxxxxxxx>
    +    Mentored-by: John Cai <johncai86@xxxxxxxxx>
         Signed-off-by: Siddharth Asthana <siddharthasthana31@xxxxxxxxx>
     
      ## Documentation/git-cat-file.txt ##
3:  af90241d32 ! 2:  06c74dd017 cat-file: add mailmap support to --batch-check option
    @@ Metadata
      ## Commit message ##
         cat-file: add mailmap support to --batch-check option
     
    -    Using `git cat-file --use-mailmap` with --batch-check option, like the
    -    following is allowed:
    +    Even though the cat-file command with `--batch-check` option does not
    +    complain when `--use-mailmap` option is given, the latter option is
    +    ignored. Compute the size of the object after replacing the idents and
    +    report it instead.
     
    -     git cat-file --use-mailmap -batch-check
    +    In order to make `--batch-check` option honour the mailmap mechanism we
    +    have to read the contents of the commit/tag object.
     
    -    The current implementation will return the same object size irrespective
    -    of the mailmap option, which is not as useful as it could be. When we
    -    use the mailmap mechanism to replace the idents, the size of the object
    -    can change and --batch-check option would be more useful if it shows the
    -    size of the changed object. This patch implements that.
    +    There were two ways to do it:
     
    -    Mentored-by: Christian Couder's avatarChristian Couder <christian.couder@xxxxxxxxx>
    -    Mentored-by: John Cai's avatarJohn Cai <johncai86@xxxxxxxxx>
    +    1. Make two calls to `oid_object_info_extended()`. If `--use-mailmap`
    +       option is given, the first call will get us the type of the object
    +       and second call will only be made if the object type is either a
    +       commit or tag to get the contents of the object.
    +
    +    2. Make one call to `oid_object_info_extended()` to get the type of the
    +       object. Then, if the object type is either of commit or tag, make a
    +       call to `read_object_file()` to read the contents of the object.
    +
    +    I benchmarked the following command with both the above approaches and
    +    compared against the current implementation where `--use-mailmap`
    +    option is ignored:
    +
    +    `git cat-file --use-mailmap --batch-all-objects --batch-check --buffer
    +    --unordered`
    +
    +    The results can be summarized as follows:
    +                           Time (mean ± σ)
    +    default               827.7 ms ± 104.8 ms
    +    first approach        6.197 s ± 0.093 s
    +    second approach       1.975 s ± 0.217 s
    +
    +    Since, the second approach is faster than the first one, I implemented
    +    it in this patch.
    +
    +    Mentored-by: Christian Couder <christian.couder@xxxxxxxxx>
    +    Mentored-by: John Cai <johncai86@xxxxxxxxx>
         Signed-off-by: Siddharth Asthana <siddharthasthana31@xxxxxxxxx>
     
      ## Documentation/git-cat-file.txt ##
     @@ Documentation/git-cat-file.txt: OPTIONS
    - 	with `--use-mailmap`, `--textconv` or `--filters`. In the case of `--textconv` or
    - 	`--filters` the input lines also need to specify the path, separated by whitespace.
    - 	See the `BATCH OUTPUT` section below for details.
    -+	If used with `--use-mailmap` option, will show the size of updated object after
    -+	replacing idents using the mailmap mechanism.
    + 	`--textconv` or `--filters`, in which case the input lines also
    + 	need to specify the path, separated by whitespace.  See the
    + 	section `BATCH OUTPUT` below for details.
    ++	If used with `--use-mailmap` option, will show the size of
    ++	updated object after replacing idents using the mailmap mechanism.
      
      --batch-command::
      --batch-command=<format>::
     
      ## builtin/cat-file.c ##
    -@@ builtin/cat-file.c: static void print_object_or_die(struct batch_options *opt, struct expand_data *d
    - 
    - static void print_default_format(struct strbuf *scratch, struct expand_data *data)
    - {
    -+	if (use_mailmap && (data->type == OBJ_COMMIT || data->type == OBJ_TAG)) {
    -+		size_t s = data->size;
    -+		*data->info.contentp = replace_idents_using_mailmap((char*)*data->info.contentp, &s);
    -+		data->size = cast_size_t_to_ulong(s);
    -+	}
    -+
    - 	strbuf_addf(scratch, "%s %s %"PRIuMAX"\n", oid_to_hex(&data->oid),
    - 		    type_name(data->type),
    - 		    (uintmax_t)data->size);
     @@ builtin/cat-file.c: static void batch_object_write(const char *obj_name,
    - 			       struct packed_git *pack,
    - 			       off_t offset)
    - {
    -+	void *buf = NULL;
    -+
      	if (!data->skip_object_info) {
      		int ret;
      
     +		if (use_mailmap)
    -+			data->info.contentp = &buf;
    ++			data->info.typep = &data->type;
     +
      		if (pack)
      			ret = packed_object_info(the_repository, pack, offset,
      						 &data->info);
     @@ builtin/cat-file.c: static void batch_object_write(const char *obj_name,
    - 		print_object_or_die(opt, data);
    - 		batch_write(opt, "\n", 1);
    - 	}
    + 			fflush(stdout);
    + 			return;
    + 		}
     +
    -+	free(buf);
    - }
    ++		if (use_mailmap && (data->type == OBJ_COMMIT || data->type == OBJ_TAG)) {
    ++			size_t s = data->size;
    ++			char *buf = NULL;
    ++
    ++			buf = read_object_file(&data->oid, &data->type, &data->size);
    ++			buf = replace_idents_using_mailmap(buf, &s);
    ++			data->size = cast_size_t_to_ulong(s);
    ++
    ++			free(buf);
    ++		}
    + 	}
      
    - static void batch_one_object(const char *obj_name,
    + 	strbuf_reset(scratch);
     
      ## t/t4203-mailmap.sh ##
     @@ t/t4203-mailmap.sh: test_expect_success 'git cat-file -s returns correct size with --use-mailmap' '
-- 
2.38.0.rc1.8.g9592ff2ba4



