Re: [PATCH v4] rev-list: support human-readable output for `--disk-usage`

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



"Li Linchao via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:

> diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
> index 195e74eec63..5d3880874fc 100644
> --- a/Documentation/rev-list-options.txt
> +++ b/Documentation/rev-list-options.txt
> @@ -242,6 +242,7 @@ ifdef::git-rev-list[]
>  	to `/dev/null` as the output does not have to be formatted.
>  
>  --disk-usage::
> +--disk-usage=human::
>  	Suppress normal output; instead, print the sum of the bytes used
>  	for on-disk storage by the selected commits or objects. This is
>  	equivalent to piping the output into `git cat-file
> @@ -249,6 +250,8 @@ ifdef::git-rev-list[]
>  	faster (especially with `--use-bitmap-index`). See the `CAVEATS`
>  	section in linkgit:git-cat-file[1] for the limitations of what
>  	"on-disk storage" means.
> +	When it accepts a value `human`, like: `--disk-usage=human`, this
> +	means to print objects size in human readable format.

"When it accepts" sounds wrong, because it implies there are two
cases, i.e. the user gives "--disk-usage=human" and the command
somehow decides to accept it, or to reject it, which is not what is
going on.  How about phrasing it more like ...

    With the optional value `human`, on-disk storage size is shown
    in human-readable string (e.g. 12.23 KiB, 3.50 MiB).

perhaps?

> @@ -46,6 +46,7 @@ static const char rev_list_usage[] =
>  "    --parents\n"
>  "    --children\n"
>  "    --objects | --objects-edge\n"
> +"    --disk-usage | --disk-usage=human\n"

Writing it like

	"--disk-usage[=human]\n"

would be more in line with ...

>  "    --[no-]object-names\n"

... this one.

> @@ -368,6 +370,17 @@ static int show_object_fast(
>  	return 1;
>  }
>  
> +static void print_disk_usage(off_t size)
> +{
> +	struct strbuf sb = STRBUF_INIT;
> +	if (human_readable)
> +		strbuf_humanise_bytes(&sb, size);
> +	else
> +		strbuf_addf(&sb, "%"PRIuMAX, (uintmax_t)size);
> +	puts(sb.buf);
> +	strbuf_release(&sb);
> +}

Hmph, I am not sure if we want to make it a helper like this.  The
normal case does not need to prepare the string into a strbuf but
just can send the output to the standard output stream.

It is probably easy to fix, like so:

	if (!human_readable) {
		printf("%" PRIuMAX "\n", disk_usage);
	} else {
		strbuf sb = STRBUF_INIT;
		strbuf_humanise_bytes(&sb, disk_usage);
		puts(sb.buf);
		strbuf_release(&sb);
	}

>  static inline int parse_missing_action_value(const char *value)
>  {
>  	if (!strcmp(value, "error")) {
> @@ -473,6 +486,7 @@ static int try_bitmap_disk_usage(struct rev_info *revs,
>  				 int filter_provided_objects)
>  {
>  	struct bitmap_index *bitmap_git;
> +	off_t size_from_bitmap;

Questionably named variable.

>  	if (!show_disk_usage)
>  		return -1;
> @@ -481,8 +495,8 @@ static int try_bitmap_disk_usage(struct rev_info *revs,
>  	if (!bitmap_git)
>  		return -1;
>  
> -	printf("%"PRIuMAX"\n",
> -	       (uintmax_t)get_disk_usage_from_bitmap(bitmap_git, revs));
> +	size_from_bitmap = get_disk_usage_from_bitmap(bitmap_git, revs);
> +	print_disk_usage(size_from_bitmap);

It makes sense to make the function declare how it gets disk usage
in its name, but once we call the function to get what we want,
there is no need to keep saying we got it from bitmap.  If we ever
gained another function that obtains the disk usage from other
means, then this part of the code would become

	if (use_bitmap)
		variable = get_disk_usage_from_bitmap(...);
	else
		variable = get_disk_usage_from_other_means(...);

	print_disk_usage(variable);

Now, what should that variable be named?  It is clear that it
shouldn't be named "from_bitmap" at all, because it may very well
have come from somewhere else.

You call it simply 'size' in print_disk_usage(), and that would
probably be a good name to use here.  Or even better, "disk_usage".

However, I think all of the above badness stems from the horrible
way the existing code deals with "use_bitmap_index", and it is not
in the scope of this patch to fix it, let's keep the above code
as-is.  This is only called from the "if (use_bitmap_index)" so
the size can only be from bitmap.

> @@ -624,7 +638,20 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
>  			continue;
>  		}
>  
> -		if (!strcmp(arg, "--disk-usage")) {
> +		if (skip_prefix(arg, "--disk-usage", &arg)) {
> +			if (*arg == '=') {
> +				if (!strcmp(++arg, "human")) {
> +					human_readable = 1;
> +				} else
> +					die(_("invalid value for '%s': '%s', try --disk-usage=human"), "--disk-usage", arg);

This makes the hardcoded "--disk-usage=human" subject to
translation, which it should not.  Perhaps like this:

	die(_("invalid value for '%s': '%s', the only allowed format is "%s"),
	    "--disk-usage=<format>", arg, "human");

The main point is to ensure that "human" that we are not allowing
the user to type in their language is left out of the
_("translatable string").

> +			} else if (*arg) {
> +				/*
> +				* Arguably should goto a label to continue chain of ifs?
> +				* Doesn't matter unless we try to add --disk-usage-foo
> +				* afterwards
> +				*/

Wrong indentation for a long comment?

> +				usage(rev_list_usage);
> +			}



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux