Re: [RFC Redux] strbuf: Add method to convert byte-size to human readable form

Junio C Hamano <gitster@xxxxxxxxx> · Fri, 19 Sep 2008 15:37:26 -0700

Junio C Hamano <gitster@xxxxxxxxx> writes:

> Marcus Griep <marcus@xxxxxxxx> writes:
>
>>  This is a redux of a prior patch as part of a series on count-objects
>>  but is now split off and submitted on its own as an RFC for a library
>>  function to be added to strbuf.
>
> Ok, so I looked at the patch again.
>...
> For example, if I hand 1638 to you, you would give 1.599Ki back to me, but
> if I give you only 4 digits to work with, you do not want to say 1.59Ki;
> instead you would rather say 1.60Ki, right?

This also takes me back to this part...

>/*
> * strbuf_append_human_readable
> *
> * 'val': value to be metrically-reduced to a human-readable number
> * 'maxlen': maximum number of characters to be taken up by the reduced 'val'
> *           not including the sign or magnitude (i.e. 'Ki') characters;
> *           when 'maxlen' is 0 length is controled by 'scale'
> * 'scale': when 'val' is greater than 'scale', 'val' is reduced by the
> *          period (default 1024, see 'flags') until it is less than 'scale';
> *          when 'scale' is 0, 'val' is reduced until it fits in 'maxlen';
> *          when 'scale' and 'maxlen' are both zero, 'scale' defaults to 1000
> * 'flags': HR_USE_SI: uses a period of 1000 and uses SI magnitude prefixes
> *          HR_SPACE: inserts a space between the reduced 'val' and the units
> *          HR_PAD_UNIT: instead of an empty string for singles, pads with
> *                       spaces to the length of the magnitude prefixes
> *
> * Returns 0 if 'val' is successfully reduced and fits in 'maxlen', otherwise
> * returns -n where n is the number of additional characters necessary to
> * fully fit the reduced value.
> */

The lines are overlong but that is not my main complaint.

I do not see how the above definition of "scale" can be useful.

First, for the case of scale=0, if I tell you to format 1638 into 7 spaces
(without HR_PAD_UNIT), you could give me one of:

           1638
        1.560Ki
        0.002Mi
        0.000Gi
        ...

It is clear that 1638 is better than 1.560Ki which in turn is better than
0.002Mi, because the earlier ones give the information in better precision
with a shorter string than the later ones.  Typically the smaller unit
would give you the better answer but does that match "reduced repeatedly
until it fits in 'maxlen'" rule?  I do not think so ("1638  " is more
precise than "1.56Ki", even with HR_PAD_UNIT on).

Second, why is the default of "scale" independent from "period"?  Scale
that defaults to 1000 in binary system would mean that you would give

        ...
   998  998.0000
   999  999.0000
  1000  0.976562Ki
  1001  0.977539Ki
        ...

I have this feeling that the "scale" knob does not match well with human
expectations.  Admittedly, "most precision within the allocated space"
rule may not match human expectation either (e.g. when you have a
sequence of three numbers "520", "1048", "9999" and you are expecting to
see approximation in binary system, you would probably expect to see
0.51k, 1.02k, and 9.76k, instead of the original integers), but I think it
at least matches human expectations better than the "scale" system.

By the way, about the internal math you do for "repeatedly reduce", I do
not think repeatedly dividing with "period" is kosher.  I'd very much
prefer seeing something like:

	double magnitude = fabs(value);
        double scaler;

        for (unit = 0, scaler = 1.0;
             scaler < magnitude && unit < max_units;
             unit++)
             scaler *= period;

	printf("%f%s", value / scaler, unit_names[unit]);

modulo the terminating condition --- it may make more sense to stop one
step before "scaler < magnitude" happens so that we won't see "0.9765Ki"
but instead see "1000".

After thinking about this longer, I unfortunately have to say that the
only thing I like about the current iteration of the patch is the name of
the function "strbuf_append_human_readable" and what goal (at the abstract
level) it wants to achieve.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html