>On Mon, Aug 08, 2022 at 08:35:21AM +0000, Li Linchao via GitGitGadget wrote: > >> The '--disk-usage' option for git-rev-list was introduced in 16950f8384 >> (rev-list: add --disk-usage option for calculating disk usage, 2021-02-09). >> This is very useful for people inspect their git repo's objects usage >> infomation, but the resulting number is quit hard for a human to read. >> >> Teach git rev-list to output a human readable result when using >> '--disk-usage'. > >OK. When adding --disk-usage, I never really dreamed people would use it >for human output, since "du .git" is usually a suitable approximation. :) >But I don't have any real objection. I'm curious what your use case is >like, if you don't mind sharing. We used it at GitHub for computing >per-fork sizes for analysis, etc (so the result was always fed into >another script). Sometimes, I need objects disk size for analysis too, but I do this by hand not by script. And I also noticed that git-count-objects has an option '-H' for better output, so I think if it could be applied to "git-rev-list --disk-usage". > >> Documentation/rev-list-options.txt | 5 +++- >> builtin/rev-list.c | 42 ++++++++++++++++++++++++------ >> t/t6115-rev-list-du.sh | 22 ++++++++++++++++ >> 3 files changed, 60 insertions(+), 9 deletions(-) > >The patch itself looks pretty sensible (and thanks Ævar for the first >round of review; the suggestions there all looked good). A few small >comments: > >> @@ -481,8 +485,13 @@ static int try_bitmap_disk_usage(struct rev_info *revs, >> if (!bitmap_git) >> return -1; >> >> - printf("%"PRIuMAX"\n", >> - (uintmax_t)get_disk_usage_from_bitmap(bitmap_git, revs)); >> + size_from_bitmap = get_disk_usage_from_bitmap(bitmap_git, revs); >> + if (human_readable) >> + strbuf_humanise_bytes(&disk_buf, size_from_bitmap); >> + else >> + strbuf_addf(&disk_buf, "%"PRIuMAX"", (uintmax_t)size_from_bitmap); >> + puts(disk_buf.buf); >> + strbuf_release(&disk_buf); > >It's not a lot of duplicated lines, but since it is implementing policy >logic, I think it would be nice to move the formatting decision into a >function. Something like: > > static void show_disk_usage(off_t size) > { > struct strbuf sb = STRBUF_INIT; > if (human_readable) > strbuf_humanise_bytes(&sb, size); > else > strbuf_addf(&sb, "%"PRIuMAX, (uintmax_t)size_from_bitmap); > puts(sb.buf); > strbuf_release(&sb); > } > >and then you can call it from here, and from the non-bitmap path below. Oh, this is really good. > >(Also, while typing it out, I noticed that you don't need the extra "" >after PRIuMAX; that just concatenates an empty string). > >> - if (!strcmp(arg, "--disk-usage")) { >> - show_disk_usage = 1; >> - info.flags |= REV_LIST_QUIET; >> - continue; >> + if (skip_prefix(arg, "--disk-usage", &arg)) { >> + if (*arg == '=') { >> + if (!strcmp(++arg, "human")) { >> + human_readable = 1; >> + show_disk_usage = 1; >> + info.flags |= REV_LIST_QUIET; >> + continue; >> + } else >> + die(_("invalid value for '%s': '%s', try --disk-usage=human"), "--disk-usage", arg); >> + } else { >> + show_disk_usage = 1; >> + info.flags |= REV_LIST_QUIET; >> + continue; >> + } >> } > >We can put the common parts of each side of the conditional into the >outer block to avoid repeating ourselves. Also, your code matches >--show-disk-usage-without-an-equals, since it nows uses skip_prefix(). >You could fix that by checking for '\0' in *arg. So together, something >like: > > if (skip_prefix(arg, "--disk-usage", &arg)) { > if (*arg == '=') { > if (!strcmp(++arg, "human")) > human_readable = 1; > else > die(...); > } else if (*arg) { > /* > * Arguably should goto a label to continue chain of ifs? > * Doesn't matter unless we try to add --disk-usage-foo > * afterwards > */ > usage(rev_list_usage); > } > show_disk_usage = 1; > info.flags |= REV_LIST_QUIET; > continue; > } Thank you a lot for your very nice suggestions. > >-Peff