Re: [PATCH 2/2] commit-graph: turn off save_commit_buffer

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Sep 07, 2019 at 01:04:40AM -0400, Jeff King wrote:
> The commit-graph tool may read a lot of commits, but it only cares about
> parsing their metadata (parents, trees, etc) and doesn't ever show the
> messages to the user. And so it should not need save_commit_buffer,
> which is meant for holding onto the object data of parsed commits so
> that we can show them later. In fact, it's quite harmful to do so.
> According to massif, the max heap of "git commit-graph write
> --reachable" in linux.git before/after this patch (removing the commit
> graph file in between) goes from ~1.1GB to ~270MB.
>
> Which isn't surprising, since the difference is about the sum of the
> uncompressed sizes of all commits in the repository, and this was
> equivalent to leaking them.
>
> This obviously helps if you're under memory pressure, but even without
> it, things go faster. My before/after times for that command (without
> massif) went from 12.521s to 11.874s, a speedup of ~5%.
>
> Signed-off-by: Jeff King <peff@xxxxxxxx>
> ---
> We didn't actually notice this on linux.git, but rather on a repository
> with 130 million commits (don't ask). With this patch, I was able to
> generate the commit-graph file with a peak heap of ~25GB, which is ~200
> bytes per commit.
>
> I'll bet we could do better with some effort, but obviously this case
> was just pathological. For most cases this should be cheaper than a
> normal repack (which probably spends that much memory on each object,
> not just commits).
>
>  builtin/commit-graph.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
> index 57863619b7..052696f1af 100644
> --- a/builtin/commit-graph.c
> +++ b/builtin/commit-graph.c
> @@ -251,6 +251,8 @@ int cmd_commit_graph(int argc, const char **argv, const char *prefix)
>  			     builtin_commit_graph_usage,
>  			     PARSE_OPT_STOP_AT_NON_OPTION);
>
> +	save_commit_buffer = 0;
> +

This looks exactly right to me. We had discussed a little bit off-list
about where you might place this line, but I think that the spot you
picked is perfect: as late as possible.

Thankfully, the option parsing code here doesn't load any commits
(though even if it did, I don't think that turning on/off
'save_commit_buffer' would really make much of a difference).

So, the patch here looks obviously correct, and I don't think it needs a
test or anything like that... besides: what is there to test? :).

>  	if (argc > 0) {
>  		if (!strcmp(argv[0], "read"))
>  			return graph_read(argc, argv);
> --
> 2.23.0.474.gb1abd76f7a

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux