Re: [PATCH] Fix to avoid high memory footprint

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jul 16, 2024 at 08:03:59AM +0000, Haritha  via GitGitGadget wrote:

> From: D Harithamma <harithamma.d@xxxxxxx>
> 
> This fix avoids high memory footprint when
> adding files that require conversion.
> Git has a trace_encoding routine that prints trace
> output when GIT_TRACE_WORKING_TREE_ENCODING=1 is
> set. This environment variable is used to debug
> the encoding contents.
> When a 40MB file is added, it requests close to
> 1.8GB of storage from xrealloc which can lead
> to out of memory errors.
> However, the check for
> GIT_TRACE_WORKING_TREE_ENCODING is done after
> the string is allocated. This resolves high
> memory footprints even when
> GIT_TRACE_WORKING_TREE_ENCODING is not active.
> This fix adds an early exit to avoid the
> unnecessary memory allocation.
> 
> Signed-off-by: Haritha D <harithamma.d@xxxxxxx>

Good find. Any trace function should verify that tracing is enabled
before doing any substantial work.

Let's take a look at your patch. First, your line wrapping is unusual,
making the commit message a bit hard to read. We'd usually shoot for ~72
characters per line. So more like:

> This fix avoids high memory footprint when adding files that require
> conversion.  Git has a trace_encoding routine that prints trace output
> when GIT_TRACE_WORKING_TREE_ENCODING=1 is set. This environment
> variable is used to debug the encoding contents.  When a 40MB file is
> added, it requests close to 1.8GB of storage from xrealloc which can
> lead to out of memory errors.  However, the check for
> GIT_TRACE_WORKING_TREE_ENCODING is done after the string is allocated.
> This resolves high memory footprints even when
> GIT_TRACE_WORKING_TREE_ENCODING is not active.  This fix adds an early
> exit to avoid the unnecessary memory allocation.

Second, we'd like a full real name in the Signed-off-by line, as you're
agreeing to the DCO. See:

  https://git-scm.com/docs/SubmittingPatches#sign-off

Likewise, the author name should match the signoff name (you can use
"git commit --amend --author=..." to fix it).

For the patch itself:

> --- a/convert.c
> +++ b/convert.c
> @@ -324,6 +324,11 @@ static void trace_encoding(const char *context, const char *path,
>  	struct strbuf trace = STRBUF_INIT;
>  	int i;
>  
> +	// If tracing is not on, exit early to avoid high memory footprint
> +	if (!trace_pass_fl(&coe)) {
> +		return;
> +	}

I don't think trace_pass_fl() is what you want. It will return true if
the trace fd is non-zero (so tracing was requested), but also if the key
has not yet been initialized (i.e., nobody has used this key to try
printing anything yet).

I think you'd just use trace_want(&coe) instead.

Also, two style nits:

 - our usual style (see Documentation/CodingGuidelines) is to avoid
   braces for one-liners.

 - we only use the /* */ comment form, not //. Though IMHO you could
   skip the comment completely here, as an early-return check in a
   tracing function is pretty obvious.

It would be nice if we could test this, but besides the wasted work, I
don't think there's any user-visible behavior (the problem is that we
are computing things when we're _not_ tracing, so there's nothing for
the user to see). And there's no provision in our test suite for
measuring memory usage of a program. So I think we can live without it,
and just manually verifying that it works (but it would be good to show
the measurements you did manually in the commit message).

-Peff




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux