Re: [PATCH v7 1/5] unpack-objects.c: add dry_run mode for get_data()

Jiang Xin <worldhello.net@xxxxxxxxx> · Wed, 22 Dec 2021 19:15:23 +0800

On Wed, Dec 22, 2021 at 9:53 AM René Scharfe <l.s.r@xxxxxx> wrote:
>
> Am 21.12.21 um 15:09 schrieb Ævar Arnfjörð Bjarmason:
> > Maybe I'm misunderstanding this, but the commit message says it would be
> > dangerous to allocate a very large buffer, but here we only limit the
> > size under "dry_run".
>
> This patch reduces the memory usage of dry runs, as its commit message
> says.  The memory usage of one type of actual (non-dry) unpack is reduced
> by patch 5.
>

For Han Xin and me, it is very challenging to write a better commit log
in English.  Since the commit has been moved to the beginning of the
series, its commit log should be rewritten as follows:

unpack-objects.c: low memory footprint for get_data() in dry_run mode

As the name implies, "get_data(size)" allocates and returns a buffer of
the given size. Allocating memory for a large blob object may cause the
system to run out of memory. Before replacing the calls to "get_data()"
to fix the unpacking of large blob objects, refactor "get_data()" to
reduce its memory footprint in dry_run mode. In dry_run mode,
"get_data()" is only used to check the integrity of the data, and the
returned buffer is not used at all.

Therefore, add the flag "dry_run" as an additional parameter of
"get_data()" and reuse a small buffer in dry_run mode. Because the
buffer filled in dry_run mode does not hold the entire data that the
caller asked for, release it and return NULL.

Han Xin, I think you can try to free the allocated buffer for dry_run
mode inside "get_data()".
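
To make the suggestion concrete, here is a rough sketch of the idea, not
the actual patch.  It assumes a hypothetical "fake_stream" and
"fake_inflate()" in place of the real zlib-based input code, and an
arbitrary 8192-byte scratch size ("DRY_RUN_BUFSZ"); only the buffer
handling is meant to be illustrative:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the inflated pack input (not git's real code). */
struct fake_stream {
	const unsigned char *data;
	size_t len;
	size_t pos;
};

/* Produce up to "avail" bytes into "out"; returns the number produced. */
static size_t fake_inflate(struct fake_stream *s, unsigned char *out,
			   size_t avail)
{
	size_t n = s->len - s->pos;

	if (n > avail)
		n = avail;
	memcpy(out, s->data + s->pos, n);
	s->pos += n;
	return n;
}

#define DRY_RUN_BUFSZ 8192	/* assumed scratch size for this sketch */

static int has_errors;		/* set when the stream ends too early */

/*
 * Read "size" bytes from the stream.  In dry_run mode only a small
 * scratch buffer is allocated and overwritten for every chunk; it is
 * freed here and NULL is returned.  Otherwise the caller gets a buffer
 * of "size" bytes and must free it.
 */
static void *get_data(struct fake_stream *s, size_t size, int dry_run)
{
	size_t bufsize = (dry_run && size > DRY_RUN_BUFSZ) ? DRY_RUN_BUFSZ : size;
	unsigned char *buf = malloc(bufsize ? bufsize : 1);
	size_t total = 0;

	assert(buf != NULL);	/* the sketch treats allocation failure as fatal */
	while (total < size) {
		/* dry run: always write at the start, reusing the buffer */
		size_t offset = dry_run ? 0 : total;
		size_t want = size - total;
		size_t n;

		if (want > bufsize - offset)
			want = bufsize - offset;
		n = fake_inflate(s, buf + offset, want);
		if (!n) {		/* input ended before "size" bytes */
			has_errors = 1;
			free(buf);
			return NULL;
		}
		total += n;
	}
	if (dry_run) {
		free(buf);	/* partial data is useless to the caller */
		return NULL;
	}
	return buf;
}
```

With this shape the callers need no extra free() for dry_run mode: they
get NULL back, as they already do on error, and the whole stream has
still been consumed and checked.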

--
Jiang Xin