Re: [PATCH v2 1/1] read-cache.c: optimize reading index format v4

Junio C Hamano <gitster@xxxxxxxxx> · Tue, 04 Sep 2018 12:31:54 -0700

Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx> writes:

> Index format v4 requires some more computation to assemble a path
> based on a previous one. The current code is not very efficient
> because
>
>  - it copies the path twice: we assemble the final path in a
>    temporary buffer first, then copy it into a cache_entry
>
>  - strbuf_remove() in expand_name_field() is not exactly a good fit
>    for stripping a part at the end, _setlen() would do the same job
>    and is much cheaper.
>
>  - the open-coded loop to find the end of the string in
>    expand_name_field() can't beat an optimized strlen()
>
> This patch avoids the temporary buffer and writes directly to the new
> cache_entry, which addresses the first two points. The last point
> can also be avoided when the total string length fits in the lower
> 12 bits of ce_flags; otherwise we fall back to strlen().
>
> Running "test-tool read-cache 100" on webkit.git (275k files), reading
> v2 only takes 4.226 seconds, while v4 takes 5.711 seconds, 35% more
> time. The patch reduces read time on v4 to 4.319 seconds.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx>
> ---
>  read-cache.c | 128 ++++++++++++++++++++++++---------------------------
>  1 file changed, 60 insertions(+), 68 deletions(-)

Thanks; this round is much easier to read with a clearly named
"expand_name_field" boolean variable, etc.
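
For readers following along, the approach described in the quoted
commit message can be sketched roughly as below. This is only an
illustration under stated assumptions, not the code from the patch:
expand_name() and NAME_MASK are hypothetical names, the strip count
is assumed to be already decoded, and the flags value is trusted to
describe the suffix correctly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: the low 12 bits of the flags hold the path length. */
#define NAME_MASK 0x0fff

/*
 * Hypothetical helper (not from the patch): build the full path of a
 * v4-style entry directly in dst, with no temporary buffer.
 *
 *   prev, prev_len  path of the previous entry
 *   strip_len       bytes to drop from the end of prev (already decoded)
 *   suffix          NUL-terminated bytes to append
 *   flags           low 12 bits hold the full path length, or NAME_MASK
 *                   when the length does not fit
 *
 * Returns the length of the full path, or (size_t)-1 if dst is too small.
 */
static size_t expand_name(char *dst, size_t dst_size,
			  const char *prev, size_t prev_len,
			  size_t strip_len, const char *suffix,
			  unsigned int flags)
{
	size_t copy_len, suffix_len, len;

	assert(strip_len <= prev_len);
	copy_len = prev_len - strip_len;

	len = flags & NAME_MASK;
	if (len < NAME_MASK && len >= copy_len)
		suffix_len = len - copy_len;	/* length known from flags */
	else
		suffix_len = strlen(suffix);	/* fall back to strlen() */

	if (copy_len + suffix_len + 1 > dst_size)
		return (size_t)-1;

	/* Write the kept prefix and the new suffix straight into dst. */
	memcpy(dst, prev, copy_len);
	memcpy(dst + copy_len, suffix, suffix_len);
	dst[copy_len + suffix_len] = '\0';
	return copy_len + suffix_len;
}
```

With a previous path of "dir/file-a.c", a strip count of 3 and a
suffix of "b.c", this yields "dir/file-b.c" whether the length comes
from the flags or from the strlen() fallback.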