As with unpack-objects in a preceding commit, have update-index.c make
use of the HASH_N_OBJECTS{,_{FIRST,LAST}} flags. We now have a "batch"
mode again for "update-index".

Adding the t/* directory from git.git on a Linux ramdisk is a bit
faster than with the tmp-objdir indirection:

	git hyperfine -L rev ns/batched-fsync,HEAD -s 'make CFLAGS=-O3' \
		-p 'rm -rf repo && git init repo && cp -R t repo/' \
		'git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' \
		--warmup 1 -r 10

	Benchmark 1: git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'ns/batched-fsync
	  Time (mean ± σ):     289.8 ms ±   4.0 ms    [User: 186.3 ms, System: 103.2 ms]
	  Range (min … max):   285.6 ms … 297.0 ms    10 runs

	Benchmark 2: git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'HEAD
	  Time (mean ± σ):     273.9 ms ±   7.3 ms    [User: 189.3 ms, System: 84.1 ms]
	  Range (min … max):   267.8 ms … 291.3 ms    10 runs

	Summary
	  'git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'HEAD' ran
	    1.06 ± 0.03 times faster than 'git ls-files -- t | ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'ns/batched-fsync'

And as before, running that under "strace --summary-only" slows things
down a bit (probably mimicking slower I/O). I then get:

	Summary
	  'git ls-files -- t | strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'HEAD' ran
	    1.21 ± 0.02 times faster than 'git ls-files -- t | strace --summary-only ./git -c core.fsync=loose-object -c core.fsyncMethod=batch -C repo update-index --add --stdin' in 'ns/batched-fsync'

We also go from ~51k syscalls to ~39k, with ~2x the number of link()
and unlink() calls in ns/batched-fsync.

In the process of doing this conversion we lost the "bulk" mode for
files added on the command-line. I don't think that case is worth
optimizing, but we could if anyone cared.

We've also converted this to a string_list. We could instead keep
walking with getline_fn() and read one line "ahead" to see what we
have left, but I found that state machine a bit painful, and at least
in my testing buffering everything up front doesn't hurt. We could
still change this back to streaming, at the cost of some getline_fn()
twiddling.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx>
---
 builtin/update-index.c | 31 +++++++++++++++++++++++++++----
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index af02ff39756..c7cbfe1123b 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1194,15 +1194,38 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	}
 
 	if (read_from_stdin) {
+		struct string_list list = STRING_LIST_INIT_NODUP;
 		struct strbuf line = STRBUF_INIT;
 		struct strbuf unquoted = STRBUF_INIT;
+		size_t i, nr;
+		unsigned oflags;
 
 		setup_work_tree();
-		while (getline_fn(&line, stdin) != EOF)
-			line_from_stdin(&line, &unquoted, prefix, prefix_length,
-					nul_term_line, set_executable_bit, 0);
+		while (getline_fn(&line, stdin) != EOF) {
+			size_t len = line.len;
+			char *str = strbuf_detach(&line, NULL);
+
+			string_list_append_nodup(&list, str)->util = (void *)len;
+		}
+
+		nr = list.nr;
+		oflags = nr > 1 ? HASH_N_OBJECTS : 0;
+		for (i = 0; i < nr; i++) {
+			size_t nth = i + 1;
+			unsigned f = i == 0 ? HASH_N_OBJECTS_FIRST :
+				nr == nth ? HASH_N_OBJECTS_LAST : 0;
+			struct strbuf buf = STRBUF_INIT;
+			struct string_list_item *item = list.items + i;
+			const size_t len = (size_t)item->util;
+
+			strbuf_attach(&buf, item->string, len, len);
+			line_from_stdin(&buf, &unquoted, prefix, prefix_length,
+					nul_term_line, set_executable_bit,
+					oflags | f);
+			strbuf_release(&buf);
+		}
 		strbuf_release(&unquoted);
-		strbuf_release(&line);
+		string_list_clear(&list, 0);
 	}
 
 	if (split_index > 0) {
-- 
2.35.1.1428.g1c1a0152d61
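
To make the first/last marking in the loop above easier to follow, the
per-object flag computation boils down to the following standalone
helper. This is an illustrative sketch only, not part of the patch;
HASH_N_OBJECTS, HASH_N_OBJECTS_FIRST and HASH_N_OBJECTS_LAST are the
flags introduced earlier in this series:

	static unsigned batch_object_flags(size_t i, size_t nr)
	{
		/*
		 * Objects in a batch of more than one get HASH_N_OBJECTS;
		 * the first and last entries are additionally marked so
		 * that the object-writing machinery can tell where a
		 * batched fsync() begins and ends.
		 */
		unsigned oflags = nr > 1 ? HASH_N_OBJECTS : 0;
		unsigned f = i == 0 ? HASH_N_OBJECTS_FIRST :
			     i + 1 == nr ? HASH_N_OBJECTS_LAST : 0;

		return oflags | f;
	}

Note that, as in the patch, a lone object (nr == 1) ends up flagged
HASH_N_OBJECTS_FIRST but gets neither HASH_N_OBJECTS nor
HASH_N_OBJECTS_LAST.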