On Mon, Feb 06 2023, Derrick Stolee wrote: > On 2/5/2023 4:12 PM, Ævar Arnfjörð Bjarmason wrote: > [...] >> One wonders if (even for this index-related code) we really need such >> careful management of growth, and could instead do with: >> >> strbuf_setlen(tree_path, base_path_len); >> strbuf_add(tree_path, entry.path, entry.pathlen); >> strbuf_addch(tree_path, '/'); > > This would be my preferred way to go here. *nod* >> Or even just: >> >> strbuf_addf(tree_path, "%*.s/", (int)entry.pathlen, entry.path); > > Please do not add "addf" functions that can be run in tight loops. > It's faster to do strbuf_add() followed by strbuf_addch(). Good point. I wondered just how much slower, and it's up to 3x! At least according to this[1] artificial test case (where I usurped a random test helper). I wondered if we could just handle some common strbuf_addf() cases ourselves, and the benchmark shows (manually annotated, too lazy to set up the -n option): git hyperfine -L rev HEAD~5,HEAD~4,HEAD~3,HEAD~2,HEAD~1,HEAD~0 -s 'make CFLAGS=-O3' './t/helper/test-tool online-cpus' -r 3 [...] Summary './t/helper/test-tool online-cpus' in 'HEAD~0' ran <== strbuf_add() + strbuf_addch() 1.06 ± 0.11 times faster than './t/helper/test-tool online-cpus' in 'HEAD~1' <== strbuf_addstr() + strbuf_addch() 1.18 ± 0.12 times faster than './t/helper/test-tool online-cpus' in 'HEAD~4' <== hand optimized strbuf_addf() for "%sC" 1.33 ± 0.18 times faster than './t/helper/test-tool online-cpus' in 'HEAD~2' <== hand optimized strbuf_addf() for "%*sC" 2.63 ± 0.05 times faster than './t/helper/test-tool online-cpus' in 'HEAD~5' <== strbuf_addf("%s/") 2.92 ± 0.25 times faster than './t/helper/test-tool online-cpus' in 'HEAD~3' <== strbuf_addf("%*s/") The "hand optimization" just being a very stupid handling of "%sC" for arbitrary values of a single char "C", and ditto for "%*sC" (which curiously is slower here). So, for truly hot loops we'd still want to use the add + addch, but if anyone's interested (hashtag #leftoverbits) it looks like we could get some easy wins (and reduction in code size, as we could stop worrying about addf being slow in most cases) with some very dumb minimal vaddf(), which could handle these cases (but not anything involving padding etc.). I didn't dig, but wouldn't be surprised if the reason is that C libraries need to carry a relatively fat & general sprintf() for all those edge cases, locale handling etc, whereas most of our use could trivially be represented as some sequence of addstr()/addf() etc. Another interesting approach (and this is very #leftoverbits) would be to perform the same optimization with coccinelle. I.e. our current use of it is purely "this code X should be written like Y, and we should commit Y". But there's no reason for why we couldn't effectively implement our own compiler optimizations for our own APIs with it, so just grab "%s/" etc, unpack that in OCaml, then emit strbuf_add() + strbuf_addch(), and that would be what the C compiler would see. 1. 9d23ffb1117 addf + nolen diff --git a/t/helper/test-online-cpus.c b/t/helper/test-online-cpus.c index 8cb0d53840f..c802ec579d0 100644 --- a/t/helper/test-online-cpus.c +++ b/t/helper/test-online-cpus.c @@ -1,9 +1,17 @@ #include "test-tool.h" #include "git-compat-util.h" #include "thread-utils.h" * +#include "strbuf.h" int cmd__online_cpus(int argc, const char **argv) { - printf("%d\n", online_cpus()); + struct strbuf sb = STRBUF_INIT; + const char *const str = "Hello, World"; + + for (size_t i = 0; i < 10000000; i++) { + strbuf_reset(&sb); + strbuf_addf(&sb, "%s/", str); + puts(sb.buf); + } return 0; } 9f74eff5623 addf + nolen optimize diff --git a/strbuf.c b/strbuf.c index c383f41a3c5..750e5e6a5b4 100644 --- a/strbuf.c +++ b/strbuf.c @@ -332,8 +332,16 @@ void strbuf_addchars(struct strbuf *sb, int c, size_t n) void strbuf_addf(struct strbuf *sb, const char *fmt, ...) { va_list ap; + va_start(ap, fmt); - strbuf_vaddf(sb, fmt, ap); + if (*fmt == '%' && *(fmt + 1) == 's' && *(fmt + 2) && !*(fmt + 3)) { + const char *arg = va_arg(ap, const char *); + + strbuf_addstr(sb, arg); + strbuf_addch(sb, *(fmt + 2)); + } else { + strbuf_vaddf(sb, fmt, ap); + } va_end(ap); } ca60bb9b479 addf + len diff --git a/t/helper/test-online-cpus.c b/t/helper/test-online-cpus.c index c802ec579d0..7257e622015 100644 --- a/t/helper/test-online-cpus.c +++ b/t/helper/test-online-cpus.c @@ -7,10 +7,11 @@ int cmd__online_cpus(int argc, const char **argv) { struct strbuf sb = STRBUF_INIT; const char *const str = "Hello, World"; + const size_t len = strlen(str); for (size_t i = 0; i < 10000000; i++) { strbuf_reset(&sb); - strbuf_addf(&sb, "%s/", str); + strbuf_addf(&sb, "%*s/", (int)len, str); puts(sb.buf); } return 0; 1f47987d095 addf + len optimize diff --git a/strbuf.c b/strbuf.c index 750e5e6a5b4..88801268f7a 100644 --- a/strbuf.c +++ b/strbuf.c @@ -334,11 +334,16 @@ void strbuf_addf(struct strbuf *sb, const char *fmt, ...) va_list ap; va_start(ap, fmt); - if (*fmt == '%' && *(fmt + 1) == 's' && *(fmt + 2) && !*(fmt + 3)) { + if (*fmt == '%' && + *(fmt + 1) == '*' && + *(fmt + 2) == 's' && + *(fmt + 3) && + !*(fmt + 4)) { + int len = va_arg(ap, int); const char *arg = va_arg(ap, const char *); - strbuf_addstr(sb, arg); - strbuf_addch(sb, *(fmt + 2)); + strbuf_add(sb, arg, len); + strbuf_addch(sb, *(fmt + 3)); } else { strbuf_vaddf(sb, fmt, ap); } 55c698c0b95 addstr diff --git a/t/helper/test-online-cpus.c b/t/helper/test-online-cpus.c index 7257e622015..2716b44ca15 100644 --- a/t/helper/test-online-cpus.c +++ b/t/helper/test-online-cpus.c @@ -7,11 +7,11 @@ int cmd__online_cpus(int argc, const char **argv) { struct strbuf sb = STRBUF_INIT; const char *const str = "Hello, World"; - const size_t len = strlen(str); for (size_t i = 0; i < 10000000; i++) { strbuf_reset(&sb); - strbuf_addf(&sb, "%*s/", (int)len, str); + strbuf_addstr(&sb, str); + strbuf_addch(&sb, '/'); puts(sb.buf); } return 0; b17fb99bf7e (HEAD -> master) add diff --git a/t/helper/test-online-cpus.c b/t/helper/test-online-cpus.c index 2716b44ca15..5e52b622c4d 100644 --- a/t/helper/test-online-cpus.c +++ b/t/helper/test-online-cpus.c @@ -7,10 +7,11 @@ int cmd__online_cpus(int argc, const char **argv) { struct strbuf sb = STRBUF_INIT; const char *const str = "Hello, World"; + const size_t len = strlen(str); for (size_t i = 0; i < 10000000; i++) { strbuf_reset(&sb); - strbuf_addstr(&sb, str); + strbuf_add(&sb, str, len); strbuf_addch(&sb, '/'); puts(sb.buf); }