On Mon, Sep 16, 2024 at 10:50:16AM +0200, Patrick Steinhardt wrote:

> This makes t1419 work with the "reftable" backend with some slight
> modifications. Of course it also speeds up listing of references with
> hidden refs. The following benchmark prints one reference with 1 million
> hidden references:
>
>     Benchmark 1: HEAD~
>       Time (mean ± σ):      93.3 ms ±   2.1 ms    [User: 90.3 ms, System: 2.5 ms]
>       Range (min … max):    89.8 ms …  97.2 ms    33 runs
>
>     Benchmark 2: HEAD
>       Time (mean ± σ):       4.2 ms ±   0.6 ms    [User: 2.2 ms, System: 1.8 ms]
>       Range (min … max):     3.1 ms …   8.1 ms    765 runs
>
>     Summary
>       HEAD ran
>        22.15 ± 3.19 times faster than HEAD~

Very nice.

> +	/*
> +	 * When the reference name is lexicographically bigger than the
> +	 * current exclude pattern we know that it won't ever match any
> +	 * of the following references, either. We thus advance to the
> +	 * next pattern and re-check whether it matches.
> +	 *
> +	 * Otherwise, if it's smaller, then we do not have a match and
> +	 * thus want to show the current reference.
> +	 */
> +	cmp = strncmp(iter->ref.refname, pattern,
> +		      iter->exclude_patterns_strlen);
> +	if (cmp > 0) {
> +		iter->exclude_patterns_index++;
> +		iter->exclude_patterns_strlen = 0;
> +		continue;
> +	}
> +	if (cmp < 0)
> +		return 0;

Perhaps I am showing my ignorance of the reftable backend, but is it OK
to call strncmp() against all patterns here? In the packed-refs
implementation which I worked on with Peff sometime last year, we
rejected exclude patterns that contain special glob characters for
exactly this reason.

The implementation that I'm referring to has a helpful comment that
jogged my memory for what we were thinking at the time (see the function
refs/packed-backend.c::populate_excluded_jump_list()).

Ugh, I just read the next hunk below, so ignore me here ;-).
> +static char **filter_exclude_patterns(const char **exclude_patterns)
> +{
> +	size_t filtered_size = 0, filtered_alloc = 0;
> +	char **filtered = NULL;
> +
> +	if (!exclude_patterns)
> +		return NULL;
> +
> +	for (size_t i = 0; ; i++) {
> +		const char *exclude_pattern = exclude_patterns[i];
> +		int has_glob = 0;
> +
> +		if (!exclude_pattern)
> +			break;
> +
> +		for (const char *p = exclude_pattern; *p; p++) {
> +			has_glob = is_glob_special(*p);
> +			if (has_glob)
> +				break;
> +		}
> +		if (has_glob)
> +			continue;
> +
> +		ALLOC_GROW(filtered, filtered_size + 1, filtered_alloc);
> +		filtered[filtered_size++] = xstrdup(exclude_pattern);
> +	}
> +
> +	if (filtered_size) {
> +		QSORT(filtered, filtered_size, qsort_strcmp);
> +		ALLOC_GROW(filtered, filtered_size + 1, filtered_alloc);
> +		filtered[filtered_size++] = NULL;
> +	}
> +
> +	return filtered;
> +}

Ohhh, here's where we filter out un-excludeable patterns. Ignore me :-).

One question I had reading this is why we don't filter these out on the
fly in the iterator itself instead of allocating a separate array that
we have to xstrdup() into and free later on.

We may be at the point of diminishing returns here, but I wonder if
allocating this thing is more expensive than a few redundant strcmp()s
and calls to is_glob_special(). I dunno.

Thanks,
Taylor