Re: [PATCH v2 08/10] diffcore-rename: add a new idx_possible_rename function

On Wed, Feb 24, 2021 at 9:35 AM Derrick Stolee <stolee@xxxxxxxxx> wrote:
>
> On 2/23/2021 6:44 PM, Elijah Newren via GitGitGadget wrote:
> > +static char *get_dirname(const char *filename)
> > +{
> > +     char *slash = strrchr(filename, '/');
> > +     return slash ? xstrndup(filename, slash-filename) : xstrdup("");
>
> My brain interpreted "slash-filename" as a single token on first
> read, which confused me briefly. Inserting spaces would help
> readers like me.
>
> > +      *   (4) Check if applying that directory rename to the original file
> > +      *       would result in a destination filename that is in the
> > +      *       potential rename set.  If so, return the index of the
> > +      *       destination file (the index within rename_dst).
>
> > +      * This function, idx_possible_rename(), is only responsible for (4).
>
> This helps isolate the important step to care about for the implementation,
> while the rest of the context is important, too.
>
> > +     char *old_dir, *new_dir, *new_path;
> > +     int idx;
> > +
> > +     if (!info->setup)
> > +             return -1;
> > +
> > +     old_dir = get_dirname(filename);
> > +     new_dir = strmap_get(&info->dir_rename_guess, old_dir);
> > +     free(old_dir);
> > +     if (!new_dir)
> > +             return -1;
> > +
> > +     new_path = xstrfmt("%s/%s", new_dir, get_basename(filename));
>
> This is running in a loop, so `xstrfmt()` might be overkill compared
> to something like
>
>         strbuf_addstr(&new_path, new_dir);
>         strbuf_addch(&new_path, '/');
>         strbuf_addstr(&new_path, get_basename(filename));
>
> but maybe the difference is too small to notice. (notice the type
> change to "struct strbuf new_path = STRBUF_INIT;")

Ooh, nice find.  Since this is in a loop over the renames as you point
out, this is an O(N) improvement (with N = number of renames) rather
than an O(1) improvement.  It does turn out to be hard to notice,
though.  Since we still have some O(N^2) code (all the inexact rename
detection that our exact- and basename-guided detection
optimizations can't handle), with that N^2 actually being multiplied
by the average number of lines in the given files, this improvement
does seem to mostly get lost in the noise.

I tried a bunch of times to measure the performance with these
changes.  After a bunch of runs, it seems that this optimization saves
somewhere between 3 and 10ms (depending on which testcase, whether at this
point in the series or at the very end, etc.).  It's hard to pin down,
because the savings is less than the standard deviation of any given
set of runs.  I don't think it's big enough to warrant restating the
performance measurements, but I'm very happy to include this
suggestion in my reroll.
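
To make the idea concrete outside of git.git, here is a minimal
standalone sketch of building "<new_dir>/<basename>" by appending
pieces to a growable buffer instead of going through a format string.
It does not use Git's strbuf; `struct path_buf`, `path_buf_add()`,
`basename_of()` and `join_renamed_path()` are hypothetical stand-ins
invented for illustration.  The sketch also resets and reuses one
buffer across calls, which is an assumption on my part and goes a step
beyond what the snippet quoted above shows.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a growable string buffer (not Git's strbuf). */
struct path_buf {
	char *buf;
	size_t len;
	size_t alloc;
};

/* Make room for `extra` more bytes plus the trailing NUL. */
static void path_buf_grow(struct path_buf *pb, size_t extra)
{
	size_t need = pb->len + extra + 1;

	if (need > pb->alloc) {
		size_t new_alloc = pb->alloc ? pb->alloc * 2 : 64;
		char *p;

		if (new_alloc < need)
			new_alloc = need;
		p = realloc(pb->buf, new_alloc);
		if (!p)
			abort();
		pb->buf = p;
		pb->alloc = new_alloc;
	}
}

/* Append `n` bytes of `s` and keep the buffer NUL-terminated. */
static void path_buf_add(struct path_buf *pb, const char *s, size_t n)
{
	path_buf_grow(pb, n);
	memcpy(pb->buf + pb->len, s, n);
	pb->len += n;
	pb->buf[pb->len] = '\0';
}

/* Part of `filename` after the last '/', or all of it if there is none. */
static const char *basename_of(const char *filename)
{
	const char *slash = strrchr(filename, '/');

	return slash ? slash + 1 : filename;
}

/*
 * Overwrite `pb` with "<new_dir>/<basename of filename>".  The length is
 * reset but the allocation is kept, so repeated calls in a loop only
 * allocate when a path is longer than any seen before.
 */
static const char *join_renamed_path(struct path_buf *pb,
				     const char *new_dir,
				     const char *filename)
{
	const char *base;

	assert(new_dir && filename);
	base = basename_of(filename);
	pb->len = 0;
	path_buf_add(pb, new_dir, strlen(new_dir));
	path_buf_add(pb, "/", 1);
	path_buf_add(pb, base, strlen(base));
	return pb->buf;
}

/* Free the storage and return the buffer to its initial state. */
static void path_buf_release(struct path_buf *pb)
{
	free(pb->buf);
	pb->buf = NULL;
	pb->len = pb->alloc = 0;
}
```

A caller would zero-initialize one `struct path_buf` before the loop
over renames, call `join_renamed_path()` once per file, and call
`path_buf_release()` after the loop.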

>
> > +
> > +     idx = strintmap_get(&info->idx_map, new_path);
> > +     free(new_path);
> > +     return idx;
> > +}
>
> Does what it says it does.
>
> Thanks,
> -Stolee


