Re: [PATCH 2/3] exclude: do strcmp as much as possible before fnmatch

Nguyen Thai Ngoc Duy <pclouds@xxxxxxxxx> writes:

>> I have been wondering if you can take a different approach based on the
>> same observation this patch is based on.  If you see an entry /foo/bar/*.c
>> in the top-level .gitignore, perhaps you can set it aside in a different
>> part of "struct exclude" for the top-level directory (because the pattern
>> will never match outside foo/bar directory), so that it is not even used
>> for matching, and only when you descend to foo/bar directory, add "/*.c"
>> to the "struct exclude" you create for that directory.
>
> that part is "base" field in "struct exclude", I believe.

Sorry, I misspoke; it is not about struct exclude at all.

>> That way, instead of "strcmp is faster than fnmatch, but we always compare
>> all elements in the huge pattern list given at the toplevel", you would be
>> doing "we do not even bother to compare with the elements we know do not
>> matter", which would be far more efficient, no?
>
> You still have to do at least one strncmp on "base" though to know if
> a pattern is applicable to the given directory. So it's not really
> cheaper than what is done in 3/3.

Actually I was referring to the exclude_stack.

Suppose you have a .gitignore file at the top that lists /foo/bar/*.c
(among millions of other patterns anchored to specific directories),
and another in the foo/bar directory.  When you are looking at a
path in the top-level, the exclude_stack currently has one
element, the per-directory one for the .gitignore at the top, which
holds millions of patterns that would never match.  Then when you
descend into the foo/bar directory, prep_exclude() links two more
elements (one for the foo/ directory, which may be empty, and another
for the foo/bar directory) to this, and you check the paths you see in
foo/bar against all the elements that appear in the exclude_stack.

What I was suggesting was that you could choose not to add the
/foo/bar/*.c entry to the exclude_stack element for the top-level
(but remember that you did so), and then inside prep_exclude(), when
you look at a different directory, e.g. foo/bar, notice that a higher
level (i.e. the top-level in this example) has such deferred patterns
that apply to the new directory.  Then, instead of adding /foo/bar/*.c
at the top-level, you can pretend as if /*.c appeared in the .gitignore
file at the deeper level in the hierarchy.
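
To illustrate the deferral, here is a rough sketch of how an anchored
pattern could be split into its literal leading directory and the part
that would be re-added when you reach that directory.  None of this is
existing code in dir.c; is_glob_special(), literal_dir_len() and
rebase_pattern() are names made up for this example, and the real
thing would have to deal with negated patterns, trailing slashes and
so on.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper: characters that end the literal part of a pattern.
static int is_glob_special(char c)
{
	return c == '*' || c == '?' || c == '[' || c == '\\';
}

// Length of the literal leading directory of an anchored pattern.
// For "/foo/bar/*.c" this is 8, the length of "/foo/bar".
// Returns 0 when there is nothing to defer (e.g. "/*.c" or "*.o").
static size_t literal_dir_len(const char *pattern)
{
	size_t i, last_slash = 0;

	assert(pattern != NULL);
	if (pattern[0] != '/')
		return 0;	// not anchored to the top-level
	for (i = 1; pattern[i] && !is_glob_special(pattern[i]); i++)
		if (pattern[i] == '/')
			last_slash = i;
	return last_slash;
}

// If the deferred pattern belongs to "dir" (relative to the top-level,
// without leading or trailing slash, e.g. "foo/bar"), return the part
// to pretend was listed in that directory (e.g. "/*.c"); otherwise NULL.
static const char *rebase_pattern(const char *pattern, const char *dir)
{
	size_t len = literal_dir_len(pattern);
	size_t dirlen = strlen(dir);

	if (!len || len - 1 != dirlen)
		return NULL;
	if (strncmp(pattern + 1, dir, dirlen))
		return NULL;
	return pattern + len;
}
```

With this, a pattern for which literal_dir_len() is non-zero would be
set aside at the top-level, and rebase_pattern() would be asked about
it only when prep_exclude() enters a new directory.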

And this does not happen for every path you check; the exclude_stack
used by excluded() is designed to take advantage of the access pattern
that we tend to check paths from the same directory together, so such
an adjustment happens once per directory switch (i.e. it will be part
of the prep_exclude() overhead that is amortized over the paths you
walk).
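
The amortization can be seen in a toy model like the one below.  It is
only a sketch under my own assumptions: struct walk_state and
check_path() are hypothetical names, the state is assumed to start out
already prepared for the top-level directory, and the counter stands in
for the real work prep_exclude() would do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical walker state: the directory we last prepared for, and
// how many times the (expensive) per-directory preparation ran.
struct walk_state {
	char current_dir[256];
	int prep_calls;
};

static void check_path(struct walk_state *st, const char *path)
{
	const char *slash = strrchr(path, '/');
	size_t dirlen = slash ? (size_t)(slash - path) : 0;

	assert(dirlen < sizeof(st->current_dir));
	if (strlen(st->current_dir) != dirlen ||
	    strncmp(st->current_dir, path, dirlen)) {
		// Directory switch: this is where deferred patterns
		// from higher levels would be looked at, once.
		memcpy(st->current_dir, path, dirlen);
		st->current_dir[dirlen] = '\0';
		st->prep_calls++;
	}
	// Matching "path" against the active patterns would go here.
}
```

Checking foo/bar/a.c, foo/bar/b.c and foo/bar/c.c in a row bumps
prep_calls only once; the deferred-pattern lookup rides on that one
call, not on each of the three paths.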