Re: [PATCH 3/4] dir.c: support marking some patterns already matched

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 16, 2016 at 6:47 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Nguyễn Thái Ngọc Duy  <pclouds@xxxxxxxxx> writes:
>
>> Given path "a" and the pattern "a", it's matched. But if we throw path
>> "a/b" to pattern "a", the code fails to realize that if "a" matches
>> "a" then "a/b" should also be matched.
>>
>> When the pattern is matched the first time, we can mark it "sticky", so
>> that all files and dirs inside the matched path also matches. This is a
>> simpler solution than modify all match scenarios to fix that.
>
> I am not quite sure what this one tries to achieve.  Is this a
> performance thing, or is it a correctness thing?
>
> "This is a simpler solution than" is skimpy on the description of
> what the solution is.

As an example, let's look at pattern "a/". For this pattern to match,
"a" must be a directory. When we examine "a", we can stat() and see if
it is a directory. But when we descend inside, say "a/b", current code
is not prepared to deal with it and will try to stat("a/b") to see if
it's a directory instead of stat("a").

> When you see 'a' and path 'a/', you would throw it in the sticky
> list.  when you descend into 'a/' and see things under it,
> e.g. 'a/b', you would say "we have a match" because 'a' is sticky.
> Do you throw 'a/b' also into the sticky list so that you would catch
> 'a/b/c' later?

No because "a/" can still do the job. I know it's a bit hard to see
because add_sticky() is not used yet in this patch.

> Do you rely on the order of tree walking to cull
> entries from the sticky list that are no longer relevant?
> e.g. after you enumerate everything in 'a/b', you do not need 'a/b'
> anymore.

No explicit culling. But notice that these sticky lists are indirectly
contained in exclude_list. Suppose "a/b" is sticky because of
"a/.gitignore". As soon as we move out of "a", I would expect
prep_exclude() to remove the exclude_list of "a/.gitignore" and the
related sticky lists.

> Or do you notice that 'a/' matched at the top-level and stop
> bothering the sticky list when you descend into 'a/b' and others?
>
> How does this interact with negative patterns?

Negative or not is irrelevant. If "a" is matched negatively and we see
"a/b", we return the same response, "negative match".

> Thanks.  The data structure used in 3/4 smells iffy from performance
> point of view--have you tried it on a large trees with deep nesting?

No I haven't. Though I don't expect it to degrade performance much. We
basically add one new exclude rule when add_sticky() is called and pay
the price of pathname matching of that rule. If there are a lot of
sticky rules (especially ones at top directory), then performance can
go to hell. 4/4 should not add many of them though.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]