On Mon, Oct 15, 2018 at 12:57 AM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Duy Nguyen <pclouds@xxxxxxxxx> writes:
>
> >> Our matching function comes from rsync originally, whose manpage says:
> >>
> >>     use ’**’ to match anything, including slashes.
> >>
> >> I believe this is accurate as far as the implementation goes.
> >
> > No. "**" semantics is not the same as from rsync. The three cases
> > "**/", "/**/" and "/**" were requested by Junio if I remember
> > correctly. You can search the mail archive for more information.
>
> Perhaps spelling the rules out would be more beneficial for the
> purpose of this thread?  I do not recall what I requested, but let

OK I gave you too much credit. You pointed out a semantics problem with
rsync "**" [1] and gently pushed me to implement a safer subset ;-)

[1] https://public-inbox.org/git/7vbogj5sji.fsf@xxxxxxxxxxxxxxxxxxxxxxxx/

> me throw out my guesses (i.e. what I would have wished if I were
> making a request to implement something) to keep the thread alive,
> you can correct me, and people can take it from there to update the
> docs ;-)
>
>     A double-asterisk, both of whose ends are adjacent to a
>     directory boundary (i.e. the beginning of the pattern, the end
>     of the pattern or a slash) matches 0 or more levels of
>     directories.  e.g. **/a/b would match a/b, x/a/b, x/y/a/b, but
>     not z-a/b.  a/**/b would match a/b, a/x/b, but not a/z-b or
>     a-z-b.
>
> What a double-asterisk that does not sit on a directory boundary,
> e.g. "a**b", "a**/b", "a/**b", or "**a/b", matches, as far as I am
> concerned, is undefined, meaning that (1) I do not care all that
> much what the code actually does when a pattern like that is given as
> long as it does not segfault, and (2) I do not think I would mind
> changing the behaviour as a "bugfix", if their current behaviour
> does not make sense and we can come up with a saner alternative.
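
To make the rules quoted above concrete, here is a rough sketch in
Python. It is only an illustration, not the actual matching code in
Git: the names pattern_to_regex() and matches() are made up, only
literal characters, "*" and "**" are handled, the whole path is
matched (no basename matching, no anchoring or negation), and a
trailing "/**" is assumed to match everything below that directory.
A "**" that does not fill a whole path component is simply treated as
two regular "*".

```python
import re


def pattern_to_regex(pattern):
    """Translate a simplified pattern into a regular expression.

    Hypothetical helper for illustration only; it understands
    literal characters, "*" and "**", nothing else.
    """
    parts = pattern.split("/")
    pieces = []
    for i, part in enumerate(parts):
        last = i == len(parts) - 1
        if part == "**":
            # "**" sits on directory boundaries on both sides.
            # Assumption: as the last component it matches anything
            # below; elsewhere it matches 0 or more directory levels.
            pieces.append(".+" if last else "(?:[^/]+/)*")
        else:
            # Every "*" here is a regular one that never crosses "/",
            # so "a**b" behaves like "a*b".
            body = "".join(
                "[^/]*" if c == "*" else re.escape(c) for c in part
            )
            pieces.append(body if last else body + "/")
    return "".join(pieces)


def matches(pattern, path):
    """Return True if the whole path matches the pattern."""
    return re.fullmatch(pattern_to_regex(pattern), path) is not None


if __name__ == "__main__":
    for pat, path in [("**/a/b", "x/y/a/b"), ("**/a/b", "z-a/b"),
                      ("a/**/b", "a/b"), ("a**b", "a/b")]:
        print(pat, path, matches(pat, path))
```

With this sketch, "**/a/b" accepts a/b, x/a/b and x/y/a/b but not
z-a/b, and "a**b" accepts axb but not a/b, which is the "two regular
asterisks" behaviour discussed below.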

I think the document describes more or less what you wrote above in
the indented paragraph (though maybe less clearly; patches are of
course welcome). It's the last paragraph that is problematic. The
document right now says "invalid", which could be interpreted as "bad
pattern, rejected", but in fact we accept these patterns and treat
the "*" characters as regular ones.

There are not many alternatives, though. Erroring out could really
flood stderr because we match these patterns zillions of times, and
traditionally fnmatch gracefully accepts bad patterns, trying to make
the best of them. So keeping an undefined "**" as two "*" sounds good
enough. But of course I'm open to saner alternatives if people find
any.
-- 
Duy