Re: Possible bug in .gitignore

On Thu, Jul 25, 2024 at 01:01:45PM +0900, KwonHyun Kim wrote:

> I am experimenting with git and I found there is something not working
> as explain in the document
> 
> When I place `text_[가나].txt` in `.gitignore` it does not ignore
> text_가.txt nor text_나.txt
> 
> I experimented with `text_[ab].txt` and it works fine.
> 
> So I thought it might work bytewise so I put
> `text_[\200-\352][\200-\352][\200-\352].txt` with no effect. (가 is
> "\352\260\200" when core.quotepath is set to true)
> 
> So I think it must be a bug that is that pattern [abc] or [a-z] does
> not incorporate non-ascii characters. but I am not sure.

The globbing in git is generally done by wildmatch.c, which was imported
from rsync. Looking in that file, it looks like it does not support
multi-byte characters at all inside brackets.

So I don't see a way to make it work except to place the _literal_ bytes
making up the utf8 sequence, each inside its own single-byte match.
Like:

  printf 'text_[\352\353][\260\202][\200\230].txt\n' >.gitignore

But then your .gitignore file is itself invalid utf8 (not to mention
that this is obviously something a user shouldn't have to do).
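
For anyone wanting to check those octal values, here is a rough sketch of
how they can be derived. The helper name utf8_octal is made up for
illustration; it assumes a POSIX shell, that od is available, and that the
script itself is saved as utf8.

```shell
# utf8_octal is a hypothetical helper, not something git provides.
# It prints the bytes of its argument in octal, the form printf accepts.
utf8_octal() {
  # Word splitting drops od's padding; "$*" rejoins the bytes with single spaces.
  set -- $(printf '%s' "$1" | od -An -to1)
  echo "$*"
}

utf8_octal 가   # 352 260 200
utf8_octal 나   # 353 202 230
```

Each bracket in the pattern above then lists the candidate bytes for one
position: [\352\353] for the first byte, [\260\202] for the second, and
[\200\230] for the third. Note that such a pattern also accepts mixed
combinations of those bytes, not only the two intended characters.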

So I guess the fix would be to teach wildmatch.c to recognize and match
multi-byte sequences inside []. That probably requires that we assume
the pattern and the path are utf8, which will usually be true, but not
always. So we might need some kind of config switch there.

There is also probably a deep rabbit hole of corner cases there (e.g.,
NFD vs NFC, matching é versus "e" + combining accent). But I suspect
that even recognizing multi-byte sequences as a single char to match
would be a big improvement.
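
To illustrate the NFC/NFD corner case, here is a small sketch showing that
the two spellings of é are different byte sequences, so any purely
byte-based comparison treats them as different. The variable names nfc and
nfd are just for illustration, and the octal escapes assume utf8.

```shell
# Precomposed form (NFC): U+00E9, two bytes in utf8.
nfc=$(printf '\303\251')
# Decomposed form (NFD): "e" followed by U+0301 combining acute, three bytes.
nfd=$(printf 'e\314\201')

printf '%s' "$nfc" | wc -c   # byte count of the NFC form
printf '%s' "$nfd" | wc -c   # byte count of the NFD form

# A plain string comparison sees two different names.
if [ "$nfc" = "$nfd" ]; then echo same; else echo different; fi
```

So even with multi-byte awareness inside [], a pattern written in one
normalization form would presumably not match a path stored in the other
unless both sides were normalized first.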

-Peff



