On Thu, Jul 25, 2024 at 01:01:45PM +0900, KwonHyun Kim wrote:

> I am experimenting with git and I found there is something not working
> as explain in the document
>
> When I place `text_[가나].txt` in `.gitignore` it does not ignore
> text_가.txt nor text_나.txt
>
> I experimented with `text_[ab].txt` and it works fine.
>
> So I thought it might work bytewise so I put
> `text_[\200-\352][\200-\352][\200-\352].txt` with no effect. (가 is
> "\352\260\200" when core.quotepath is set to true)
>
> So I think it must be a bug that is that pattern [abc] or [a-z] does
> not incorporate non-ascii characters. but I am not sure.

The globbing in git is generally done by wildmatch.c, which was imported
from rsync. Looking in that file, it looks like it does not support
multi-byte characters at all inside brackets.

So I don't see a way to make it work except to place the _literal_ bytes
making up the utf8 sequence, each inside its own single-byte match.
Like:

  printf 'text_[\352\353][\260\202][\200\230].txt\n' >.gitignore

But then your .gitignore file is itself invalid utf8 (not to mention
that this is obviously something a user shouldn't have to do).

So I guess the fix would be to teach wildmatch.c to recognize and match
multi-byte sequences inside []. That probably requires that we assume
the pattern and the path are utf8, which will usually be true, but not
always. So we might need some kind of config switch there.

There is also probably a deep rabbit hole of corner cases there (e.g.,
NFD vs NFC, matching é versus "e" + combining accent). But I suspect
that even recognizing multi-byte sequences as a single char to match
would be a big improvement.

-Peff
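
P.S. For anyone who wants to see the byte-wise effect without touching a
repository, here is a minimal sketch. It uses Python's fnmatch on bytes
as a stand-in for a byte-oriented matcher; that is an assumption made for
illustration only, not git's wildmatch.c, and the helper names are made
up. It shows the multi-byte bracket failing, the per-byte pattern from the
printf above matching both names, and a code-point-aware match for
comparison.

```python
import fnmatch


def matches_bytewise(pattern: bytes, name: bytes) -> bool:
    """Hypothetical helper: match raw bytes, so one bracket eats one byte."""
    return fnmatch.fnmatchcase(name, pattern)


def matches_by_code_point(pattern: str, name: str) -> bool:
    """Hypothetical helper: match decoded text, so one bracket eats one char."""
    return fnmatch.fnmatchcase(name, pattern)


GA = "text_가.txt"  # 가 is \352\260\200 in UTF-8
NA = "text_나.txt"  # 나 is \353\202\230 in UTF-8

# The pattern as the user wrote it, seen as bytes: the bracket holds six
# separate bytes and can only consume one of the three bytes of 가 or 나.
naive = "text_[가나].txt".encode("utf-8")

# The workaround: one single-byte bracket per byte position.
workaround = b"text_[\352\353][\260\202][\200\230].txt"

print(matches_bytewise(naive, GA.encode("utf-8")))       # False
print(matches_bytewise(workaround, GA.encode("utf-8")))  # True
print(matches_bytewise(workaround, NA.encode("utf-8")))  # True

# Matching on decoded text treats 가 and 나 as single characters.
print(matches_by_code_point("text_[가나].txt", GA))       # True

# Side effect of the workaround: it accepts every combination of the
# listed bytes, so it also matches other sequences such as \352\202\230.
print(matches_bytewise(workaround, b"text_\352\202\230.txt"))  # True
```

The last line is a reminder that the per-byte pattern is looser than the
two-character set it stands in for, which is one more reason it is only
a stopgap.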