On 2019-09-02 at 18:29:56, Junio C Hamano wrote: > Phillip Wood <phillip.wood123@xxxxxxxxx> writes: > > > Being picking I'll point out that ':' is not a valid in refs > > either. Looking at > > https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file I > > think only " and | are not allowed on NTFS/FAT but are valid in refs > > (see the man page for git check-ref-format for all the details). So > > the main limitation is actually what git allows in refs. > > Yeah, trying to use the contents of the log message without > sufficient sanitization is looking for trouble. > > >> for (p1 = label.buf; *p1; p1++) > >> - if (isspace(*p1)) > >> + if (!(*p1 & 0x80) && !isalnum(*p1)) > >> *(char *)p1 = '-'; While we're thinking of things that could go wrong, note that it's also possible for the commit message to contain non-UTF-8 characters (if the user is using a non-UTF-8 encoding), which will cause sadness on Windows and macOS. Non-Mac Unix systems aren't a problem here, but then again, they aren't the reason for this patch. > > I'm sightly concerned that this opens the possibility for unexpected > > effects if two different labels get sanitized to the same string. I > > suspect it's unlikely to happen in practice but doing something like > > percent encoding non-alphanumeric characters would avoid the problem > > entirely. > > I see there are "lets make sure it is unique by suffixing "-%d" in > other codepaths; would that help if this piece of code yields a > label that is not unique? I was thinking the same thing. Since we're being much less lenient on what's allowed (which is fine), we're at increased risk for collision. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204
Attachment:
signature.asc
Description: PGP signature