Hi Junio, On Sun, 14 Aug 2016, Junio C Hamano wrote: > Johannes Schindelin <johannes.schindelin@xxxxxx> writes: > > > - test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g') > > + test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g' -e 'y/>/_/') > > The existing sed scriptlet says "we cannot have slash and do not > want to have space in filename, so we squash runs of them to a > single underscore". If you have more characters that you do not > want, you should add that to the existing set instead. No, we cannot do that. I even mentioned it in my commit message why: We have to take particular care not to confound the existing conversion of unwanted characters to underscores with the new substitution of '>': the existing conversion chose to collapse runs of multiple unwanted characters into a single underscore. If we allowed '>' to be collapsed, too, the file name generated from the command "diff [...]=-- [...]" would be identical to the one generated from "diff [...]=--> [...]". And as there is exactly this case (two command-lines, differing only in the '>' character), doing what you suggested would *break* the test since it would now look at the wrong file. I know that this is so because my first iteration of the patch did exactly what you suggested. > While you are at it, it may be sensible to add a colon to that set, too, > no? > > Something like this, perhaps? > > > - test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g') > > + test=$(echo "$cmd" | sed -e 's|[/ <:>][/ <:>]*|_|g') Maybe. But what other characters are missing? Those are not the only ones illegal on Windows: [/ \0-\x1f"*:<>?\\|] would be a more complete version of what you suggested. Except that it is not a valid basic regex. But is that really necessary? I think not, for the following reasons: - my patch was specifically designed to stop my CI from pestering me about a totally broken revision that cannot even be checked out (and causes subsequent problems because of it), - as such, my patch was meant not to be an all-encompassing solution to the problem of filenames that are illegal on Windows, but really a tiny patch that you could apply as quickly as possible so that my CI jobs would stop pestering me (which they did not, because `pu` is still broken), - I even meant this little patch to be so small that it can be easily squashed into Jacob's patch, - I do not want to complicate regular expressions unless *really* needed (and you have to admit that we do not need to address any more characters than the '>' *right now*), as - regular expressions are not exactly an easy meal, so it makes sense to keep them as simple as possible both for contributors' as well as for maintainers' sake, and - when trying to come up with a super-complete solution, it is really easy not only to spend waaaaay too much time on it, but also to introduce bugs that are not spotted for a loooong time because nothing actually exercises the newly introduced code, and - If It Ain't Broke, Don't Fix It. - the broader solution must come separately, and not as a mere add-on to one test case: we really need to ensure that such file names do not enter Git's source code. I will send my proposal to address the larger issue in a moment, in the meantime I *beg* you to add this here patch as a quick fix to my CI woes, either by squashing it into the indicated commit, or by appending it to the topic branch. I do not care which one, as long as `pu` gets fixed. Thanks, Dscho -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html