Feature request: better error messages when UTF-8 bites

Hi;

Just found an annoyance in `git log` (and likely elsewhere) that may warrant a change:

Somehow, when copying and pasting a commit from a website to the command line, a UTF-8 Byte Order Mark (BOM) [https://en.wikipedia.org/wiki/Byte_order_mark] was appended to one of the commit IDs. BOMs are invisible, as are many other Unicode code points. The upshot was that Git didn't like it, and complained bitterly:

$ strace -etrace=execve -s 200 git diff 038179704f0066aa815d5429221cf381ff4ef289 47346a462d8ba40b9a8b073e351c362522c46aa6

execve("/usr/bin/git", ["git", "diff", "038179704f0066aa815d5429221cf381ff4ef289\357\273\277", "47346a462d8ba40b9a8b073e351c362522c46aa6"], 0x7fffec3c4bb0 /* 80 vars */) = 0

fatal: ambiguous argument '038179704f0066aa815d5429221cf381ff4ef289': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
+++ exited with 128 +++

Feature request:
================

When printing "fatal: ambiguous argument '......': ....", perhaps escape (URL-encode or otherwise) the ambiguous argument in the error message, or add a sentence noting that non-ASCII bytes were found in it.
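To make the escaping idea concrete, here is a minimal sketch in Python (Git itself is written in C, so this only illustrates the output format and is not a patch). The function name escape_arg is hypothetical and does not exist in Git. It assumes the simplest policy: leave printable ASCII alone and print every other byte as a three-digit octal escape, the same notation strace uses above. It does no encoding validation at all.

```python
def escape_arg(raw: bytes) -> str:
    """Render raw argument bytes for an error message.

    Hypothetical helper, not part of Git. Printable ASCII passes
    through unchanged; every other byte becomes a \\ooo octal escape.
    """
    out = []
    for b in raw:
        if b == 0x5C:
            # Double the backslash so the escaped form stays unambiguous.
            out.append("\\\\")
        elif 0x20 <= b <= 0x7E:
            out.append(chr(b))
        else:
            out.append("\\%03o" % b)
    return "".join(out)


# The argument from the strace output above: a commit ID followed by a BOM.
arg = b"038179704f0066aa815d5429221cf381ff4ef289\xef\xbb\xbf"
print("fatal: ambiguous argument '%s'" % escape_arg(arg))
```

With that, the stray bytes would show up in the message as \357\273\277 instead of being invisible, and a plain ASCII argument would print exactly as it does today.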

This is sort of a difficult corner-case, in that it is perfectly legal to have UTF-8 characters in a branch or tag name (see git-check-ref-format for the allowed characters), so someone could indeed create a branch named "038179704f0066aa815d5429221cf381ff4ef289\357\273\277" if they were a tortured soul bent on overthrowing polite society. Rejecting input because it has bytes with values above \177 is therefore not a solution.

Similarly, scanning the input for invisible Unicode characters (or even invalid UTF-8 sequences) is leaning too far the other way: Git should not be validating character encodings. It should stay encoding-neutral, as the alternative leads to madness, driving developers into becoming tortured souls bent on rigidly enforcing polite society. We have enough of those already.

It's unclear whether violent overthrow or rigid enforcement is the lesser of two evils, but let's not perform the experiment to find out. :-)

Cheers!

--
CH (ch-and-git.vger.kernel.org@xxxxxxxxxx)


