[PATCH v4] Extend sample pre-commit hook to check for non ascii filenames

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



At the moment non-ascii encodings of filenames are not portably
converted between different filesystems by git. This will most likely
change in the future but to allow repositories to be portable among
different file/operating systems this check is enabled by default.

Signed-off-by: Heiko Voigt <hvoigt@xxxxxxxxxx>
---

Thanks for all comments. I now hopefully have a satisfying patch.


On Mon, May 18, 2009 at 12:40:09PM +0200, Johannes Sixt wrote:
> Heiko Voigt schrieb:
> > +	if ! git diff --cached --name-only --diff-filter=A -z \
> > +	   | tr "\0" "\n" | is_ascii; then
> 
> Will this not fail to add more than one file with allowed names? The \n is
> not removed in is_ascii(), and so the resulting string will not be empty.
> 
> BTW, not all tr work well with NULs. See the commit message of e85fe4d8,
> for example. Otherwise, I would have suggested to convert the NUL to some
> allowed ASCII character, e.g. 'A'. BTW, you should really use '\0' and
> '\n' (single-quotes) to guarantee that the shell does not ignore the
> backslash.

I removed all \0 characters and hopefully use the correct platform
independent syntax as described in the commits you send.


On Mon, May 18, 2009 at 02:04:08PM +0200, Johannes Sixt wrote:
> Heiko Voigt schrieb:
> > Are there any problems with '\0' and tr other than swallowing of it.
> 
> I can't tell. But the commits ae90e16..aab0abf are interesting to study in
> w.r.t. portability.
> 
> > In
> > case not I would just change
> > 
> > 	tr "\0" "\n"
> > to
> >   	tr -d '\0'
> 
> In which case I'd suggest that you call tr only once, in isascii():
> 
>      tr -d '[ -~]\0'

After reading a little about the portability things. This seems to be
the right way and is now included.


On Mon, May 18, 2009 at 07:42:31AM -0700, Junio C Hamano wrote:
> Heiko Voigt <hvoigt@xxxxxxxxxx> writes:
> 
> > +if [ "$allownonascii" != "true" ]
> > +then
> > +	# until git can handle non-ascii filenames gracefully
> > +	# prevent them to be added into the repository
> 
> I think you can inline your is_ascii shell function in the pipeline below.
> You made it a separate function and I agree that it has a very good
> documentation value, but the mention of "non-ascii filenames" in this
> comment here is enough clue to let anybody know what is going on.

I agree. I thought it would probably be useful in other places but we
just need it once so its inlined now.

> 
> 	Side note: I am not sure "Until ... can ... gracefully" is a good
> 	description of the general problem.  It probably is more neutral
> 	to say "Cross platform projects tend to avoid non-ascii filenames;
>         prevent them from being added to the repository."

Changed that.

> 
> > +	if ! git diff --cached --name-only --diff-filter=A -z \
> > +	   | tr "\0" "\n" | is_ascii; then
> 
> A standard trick while writing a long pipeline in shell is to change line
> after a pipe, like:
> 
> 	cmd1 |
>         cmd2 |
>         cmd3
> 
> which allows you to lose the BS-before-LF sequence.

Wasn't aware of that. Changed it accordingly.


On Mon, May 18, 2009 at 09:35:19PM +0100, Julian Phillips wrote:
> On Mon, 18 May 2009, Heiko Voigt wrote:
>> +		echo "Error: Preventing to add a non-ascii filename."
>
> This would read better as:
>
> +		echo "Error: Attempt to add a non-ascii filename."
>
> (after all the prevention itself is a result of the error, not the cause  
> of it)

That really sounds better. Thanks.

 templates/hooks--pre-commit.sample |   25 +++++++++++++++++++++++++
 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/templates/hooks--pre-commit.sample b/templates/hooks--pre-commit.sample
index 0e49279..ad892a2 100755
--- a/templates/hooks--pre-commit.sample
+++ b/templates/hooks--pre-commit.sample
@@ -7,6 +7,31 @@
 #
 # To enable this hook, rename this file to "pre-commit".
 
+# If you want to allow non-ascii filenames set this variable to true.
+allownonascii=$(git config hooks.allownonascii)
+
+# Cross platform projects tend to avoid non-ascii filenames; prevent
+# them from being added to the repository. We exploit the fact that the
+# printable range starts at the space character and ends with tilde.
+if [ "$allownonascii" != "true" ] &&
+	test "$(git diff --cached --name-only --diff-filter=A -z |
+	  LC_ALL=C tr -d '[ -~]\0')"
+then
+	echo "Error: Attempt to add a non-ascii filename."
+	echo
+	echo "This can cause problems if you want to work together"
+	echo "with people on other platforms than you."
+	echo
+	echo "To be portable it is adviseable to rename the file ..."
+	echo
+	echo "If you know what you are doing you can disable this"
+	echo "check using:"
+	echo
+	echo "  git config hooks.allownonascii true"
+	echo
+	exit 1
+fi
+
 if git-rev-parse --verify HEAD 2>/dev/null
 then
 	against=HEAD
-- 
1.6.3

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]