Re: git on MacOSX and files with decomposed utf-8 file names

On Wed, 16 Jan 2008, Linus Torvalds wrote:
> 
> Does it always matter? Hell no. But the problem with a filesystem that 
> thinks it knows better is that when it *sometimes* matters, the filesystem 
> simply DOES THE WRONG THING.
> 
> Can't you understand that?

Side note: there are ways to do it right.

You can:

 - Not do conversion at all (which is always right). Not corrupting the 
   user data means that the user never gets something back that he didn't 
   put in.

   (And, btw, the "security" argument is total BS. The fact that two 
   characters look the same does not mean that they should act the same, 
   and it is *not* a security feature. Quite the reverse. Having programs 
   that get different results back from what they actually wrote, *that* 
   tends to be a security issue, because now you have a confused program, 
   and I guarantee that there are more bugs in unexpected cases than in 
   the expected ones)

 - Not accept data in formats that you don't like. This is also always 
   right, but can be rather impolite.

 - Not accept data in formats that you don't like, and give people 
   explicit conversion and comparison routines so that they can then make 
   their own decisions and they are *aware* of the conversion (so that 
   they don't come back to the problem of being confused)

So there are certainly many ways to handle things like this.
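
As an illustration of the third option, here is a minimal sketch in Python. 
It assumes NFC is the one form the interface accepts; that choice, and the 
names to_accepted_form, same_name and store, are hypothetical and only there 
for the example. The point is that conversion and comparison happen only 
when the caller asks for them.

```python
import unicodedata

ACCEPTED_FORM = "NFC"  # assumption: the single form this interface accepts


def to_accepted_form(name):
    """Explicit conversion: the caller requests it and sees the result."""
    return unicodedata.normalize(ACCEPTED_FORM, name)


def same_name(a, b):
    """Explicit comparison: equal after normalization, even if the
    underlying code points differ."""
    return to_accepted_form(a) == to_accepted_form(b)


def store(names, name):
    """Refuse anything not already in the accepted form.

    Nothing is converted silently: the caller either passes an accepted
    name or gets an error and decides what to do next.
    """
    if to_accepted_form(name) != name:
        raise ValueError("name is not in %s: %r" % (ACCEPTED_FORM, name))
    names.add(name)
    return name


composed = "M\u00e4rchen"     # U+00E4, one precomposed code point
decomposed = "Ma\u0308rchen"  # 'a' followed by U+0308 combining diaeresis
```

With these definitions the two strings compare unequal, same_name() reports 
them as the same name, and store() accepts the composed one while raising 
ValueError for the decomposed one. What comes back is always exactly what 
went in.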

The one thing you shouldn't do is to silently convert data behind the 
program's back, without even giving any way to disable it (and that disable 
has to be on a use-by-use basis, not some "disable/enable for all users of 
this filesystem", because you can - and do - have different programs that 
have different expectations).
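
A toy model of that failure, sketched in Python. The class below is 
hypothetical and is not how any real filesystem is implemented; it only 
assumes that names are decomposed on creation (approximated here with NFD) 
and that the caller has no way to opt out.

```python
import unicodedata


class SilentlyNormalizingDir:
    """Toy directory that decomposes every name on create.

    Hypothetical model for illustration only: there is deliberately no
    flag to turn the conversion off.
    """

    def __init__(self):
        self._names = []

    def create(self, name):
        # The conversion happens here, behind the caller's back.
        self._names.append(unicodedata.normalize("NFD", name))

    def listdir(self):
        return list(self._names)


tracked = "M\u00e4rchen"  # the name the program wrote (precomposed)

d = SilentlyNormalizingDir()
d.create(tracked)
listing = d.listdir()

# A program that compares names exactly now sees two surprises:
missing = tracked not in listing                  # its own file is "gone"
unknown = [n for n in listing if n != tracked]    # and a "new" file appeared
```

Here missing is True and unknown holds the decomposed spelling of the same 
name. A program that tracks files by the exact name it wrote ends up 
believing its file has vanished and an unrelated one has shown up.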

And finally: all of the above is true at *all* levels. It doesn't matter 
one whit whether the automatic conversion is in the kernel or 
in a library. Doing it on a library level has advantages (namely the whole 
"disable/enable" thing tends to get *much* easier to do, and applications 
can decide to link against a particular version to get the behaviour 
*they* want, for example).
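
A sketch of what a use-by-use switch might look like at the library level, 
in Python. The lookup function and its form parameter are hypothetical; the 
assumption is simply that each call states whether it wants exact matching 
or matching under a named normalization form.

```python
import unicodedata


def lookup(names, wanted, form=None):
    """Return the entries of `names` that match `wanted`.

    form=None compares names exactly as given. Passing a normalization
    form such as "NFC" or "NFD" normalizes both sides for this call only,
    so different callers can make different choices over the same data.
    """
    if form is None:
        return [n for n in names if n == wanted]
    key = unicodedata.normalize(form, wanted)
    return [n for n in names if unicodedata.normalize(form, n) == key]


names = ["M\u00e4rchen", "Ma\u0308rchen"]  # precomposed and decomposed

exact = lookup(names, "M\u00e4rchen")              # exact match only
folded = lookup(names, "M\u00e4rchen", form="NFC")  # both spellings match
```

The stored names are never rewritten. A caller that needs exact semantics 
gets one match, and a caller that wants the two spellings treated alike 
asks for that explicitly and gets both.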

So doing it inside the kernel is just about the worst possible case, 
exactly because it makes it really hard to do on a "case-by-case" basis. 

Yes, Linux does it too, but it does it only for filesystems that are 
*defined* to be insane. OS X really should have known better. Especially 
since they already fixed the applications (ie they do allow for 
case-sensitive filesystems).

I can understand normalization when it's about case-insensitivity (there 
are lots of _technical_ reasons to do it there), but once you let the 
case-insensitivity go, there just isn't any excuse any more.

			Linus