Re: Cross-Platform Version Control

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On Wed, 13 May 2009, Linus Torvalds wrote:
>
> Of course, it probably makes sense to have a whole "git_readdir()" that 
> does this thing in general.

Actually, the more I think about that, the less true I think it is.

It _sounds_ like a nice simplification ("just do it once in readdir, and 
forget about it everywhere else"), but it's in fact a stupid thing to do.

Why?

If we _ever_ want to fix this in the general case, then the code that does 
the readdir() will actually have to remember both the "raw filesystem" 
form _and_ the "cleaned-up utf-8 form".

Why? Because when we do readdir(), we'll also do 'lstat()' on the end 
result to check the types, and opendir() in case it's a directory and we 
then want to do things recursively etc. And that happens to work on OS X 
(because we can use our "fixed" filename for lstat too), but it does not 
work in the general case.

And you can say "well, just do the stat inside the wrapped readdir()", but 
that doesn't work _either_, since

 - we don't want to do the lstat() if it's unnecessary. Even if we don't 
   have "de->d_type" information, we can often avoid the need for it, if 
   we can tell that the name isn't interestign (due to being ignored).

   Avoiding the lstat is a huge performance issue for cold-cache cases. 
   It's basically a seek.

   So we really want to do the lstat() later, which implies that the 
   caller needs to know _both_ the original "real" filesystem name _and_ 
   the converted one.

 - it doesn't handle the opendir() case anyway - so the end result is that 
   a real implementation will _always_ need to carry around both the 
   "filesystem view" filename _and_ the "what we've converted it into".

Now, the point of the patch I sent out was that for the specific case of 
OS X, which does UTF-8 conversions (wrong) but also is happy to get our 
properly normalized name, we don't care. So my patch is "correct" for that 
special case - and so would a plain readdir() wrapper be.

But my patch is _also_ correct for the case where a readdir() wrapper 
would do the wrong thing. My patch doesn't _handle_ it (since it doesn't 
change the code to pass both "filesystem view" and "cleaned-up view" 
pathnames), but the patch I sent out also doesn't make it any harder to do 
right.

In contrast, doing a readdir() wrapper makes it much harder to do right 
later, because it's just doing the conversion at the wrong level (you 
could make that "wrapper" return both the original and the fixed 
filename, but at that point the wrapper doesn't really help - you might 
as well just have the "convert" function, and it would be a hell of a lot 
more obvious what is really going on).

So I take it back. A readdir() wrapper is not a good idea. It gets us a 
tiny bit of the way, but it would actually take us a step back from the 
"real" solution.

			Linus
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]