On Wed, 13 May 2009, Linus Torvalds wrote: > > Of course, it probably makes sense to have a whole "git_readdir()" that > does this thing in general. Actually, the more I think about that, the less true I think it is. It _sounds_ like a nice simplification ("just do it once in readdir, and forget about it everywhere else"), but it's in fact a stupid thing to do. Why? If we _ever_ want to fix this in the general case, then the code that does the readdir() will actually have to remember both the "raw filesystem" form _and_ the "cleaned-up utf-8 form". Why? Because when we do readdir(), we'll also do 'lstat()' on the end result to check the types, and opendir() in case it's a directory and we then want to do things recursively etc. And that happens to work on OS X (because we can use our "fixed" filename for lstat too), but it does not work in the general case. And you can say "well, just do the stat inside the wrapped readdir()", but that doesn't work _either_, since - we don't want to do the lstat() if it's unnecessary. Even if we don't have "de->d_type" information, we can often avoid the need for it, if we can tell that the name isn't interestign (due to being ignored). Avoiding the lstat is a huge performance issue for cold-cache cases. It's basically a seek. So we really want to do the lstat() later, which implies that the caller needs to know _both_ the original "real" filesystem name _and_ the converted one. - it doesn't handle the opendir() case anyway - so the end result is that a real implementation will _always_ need to carry around both the "filesystem view" filename _and_ the "what we've converted it into". Now, the point of the patch I sent out was that for the specific case of OS X, which does UTF-8 conversions (wrong) but also is happy to get our properly normalized name, we don't care. So my patch is "correct" for that special case - and so would a plain readdir() wrapper be. But my patch is _also_ correct for the case where a readdir() wrapper would do the wrong thing. My patch doesn't _handle_ it (since it doesn't change the code to pass both "filesystem view" and "cleaned-up view" pathnames), but the patch I sent out also doesn't make it any harder to do right. In contrast, doing a readdir() wrapper makes it much harder to do right later, because it's just doing the conversion at the wrong level (you could make that "wrapper" return both the original and the fixed filename, but at that point the wrapper doesn't really help - you might as well just have the "convert" function, and it would be a hell of a lot more obvious what is really going on). So I take it back. A readdir() wrapper is not a good idea. It gets us a tiny bit of the way, but it would actually take us a step back from the "real" solution. Linus -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html