> Date: Mon, 15 Oct 2007 20:45:02 -0400 (EDT)
> From: Daniel Barkalow <barkalow@xxxxxxxxxxxx>
> cc: Alex Riesen <raa.lkml@xxxxxxxxx>, Johannes.Schindelin@xxxxxx, ae@xxxxxx,
>     tsuna@xxxxxxxxxxxxx, git@xxxxxxxxxxxxxxx, make-w32@xxxxxxx
>
> I believe the hassle is that readdir doesn't necessarily report a README in
> a directory which is supposed to have a README, when it has a readme
> instead.

Sorry if I'm asking potentially stupid questions out of ignorance: why
would you want readdir to return `README' when you have `readme'?

> I think we want O(n) comparison of sorted lists, which doesn't
> work if equivalent names don't sort the same.

Your comparison function should be case-insensitive on Windows, or am I
missing something?

> > > - no acceptable level of performance in filesystem and VFS (readdir,
> > >   stat, open and read/write are annoyingly slow)
> >
> > With what libraries? Native `stat' and `readdir' are quite fast.
> > Perhaps you mean the ported glibc (libgw32c), where `readdir' is
> > indeed painfully slow, but then you don't need to use it.
>
> We want getting stat info, using readdir to figure out what files exist,
> for 106083 files in 1603 directories with a hot cache to take under 1s;
> otherwise "git status" takes a noticeable amount of time with a medium-big
> project, and we want people to be able to get info on what's changed
> effectively instantly. My impression is that Windows' native stat and
> readdir are plenty fast for what normal Windows programs want, but we
> actually expect reasonable performance on an unreasonably-big
> metadata-heavy input.

If that's the issue, then it's not a good idea to call `stat' and
`readdir' on Windows at all.  `stat' is a single system call on Posix
systems, while on Windows it usually needs to go out of its way, calling
half a dozen system services to gather the `struct stat' info.
You need to call something like FindFirstFile, which can do the job of
`stat' and `readdir' together (and of `fnmatch', if you need to filter
only some files) in one go.  I don't know whether this will scan 100K
files in under one second (maybe I will try it one of these days), but
it will definitely be faster than `readdir'+`stat', by maybe as much as
an order of magnitude.

> > > - no real "mmap" (which kills performance and complicates code)
> >
> > You only need mmap because you are accustomed to use it on GNU/Linux.
>
> I believe the need here is quick setup and fast access to sparse portions
> of several 100M files. It's hard to beat a page fault for read speed.

If you need memory-mapped files, they are available on Windows.  I
thought the original comment about `mmap' was because it was used to
allocate memory, not to read files into memory.

> We also expect to be able to make a sequence of file system operations
> such that programs starting at any time see the same database as the files
> containing the database get restructured.

Sorry, I don't understand this; please tell me more about the operations,
the ``same database'' issue (what database?), and what you mean by
``the files containing the database get restructured''.

> A unixy pipeline was convenient

Windows supports pipelines with almost 100% the same functionality as
Posix.  Again, perhaps I'm missing something.
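
To make the FindFirstFile suggestion above a bit more concrete, here is
a minimal sketch in Python (an illustration only, not code from git).
It assumes Python's `os.scandir', which on Windows is built on
FindFirstFile/FindNextFile, so each directory entry already carries the
size and timestamps and `entry.stat()' normally needs no further system
call there; on Posix systems the same call still costs one lstat per
file.  The helper name `scan_tree' is hypothetical.

```python
import os


def scan_tree(root):
    """Return {path: (size, mtime)} for every regular file under root.

    Hypothetical helper: it lists each directory exactly once and takes
    the metadata from the directory entry itself instead of making a
    separate path-based stat call for every file name.
    """
    found = {}
    pending = [root]  # directories still to be scanned
    while pending:
        directory = pending.pop()
        with os.scandir(directory) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    pending.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    # On Windows this is normally answered from the data
                    # the directory scan already returned; on Posix it
                    # performs one lstat and caches the result.
                    info = entry.stat(follow_symlinks=False)
                    found[entry.path] = (info.st_size, info.st_mtime)
    return found
```

Calling `scan_tree(".")' at the top of a work tree returns one entry per
regular file; timing that call on a tree of the size quoted above would
be one way to check the sub-second goal, and I make no claim here about
what the result would be.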