On Sun, Dec 17, 2006 at 01:51:38PM -0800, Ulrich Drepper wrote:
> Matthew Wilcox wrote:
> >I know that the rsync load is a major factor on kernel.org right now.
>
> That should be quite easy to quantify then.  Move the readdir and stat
> call next to each other in the sources, pass the struct stat around if
> necessary, and then count the stat calls which do not originate from the
> stat following the readdir call.  Of course we'll also need the actual
> improvement which can be achieved by combining the calls.  Given the
> inodes are cached, is there more overhead than finding the right inode?
> Note that if rsync doesn't already use fstatat() it should do so and
> this means then that there is no long file path to follow, all file
> names are local to the directory opened with opendir().
>
> My gut feeling is that the improvements are minimal for normal (not
> cluster etc) filesystems and hence the improvements for kernel.org would
> be minimal.

I don't think the overhead of finding the right inode or the system
calls themselves makes a difference at all. E.g. the rsync numbers I
listed spend more than 0.3ms per stat syscall. That kind of time is not
spent looking up kernel data structures - it's spent doing IO.

The part that I think is important (and please correct me if I've
gotten it all wrong) is to do the IO in parallel. This applies to both
local filesystems and clustered filesystems, although it would probably
be much more significant for clustered filesystems since they would
typically have longer latency for each roundtrip. Today there is no good
way for an application to stat many files in parallel. You could do it
through threading, but with significant overhead and complexity.
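To make the threading option concrete, here is a minimal sketch (not a
benchmark, and not how rsync does it) of the two userspace approaches:
one stat per entry in a single thread, and the same stats spread over a
thread pool. The function names and the `workers` parameter are made up
for illustration. It assumes Linux and Python 3, where passing `dir_fd`
to `os.stat` gives an fstatat()-style lookup relative to the opened
directory.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def stat_serial(path):
    """List a directory, then stat each entry one after another."""
    # Hypothetical helper: open the directory once so every stat is
    # relative to it (no long path to walk for each file).
    dir_fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        names = os.listdir(dir_fd)
        return {
            name: os.stat(name, dir_fd=dir_fd, follow_symlinks=False)
            for name in names
        }
    finally:
        os.close(dir_fd)


def stat_threaded(path, workers=16):
    """Same result as stat_serial, but the stats are issued from a pool
    of threads so that several may be waiting on IO at the same time."""
    dir_fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        names = os.listdir(dir_fd)

        def stat_one(name):
            # os.stat releases the interpreter lock while in the syscall,
            # so the threads can block on IO concurrently.
            return name, os.stat(name, dir_fd=dir_fd, follow_symlinks=False)

        with ThreadPoolExecutor(max_workers=workers) as pool:
            return dict(pool.map(stat_one, names))
    finally:
        os.close(dir_fd)
```

Timing both functions on a cold cache would give a rough idea of how
much parallel IO alone buys. The thread pool is also a fair picture of
the overhead and complexity: a pool of workers just to keep several
stat calls in flight.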

I'm curious what results one would get by comparing performance of:
* application doing readdir and then stat on every single file
* application doing readdirplus
* application doing readdir and then stat on every file using a lot of
  threads or an asynchronous stat interface

As far as parallel IO goes, I would think that async stat would be
nearly as fast as readdirplus? For the clustered filesystem case there
may be locking issues that make readdirplus faster?

-- 
Ragnar Kjørstad
Software Engineer
Scali - http://www.scali.com
Scaling the Linux Datacenter