Re: NFSv4/pNFS possible POSIX I/O API standards

At 07:57 PM 12/17/2006, Ragnar Kjørstad wrote:
On Sun, Dec 17, 2006 at 01:51:38PM -0800, Ulrich Drepper wrote:
> Matthew Wilcox wrote:
> >I know that the rsync load is a major factor on kernel.org right now.
>
> That should be quite easy to quantify then.  Move the readdir and stat
> call next to each other in the sources, pass the struct stat around if
> necessary, and then count the stat calls which do not originate from the
> stat following the readdir call.  Of course we'll also need the actual
> improvement which can be achieved by combining the calls.  Given the
> inodes are cached, is there more overhead than finding the right inode?
>  Note that if rsync doesn't already use fstatat() it should do so and
> this means then that there is no long file path to follow, all file
> names are local to the directory opened with opendir().
>
> My gut feeling is that the improvements are minimal for normal (not
> cluster etc) filesystems and hence the improvements for kernel.org would
> be minimal.
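
For illustration, the readdir-then-fstatat() pattern described above can be sketched as follows. This is a minimal sketch, not rsync's code: the function name stat_dir_entries is hypothetical, and it assumes a Linux system where Python's os.stat() accepts dir_fd (which maps to fstatat()) and os.listdir() accepts a directory file descriptor.

```python
import os

def stat_dir_entries(path):
    """Return {name: os.stat_result} for every entry directly under path.

    Hypothetical helper: each name is resolved relative to the already
    open directory, so no long path has to be walked for each stat.
    """
    results = {}
    dir_fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # Listing by file descriptor is the readdir step.
        for name in os.listdir(dir_fd):
            # dir_fd makes this an fstatat()-style call; symlinks are
            # not followed, so the link itself is described.
            results[name] = os.stat(name, dir_fd=dir_fd, follow_symlinks=False)
    finally:
        os.close(dir_fd)
    return results
```

Counting how often stat is reached from anywhere other than this loop would give the number asked for above.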

I don't think the overhead of finding the right inode or the system
calls themselves makes a difference at all. E.g. the rsync numbers I
listed spend more than 0.3ms per stat syscall. That kind of time is not
spent in looking up kernel datastructures - it's spent doing IO.

The part that I think is important (and please correct me if I've
gotten it all wrong) is to do the IO in parallel. This applies both to
local filesystems and clustered filesystems, although it would probably
be much more significant for clustered filesystems since they would
typically have longer latency for each roundtrip.  Today there is no good
way for an application to stat many files in parallel. You could do it
through threading, but with significant overhead and complexity.

I'm curious what results one would get by comparing performance of:
* application doing readdir and then stat on every single file
* application doing readdirplus
* application doing readdir and then stat on every file using a lot of
  threads or an asynchronous stat interface
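
The first and third variants in the list above could be compared with something like the sketch below. The helper names (stat_serial, stat_threaded, timed) and the worker count are assumptions made for illustration. readdirplus is left out because there is no portable userspace call for it, and any timing only means something on a cold cache, where the stat calls actually wait on IO.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor

def stat_serial(dirpath):
    """readdir, then one lstat per entry, one after another."""
    return {name: os.lstat(os.path.join(dirpath, name))
            for name in os.listdir(dirpath)}

def stat_threaded(dirpath, workers=32):
    """readdir, then lstat calls issued from a pool of threads.

    The worker count is an arbitrary illustrative default. Python
    releases the GIL around the lstat syscall, so the calls can
    overlap while they wait on IO.
    """
    names = os.listdir(dirpath)
    paths = [os.path.join(dirpath, n) for n in names]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(names, pool.map(os.lstat, paths)))

def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start
```

Usage would be along the lines of `_, t1 = timed(stat_serial, path)` and `_, t2 = timed(stat_threaded, path)`, run against separate cold-cache directories of similar size.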

We have done something similar to what you suggest.
We wrote a parallel file tree walker to run on clustered file systems
that spread the file system metadata out over multiple disks. The
program parallelizes the stat operations across multiple nodes (via
MPI). We needed to walk a tree with about a hundred million files in a
reasonable amount of time. We cut the time from dozens of hours to less
than an hour. We were able to keep all the metadata RAIDs/disks much
busier doing the work for the stat operations. We have used this on two
different clustered file systems with similar results. In both cases,
it scaled with the number of disks the metadata was spread over, not
quite linearly, but it was a huge win for these two file systems.
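
To make the idea concrete, here is a much reduced sketch of a parallel tree walk. It is not the program described above: it uses a thread pool on a single node instead of MPI across nodes, and the names scan_dir and walk_parallel and the default worker count are hypothetical. The point it shows is only that each directory becomes an independent unit of work, so many stat operations can be outstanding at once.

```python
import os
import stat
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

def scan_dir(path):
    """Stat every entry in one directory.

    Returns (subdirectories, non_directory_count, non_directory_bytes).
    """
    subdirs, nfiles, nbytes = [], 0, 0
    with os.scandir(path) as entries:
        for entry in entries:
            st = entry.stat(follow_symlinks=False)
            if stat.S_ISDIR(st.st_mode):
                subdirs.append(entry.path)
            else:
                nfiles += 1
                nbytes += st.st_size
    return subdirs, nfiles, nbytes

def walk_parallel(root, workers=16):
    """Walk root, scanning directories concurrently in a thread pool."""
    total_files = total_bytes = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = {pool.submit(scan_dir, root)}
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                subdirs, nfiles, nbytes = fut.result()
                total_files += nfiles
                total_bytes += nbytes
                # Each newly found directory is queued as its own task.
                pending.update(pool.submit(scan_dir, d) for d in subdirs)
    return total_files, total_bytes
```

Symbolic links are not followed, so a link to a directory is counted as a single non-directory entry.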

Gary


As far as parallel IO goes, I would think that async stat would be
nearly as fast as readdirplus?
For the clustered filesystem case there may be locking issues that make
readdirplus faster?


--
Ragnar Kjørstad
Software Engineer
Scali - http://www.scali.com
Scaling the Linux Datacenter

