Re: NFSv4/pNFS possible POSIX I/O API standards

On Wed, Dec 06, 2006 at 10:13:36AM -0800, Ulrich Drepper wrote:
> Ragnar Kjørstad wrote:
> >I guess the code needs to be checked, but I would think that:
> >* ls
> >* find
> >* rm -r
> >* chown -R
> >* chmod -R
> >* rsync
> >* various backup software
> >* imap servers
> 
> Then somebody do the analysis.

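This is by no means a full analysis, but maybe someone will find it
useful anyway. All performance tests were run on a directory tree holding
the lkml archive in maildir format on a local ext3 filesystem. The
numbers are system call wall-clock times as seen through strace. Each
line gives the share of total system call time, the total time, the
average time per call, the number of calls, and the system call.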


I think Andreas already wrote that "ls --color" is the default in most
distributions and needs to stat every file.

ls --color -R kernel_old:
82.27% 176.37s  0.325ms  543332 lstat
17.61%  37.75s  5.860ms    6442 getdents64
 0.04%   0.09s  0.018ms    4997 write
 0.03%   0.06s 55.462ms       1 execve
 0.02%   0.04s  5.255ms       8 poll


"find" is already smart enough to not call stat when it's not needed,
and make use of d_type when it's available. But in many cases stat is
still needed (such as with -user).

find kernel_old -not -user 1002:
83.63% 173.11s  0.319ms  543338 lstat
16.31%  33.77s  5.242ms    6442 getdents64
 0.03%   0.06s 62.882ms       1 execve
 0.01%   0.03s  6.904ms       4 poll
 0.01%   0.02s  8.383ms       2 connect
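
To make the split between d_type and stat concrete, here is a minimal Python sketch of a "find -not -user" style walk. It is not how find is implemented; find_not_user is a hypothetical helper, and it relies on os.scandir, which normally answers the directory test from d_type but has to call lstat to learn the owner. Unlike find, it does not report the root directory itself.

```python
import os


def find_not_user(root, uid):
    """Return (paths, lstat_calls) for entries below root not owned by uid.

    Hypothetical helper for illustration only.
    """
    matches = []
    lstat_calls = 0
    pending = [root]
    while pending:
        directory = pending.pop()
        with os.scandir(directory) as entries:
            for entry in entries:
                # Ownership is not part of the directory entry, so every
                # entry costs one lstat() here, whatever d_type says.
                info = entry.stat(follow_symlinks=False)
                lstat_calls += 1
                if info.st_uid != uid:
                    matches.append(entry.path)
                # The type test is normally answered from d_type; Python
                # falls back to a stat call only if the type is unknown.
                if entry.is_dir(follow_symlinks=False):
                    pending.append(entry.path)
    return sorted(matches), lstat_calls
```

The count returned alongside the paths grows with the number of entries, which matches the one-lstat-per-file pattern in the trace above.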

rm was a false alarm. It only uses stat to check for directories, and
it's already being smart about it, not statting directories with
nlink==2.
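
The nlink==2 trick can be sketched as follows, assuming a traditional Unix filesystem where a directory's link count is 2 plus the number of its subdirectories. Both function names are hypothetical, and some filesystems do not maintain this count (they may report 1), in which case the check gives no information and the caller has to look at the entries.

```python
import os


def subdirs_possible(nlink):
    """Return False only when the link count rules out subdirectories.

    A count of 2 means: the entry in the parent plus the directory's own
    "." entry. Each subdirectory would add one more link through its "..".
    Any other value (including 1 on filesystems that do not keep the
    count) is treated as "cannot tell, assume there may be subdirectories".
    """
    return nlink != 2


def may_contain_subdirs(path):
    """Apply the heuristic to a real directory with a single lstat()."""
    return subdirs_possible(os.lstat(path).st_nlink)
```

When may_contain_subdirs returns False, a recursive tool can treat every entry in that directory as a non-directory without statting it.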


chown uses stat to:
* check for directories / symlinks / regular files
* Only change ownership on files with a specific existing ownership.
* Only change ownership if the requested owner does not match the
  current owner. 
* Different output when ownership is actually changed from when it's
  not necessary (in verbose mode).
* Reset the S_ISUID and S_ISGID bits after setting ownership in some cases.
but it seems the most recent version will not use stat for every file
with typical options:

chown -R rk kernel_old:
93.30% 463.84s  0.854ms  543337 lchown
 6.67%  33.18s  5.151ms    6442 getdents64
 0.01%   0.04s  0.036ms    1224 brk
 0.00%   0.02s  5.830ms       4 poll
 0.00%   0.02s  0.526ms      38 open


chmod needs stat to do things like "u+w", but the current implementation
uses stat regardless of whether it is needed.
chmod -R o+w kernel_old:
62.50% 358.84s  0.660ms  543337 chmod
30.66% 176.05s  0.324ms  543336 lstat
 6.82%  39.17s  6.081ms    6442 getdents64
 0.01%   0.05s 54.515ms       1 execve
 0.01%   0.05s  0.037ms    1224 brk

chmod -R 0755 kernel_old:
61.21% 354.42s  0.652ms  543337 chmod
30.33% 175.61s  0.323ms  543336 lstat
 8.46%  48.96s  7.600ms    6442 getdents64
 0.01%   0.05s  0.037ms    1224 brk
 0.00%   0.01s 13.417ms       1 execve
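
A rough sketch of how a chmod-like tool could skip the lstat for absolute modes and only pay for it with relative ones. This is not chmod's actual code; apply_mode is a hypothetical helper, it accepts only an octal mode or a single clause such as "o+w" or "u-w", and it ignores the special handling real chmod applies to directories and set-id bits.

```python
import os
import re
import stat

# Permission bits for each "who" letter of a symbolic clause.
_BITS = {
    "u": {"r": stat.S_IRUSR, "w": stat.S_IWUSR, "x": stat.S_IXUSR},
    "g": {"r": stat.S_IRGRP, "w": stat.S_IWGRP, "x": stat.S_IXGRP},
    "o": {"r": stat.S_IROTH, "w": stat.S_IWOTH, "x": stat.S_IXOTH},
}


def apply_mode(path, spec):
    """Apply spec to path; return True if an lstat() was needed."""
    if re.fullmatch(r"[0-7]{1,4}", spec):
        # Absolute mode: the new bits do not depend on the old ones.
        os.chmod(path, int(spec, 8))
        return False
    clause = re.fullmatch(r"([ugo]+)([+-])([rwx]+)", spec)
    if clause is None:
        raise ValueError("unsupported mode: " + spec)
    who, op, perms = clause.groups()
    mask = 0
    for w in who:
        for p in perms:
            mask |= _BITS[w][p]
    # Relative mode: the current bits are required to compute the result.
    current = stat.S_IMODE(os.lstat(path).st_mode)
    os.chmod(path, current | mask if op == "+" else current & ~mask)
    return True
```

With a helper like this, "chmod -R 0755" would issue one chmod per file and no lstat, while "chmod -R o+w" would still need both.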


Seems I was wrong about the imap servers. They (at least dovecot) do not
use a significant amount of time doing stat when opening folders:
84.90%  24.75s  13.137ms    1884 writev
11.23%   3.27s 204.675ms      16 poll
 0.95%   0.28s   0.023ms   11932 open
 0.89%   0.26s   0.022ms   12003 pread
 0.76%   0.22s  12.239ms      18 getdents64
 0.63%   0.18s   0.015ms   11942 close
 0.63%   0.18s   0.015ms   11936 fstat


I don't think any code inspection is needed to determine that rsync
requires stat of every file, regardless of d_type.

Initial rsync:
rsync -a kernel_old copy
78.23% 2914.59s  5.305ms  549452 read
 6.69%  249.17s  0.046ms 5462876 write
 4.82%  179.44s  0.330ms  543338 lstat
 4.57%  170.33s  0.313ms  543355 open
 4.13%  153.79s  0.028ms 5468732 select

rsync on identical directories:
rsync -a kernel_old copy
61.81% 189.27s  0.348ms  543338 lstat
25.23%  77.25s 15.917ms    4853 select
12.72%  38.94s  6.045ms    6442 getdents64
 0.19%   0.57s  0.118ms    4840 write
 0.03%   0.08s  3.736ms      22 open
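
The reason the second run is dominated by lstat is rsync's default quick check, which compares size and modification time for every file before deciding whether to transfer it. A minimal sketch of that comparison, assuming whole-second timestamps and using a hypothetical function name:

```python
import os


def quick_check_unchanged(src, dst):
    """Return True if dst matches src by size and whole-second mtime."""
    # One lstat per source file is unavoidable: neither size nor mtime
    # is available from the directory entry.
    src_info = os.lstat(src)
    try:
        dst_info = os.lstat(dst)
    except FileNotFoundError:
        # Nothing to compare against, so the file must be transferred.
        return False
    return (src_info.st_size == dst_info.st_size
            and int(src_info.st_mtime) == int(dst_info.st_mtime))
```

Since both fields come from the inode, d_type cannot replace this stat; only something that returns attributes together with the directory entries could.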

tar cjgf incremental kernel_backup.tar kernel_old/
67.69% 2463.49s  3.030ms  812948 read
22.94% 834.85s  2.565ms  325471 write
 7.51% 273.45s  0.252ms 1086675 lstat
 0.94%  34.25s  2.658ms   12884 getdents64
 0.35%  12.63s  0.023ms  543370 open

incremental:
81.71% 171.62s  0.316ms  543342 lstat
16.81%  35.32s  2.741ms   12884 getdents64
 1.40%   2.94s  1.930ms    1523 write
 0.04%   0.09s 86.668ms       1 wait4
 0.02%   0.03s 34.300ms       1 execve




> And please an analysis which takes into 
> account that some programs might need to be adapted to take advantage of 
> d_type or non-optional data from the proposed statlite.


d_type may be useful in some cases, but I think mostly as a replacement
for the nlink==2 hacks for directory recursion. There are clearly many
stat-heavy examples that cannot be optimized with d_type.


> Plus, how often are these commands really used on such filesystems?  I'd 
> hope that chown -R or so is a once in a lifetime thing on such 
> filesystems and not worth optimizing for.

I think you're right that chown/chmod are rare and should not be the
main focus. The other examples on my list are probably better. And they
are just examples; there are probably many others as well.

And what do you mean by "such filesystems"? I know this came up in the
context of clustered filesystems, but unless I'm missing something
fundamental here, readdirplus could be just as useful on local
filesystems as on clustered filesystems if it allowed parallel execution
of the getattrs.
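
As a userspace illustration of what "parallel execution of the getattrs" could buy, the sketch below lists a directory and then issues the lstat calls from a thread pool instead of one at a time. This is not the proposed readdirplus interface; readdir_with_attrs and its workers parameter are made up for the example, and whether it is any faster depends on the filesystem and storage underneath.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def readdir_with_attrs(path, workers=16):
    """Return [(name, stat_result), ...] for path, sorted by name.

    The lstat() calls are handed to a thread pool so that several of
    them can be outstanding at once.
    """
    names = sorted(os.listdir(path))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        infos = list(pool.map(
            lambda name: os.lstat(os.path.join(path, name)), names))
    return list(zip(names, infos))
```

A kernel-level readdirplus would get the same effect without one system call per entry, which is the part this sketch cannot show.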

-- 
Ragnar Kjørstad
Software Engineer
Scali - http://www.scali.com
Scaling the Linux Datacenter