Hi,
I have a gluster 3.10 volume with a dir with ~1 million small files in
them, say mounted at /mnt/dir with FUSE, and I'm observing something weird:
When I list and stat them all using rsync, then the lstat() calls that
rsync does are incredibly fast (23 microseconds per call on average,
definitely faster than a network roundtrip between my 3-machine bricks
connected via Ethernet).
But when I try to back them up with the `bup` tool
(
https://github.com/bup/bup), which (at least according to strace) does
the same syscalls as rsync to stat all files, it takes 1700 microseconds
per lstat() call, and the total time to go over all files is 50x higher.
These are my strace statistics:
rsync strace:
$ strace -f -w -c rsync -a --dry-run /mnt/dir /tmp/nonexisting
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
52.35 69.682773 26617 2618 1 select
39.60 52.712292 1056 49907 getdents
8.00 10.655949 11 998102 lstat
0.02 0.022890 12 1900 read
0.01 0.017219 21 829 write
0.01 0.012093 14 868 munmap
0.01 0.006656 11 606 mmap
0.00 0.004168 1389 3 readlink
0.00 0.001796 1796 1 chdir
0.00 0.001019 510 2 clone
0.00 0.000841 44 19 13 open
Took ~50 seconds real time to complete.
bup strace (I interrupted it after a while):
strace: Process 10749 attached
strace: Process 10750 attached
Indexing: 25600 (566 paths/s)
^C
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
89.55 1140.837325 1700 671016 lstat
3.92 49.875376 934 53387 getdents
3.58 45.655985 52238 874 read
2.14 27.293944 789 34588 5799 open
0.57 7.266342 141 51384 llistxattr
0.09 1.090689 42 25692 25692 getxattr
0.06 0.780977 26 29825 1019 close
0.05 0.601806 23 25739 25722 ioctl
0.03 0.373894 1851 202 select
0.00 0.055953 14 3879 brk
real 20m52.150s
user 0m2.608s
sys 0m11.456s
Note I passed `-c -w` to strace to measure wall time of the syscalls
spend, not system CPU time.
Using
time strace -f -c -w ls -lU /mnt/dir > /dev/null
shows that ls is as fast as rsync (also in 50 seconds).
(Aside: I've filed the large number of getdents() as
https://bugzilla.redhat.com/show_bug.cgi?id=1478411 but that's not the
core question of this email.)
My volume info:
# gluster vol info
Volume Name: myvol
Type: Replicate
Volume ID: ...
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.0.0.1:/data/glusterfs/brick
Brick2: 10.0.0.2:/data/glusterfs/brick
Brick3: 10.0.0.3:/data/glusterfs/brick
Options Reconfigured:
changelog.capture-del-path: on
changelog.changelog: on
storage.build-pgfid: on
nfs.disable: on
transport.address-family: inet
client.ssl: on
server.ssl: on
storage.linux-aio: on
performance.io-thread-count: 64
performance.readdir-ahead: on
server.event-threads: 32
client.event-threads: 32
server.outstanding-rpc-limit: 64
cluster.lookup-unhashed: auto
performance.flush-behind: on
performance.strict-write-ordering: off
performance.high-prio-threads: 64
performance.normal-prio-threads: 64
performance.low-prio-threads: 64
performance.write-behind-window-size: 10MB
cluster.ensure-durability: on
performance.lazy-open: yes
cluster.use-compound-fops: off
performance.open-behind: on
features.cache-invalidation: off
performance.quick-read: off
performance.read-ahead: off
performance.stat-prefetch: off
changelog.rollover-time: 1
cluster.self-heal-daemon: enable
Questions:
What could explain why bup's lstat()s are slow and rsync's lstat()s are
fast?
Also, how comes rsync's lstat()s can be faster than a network roundtrip
at all, given that I have all caching disabled (e.g. stat-prefetch: off,
cache-invalidation: off)?
Is there some caching going on? Might bup issue its syscalls in an order
that might destroy this caching, while rsync and ls use a favourable order?
Thanks!
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxxhttp://lists.gluster.org/mailman/listinfo/gluster-users