time du -ksh binno/
3.7G binno/
real 117m45.733s
user 0m1.635s
sys 0m6.430s

time du -ksh binno/
3.7G binno/
real 2m5.595s
user 0m0.767s
sys 0m4.437s
Fop Call Count Avg-Latency Min-Latency Max-Latency
--- ---------- ----------- ----------- -----------
STAT 153 90.72 us 5.00 us 666.00 us
STATFS 3 677.67 us 620.00 us 709.00 us
OPENDIR 149 1213.81 us 519.00 us 28777.00 us
LOOKUP 552 8493.01 us 3.00 us 79689.00 us
READDIRP 3518 5351.76 us 11.00 us 341877.00 us
FORGET 10050351 0 us 0 us 0 us
RELEASE 9062130 0 us 0 us 0 us
RELEASEDIR 5395 0 us 0 us 0 us
------ ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
After the update:
Interval 8 Stats:
%-latency Avg-latency Min-Latency Max-Latency No. of calls Fop
--------- ----------- ----------- ----------- ------------ ----
0.00 0.00 us 0.00 us 0.00 us 2 RELEASEDIR
0.08 118.00 us 113.00 us 123.00 us 2 STATFS
0.13 190.00 us 189.00 us 191.00 us 2 LOOKUP
0.29 422.00 us 422.00 us 422.00 us 2 OPENDIR
99.49 28539.60 us 1698.00 us 48655.00 us 10 READDIRP
0.00 0.00 us 0.00 us 0.00 us 5217 UPCALL
0.00 0.00 us 0.00 us 0.00 us 5217 CI_FORGET
Duration: 22 seconds
Data Read: 0 bytes
Data Written: 0 bytes
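For reference, stats in this format come from Gluster's io-stats translator; whether the dump above was taken on the client or the server, a minimal way to produce a comparable server-side dump (assuming the volume name atlasglust from the volume info further down this thread) would be:

# start per-FOP latency accounting on the bricks
gluster volume profile atlasglust start

# run the workload, e.g. the du test above
time du -ksh binno/

# print cumulative and per-interval stats; each call closes the current interval
gluster volume profile atlasglust info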
I am not sure about the profiling results as I don't fully understand them.
Thanks
Kashif
Hi Kashif,
Thank you for your feedback! Do you have any data on the nature of the performance improvement observed with 3.11 in the new setup?
Adding Raghavendra and Poornima for validation of configuration and help with identifying why certain files disappeared from the mount point after enabling readdir-optimize.
Regards,
Vijay
On 07/11/2017 11:06 AM, mohammad kashif wrote:
Hi Vijay and Experts
I didn't want to experiment with my production setup, so I started a parallel
system with two servers and around 80TB of storage. I first configured it with
gluster 3.8 and had the same lookup performance issue. I then upgraded to 3.11
as you suggested and it made a huge improvement in lookup time. I also did some
more optimizations as suggested in other threads.
Now I am going to update my production servers. I am planning to use the
following optimization options (a sketch of applying them follows the list);
it would be very useful if you could point out any inconsistency or suggest
some other options. My production setup has 5 servers consisting of 400TB of
storage and around 80 million files of varying sizes.
Options Reconfigured:
server.event-threads: 4
client.event-threads: 4
cluster.lookup-optimize: on
cluster.readdir-optimize: off
performance.client-io-threads: on
performance.cache-size: 1GB
performance.parallel-readdir: on
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
auth.allow: 163.1.136.*
diagnostics.latency-measurement: on
diagnostics.count-fop-hits: on
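For reference, a minimal sketch of how these tunables would be applied, assuming the production volume keeps the name atlasglust from the volume info quoted further down in this thread:

# apply each planned tunable with 'gluster volume set' (volume name assumed: atlasglust)
gluster volume set atlasglust server.event-threads 4
gluster volume set atlasglust client.event-threads 4
gluster volume set atlasglust cluster.lookup-optimize on
gluster volume set atlasglust performance.parallel-readdir on
gluster volume set atlasglust performance.md-cache-timeout 600

# review what ended up under 'Options Reconfigured'
gluster volume info atlasglust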
I found that setting cluster.readdir-optimize to 'on' made some files
disappear from the client!
Thanks
Kashif
On Sun, Jun 18, 2017 at 4:57 PM, Vijay Bellur <vbellur@xxxxxxxxxx> wrote:
Hi Mohammad,
A lot of time is being spent addressing metadata calls, as expected. Can you
consider testing with 3.11, which has the md-cache [1] and readdirp [2]
improvements?
Adding Poornima and Raghavendra who worked on these enhancements to
help out further.
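As a rough illustration only (not taken from the linked notes verbatim): the md-cache and readdirp work referenced in [1] and [2] is typically exercised by enabling the caching/invalidation options that also appear in the option list further up in this thread; a sketch, with <VOLNAME> as a placeholder:

# md-cache with upcall-based cache invalidation
gluster volume set <VOLNAME> features.cache-invalidation on
gluster volume set <VOLNAME> features.cache-invalidation-timeout 600
gluster volume set <VOLNAME> performance.cache-invalidation on
gluster volume set <VOLNAME> performance.md-cache-timeout 600
gluster volume set <VOLNAME> performance.stat-prefetch on

# readdirp-side options
gluster volume set <VOLNAME> performance.readdir-ahead on
gluster volume set <VOLNAME> performance.parallel-readdir on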
Thanks,
Vijay
[1] https://gluster.readthedocs.io/en/latest/release-notes/3.9.0/
[2] https://github.com/gluster/glusterfs/issues/166
On Fri, Jun 16, 2017 at 2:49 PM, mohammad kashif <kashif.alig@xxxxxxxxx> wrote:
Hi Vijay
Did you manage to look into the gluster profile logs ?
Thanks
Kashif
On Mon, Jun 12, 2017 at 11:40 AM, mohammad kashif wrote:
Hi Vijay
I have enabled client profiling and used this script
https://github.com/bengland2/gluster-profile-analysis/blob/master/gvp-client.sh
to extract data. I am attaching output files. I don't have
any reference data to compare with my output. Hopefully you
can make some sense out of it.
On Sat, Jun 10, 2017 at 10:47 AM, Vijay Bellur <vbellur@xxxxxxxxxx> wrote:
Would it be possible for you to turn on client profiling
and then run du? Instructions for turning on client
profiling can be found at [1]. Providing the client
profile information can help us figure out where the
latency could be stemming from.
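A minimal sketch of the client-side profiling steps that [1] describes, as I understand them (the volume name and output path are placeholders; /data is the mount point mentioned later in the thread):

# let the io-stats translator record latency and FOP counts
gluster volume set <VOLNAME> diagnostics.latency-measurement on
gluster volume set <VOLNAME> diagnostics.count-fop-hits on

# run the workload on the client, e.g.
time du -sh /data/aa/bb/cc

# dump the client-side io-stats counters to a file via a virtual xattr on the mount point
setfattr -n trusted.io-stats-dump -v /tmp/client-profile.txt /data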
Regards,
Vijay
[1] https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Performance%20Testing/#client-side-profiling
On Fri, Jun 9, 2017 at 7:22 PM, mohammad kashif <kashif.alig@xxxxxxxxx> wrote:
Hi Vijay
Thanks for your quick response. I am using gluster 3.8.11 on CentOS 7 servers
(glusterfs-3.8.11-1.el7.x86_64).
The clients are CentOS 6, but I tested with a CentOS 7 client as well and the
results didn't change.
gluster volume info

Volume Name: atlasglust
Type: Distribute
Volume ID: fbf0ebb8-deab-4388-9d8a-f722618a624b
Status: Started
Snapshot Count: 0
Number of Bricks: 5
Transport-type: tcp
Bricks:
Brick1: pplxgluster01.x.y.z:/glusteratlas/brick001/gv0
Brick2: pplxgluster02.x.y.z:/glusteratlas/brick002/gv0
Brick3: pplxgluster03.x.y.z:/glusteratlas/brick003/gv0
Brick4: pplxgluster04.x.y.z:/glusteratlas/brick004/gv0
Brick5: pplxgluster05.x.y.z:/glusteratlas/brick005/gv0
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
auth.allow: x.y.z
I am not using directory quota.
Please let me know if you require some more info
Thanks
Kashif
On Fri, Jun 9, 2017 at 2:34 PM, Vijay Bellur <vbellur@xxxxxxxxxx> wrote:
Can you please provide more details about your
volume configuration and the version of gluster
that you are using?
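A minimal sketch of gathering those details on one of the servers (plain CLI calls, nothing beyond what the thread itself uses):

# installed gluster version
gluster --version

# volume layout and any reconfigured options
gluster volume info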
Regards,
Vijay
On Fri, Jun 9, 2017 at 5:35 PM, mohammad kashif <kashif.alig@xxxxxxxxx> wrote:
Hi
I have just moved our 400 TB HPC storage from Lustre to Gluster. It is part of
a research institute, and users have files ranging from very small to big
(a few KB to 20 GB). Our setup consists of 5 servers, each with 96 TB of
RAID 6 disks. All servers are connected through 10G Ethernet, but not all
clients are. The Gluster volumes are distributed without any replication.
There are approximately 80 million files in the file system.
I am mounting using the glusterfs client on the clients.
I have copied everything from Lustre to Gluster, but the old file system still
exists so I can compare.
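For completeness, a distribute-only volume and mount like the one described here would look roughly as follows; this is only a sketch, with the server and brick names taken from the volume info quoted above and an illustrative mount path:

# 5-brick pure distribute volume (no replication)
gluster volume create atlasglust transport tcp \
    pplxgluster01.x.y.z:/glusteratlas/brick001/gv0 \
    pplxgluster02.x.y.z:/glusteratlas/brick002/gv0 \
    pplxgluster03.x.y.z:/glusteratlas/brick003/gv0 \
    pplxgluster04.x.y.z:/glusteratlas/brick004/gv0 \
    pplxgluster05.x.y.z:/glusteratlas/brick005/gv0
gluster volume start atlasglust

# native glusterfs (FUSE) mount on a client
mount -t glusterfs pplxgluster01.x.y.z:/atlasglust /data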
The problem I am facing is extremely slow du even on a small directory. Also,
the time taken is substantially different each time. I tried du from the same
client on a particular directory twice and got these results:
time du -sh /data/aa/bb/cc
3.7G /data/aa/bb/cc
real 7m29.243s
user 0m1.448s
sys 0m7.067s
time du -sh /data/aa/bb/cc
3.7G /data/aa/bb/cc
real 16m43.735s
user 0m1.097s
sys 0m5.802s
16m and 7m are too long for a 3.7 GB directory. I should mention that the
directory contains a huge number of files (208736).
Running du on the same directory on the old file system gives this result:
time du -sh /olddata/aa/bb/cc
4.0G /olddata/aa/bb/cc
real 3m1.255s
user 0m0.755s
sys 0m38.099s
It is much better if I run the same command again:
time du -sh /olddata/aa/bb/cc
4.0G /olddata/aa/bb/cc
real 0m8.309s
user 0m0.313s
sys 0m7.755s
Is there anything I can do to improve this performance? I would also like to
hear from someone who is running the same kind of setup.
Thanks
Kashif
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users