Re: Extremely slow du

mohammad kashif <kashif.alig@xxxxxxxxx> · Mon, 19 Jun 2017 18:40:16 +0100

Hi Vijay

Thanks. Is it straightforward to upgrade from 3.8 to 3.11 on a production system? I have around 300 TB of data and approximately 60-80 million files. Is there any other optimisation I can try at the same time?

Thanks

Kashif 

On Sun, Jun 18, 2017 at 4:57 PM, Vijay Bellur <vbellur@xxxxxxxxxx> wrote:
Hi Mohammad,
As expected, a lot of time is being spent on metadata calls. Could you consider testing with 3.11, which includes the md-cache [1] and readdirp [2] improvements?

Adding Poornima and Raghavendra who worked on these enhancements to help out further.

Thanks,
Vijay

[1] https://gluster.readthedocs.io/en/latest/release-notes/3.9.0/

[2] https://github.com/gluster/glusterfs/issues/166
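As a rough sketch of what enabling the md-cache improvements from [1] could look like after an upgrade, the helper below wraps the volume options listed in the 3.9.0 release notes. The function name enable_md_cache is hypothetical, the timeout and inode-lru-limit values are illustrative assumptions rather than tuned recommendations, and performance.parallel-readdir (available from 3.10) is an assumption about which readdirp-related option matters for a distribute volume.

```shell
# Hypothetical helper: apply md-cache related options to one volume.
# $1 = volume name. The values below are illustrative, not tuned.
enable_md_cache() {
    vol="$1"
    # Let bricks notify clients when cached metadata becomes stale.
    gluster volume set "$vol" features.cache-invalidation on
    gluster volume set "$vol" features.cache-invalidation-timeout 600
    # Cache stat and xattr results on the client and honour invalidations.
    gluster volume set "$vol" performance.stat-prefetch on
    gluster volume set "$vol" performance.cache-invalidation on
    gluster volume set "$vol" performance.md-cache-timeout 600
    # Keep more inodes in memory on the bricks (assumed value).
    gluster volume set "$vol" network.inode-lru-limit 50000
    # Assumption: fetch directory entries from all bricks in parallel (3.10+).
    gluster volume set "$vol" performance.parallel-readdir on
}
```

It would be called as "enable_md_cache atlasglust" from one of the servers, presumably only once all servers and clients are running the newer version.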

On Fri, Jun 16, 2017 at 2:49 PM, mohammad kashif <kashif.alig@xxxxxxxxx> wrote:
Hi Vijay

Did you manage to look into the gluster profile logs?

Thanks

Kashif 

On Mon, Jun 12, 2017 at 11:40 AM, mohammad kashif <kashif.alig@xxxxxxxxx> wrote:
Hi Vijay

I have enabled client profiling and used this script https://github.com/bengland2/gluster-profile-analysis/blob/master/gvp-client.sh to extract the data. I am attaching the output files. I don't have any reference data to compare my output with. Hopefully you can make some sense of it.

On Sat, Jun 10, 2017 at 10:47 AM, Vijay Bellur <vbellur@xxxxxxxxxx> wrote:
Would it be possible for you to turn on client profiling and then run du? Instructions for turning on client profiling can be found at [1]. Providing the client profile information can help us figure out where the latency could be stemming from.
Regards,
Vijay

[1] https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Performance%20Testing/#client-side-profiling
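The client-side profiling procedure referenced in [1] can be sketched roughly as follows. The function names, the dump file names, and the /data mount point are hypothetical, and the expectation that the dump lands under /var/run/gluster/ on the client is an assumption that may differ between versions.

```shell
# Turn on profiling for the volume (run on a server node).
# $1 = volume name
start_profile() {
    gluster volume profile "$1" start
}

# Ask the FUSE client to dump its io-stats counters (run on the client as root).
# $1 = mount point, $2 = name of the dump file
dump_client_stats() {
    setfattr -n trusted.io-stats-dump -v "$2" "$1"
}

# Example sequence (assumed mount point /data):
#   start_profile atlasglust
#   dump_client_stats /data io-stats-pre.txt
#   time du -sh /data/aa/bb/cc
#   dump_client_stats /data io-stats-post.txt
```

Comparing the dump taken before the du run with the one taken after it should show which file operations account for most of the latency.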

On Fri, Jun 9, 2017 at 7:22 PM, mohammad kashif <kashif.alig@xxxxxxxxx> wrote:
Hi Vijay

Thanks for your quick response. I am using gluster 3.8.11 on CentOS 7 servers:
glusterfs-3.8.11-1.el7.x86_64

The clients are CentOS 6, but I also tested with a CentOS 7 client and the results didn't change.

gluster volume info

Volume Name: atlasglust
Type: Distribute
Volume ID: fbf0ebb8-deab-4388-9d8a-f722618a624b
Status: Started
Snapshot Count: 0
Number of Bricks: 5
Transport-type: tcp
Bricks:
Brick1: pplxgluster01.x.y.z:/glusteratlas/brick001/gv0
Brick2: pplxgluster02.x.y.z:/glusteratlas/brick002/gv0
Brick3: pplxgluster03.x.y.z:/glusteratlas/brick003/gv0
Brick4: pplxgluster04.x.y.z:/glusteratlas/brick004/gv0
Brick5: pplxgluster05.x.y.z:/glusteratlas/brick005/gv0
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
auth.allow: x.y.z

I am not using directory quota.

Please let me know if you need any more information.

Thanks

Kashif

On Fri, Jun 9, 2017 at 2:34 PM, Vijay Bellur <vbellur@xxxxxxxxxx> wrote:
Can you please provide more details about your volume configuration and the version of gluster that you are using?
Regards,
Vijay

On Fri, Jun 9, 2017 at 5:35 PM, mohammad kashif <kashif.alig@xxxxxxxxx> wrote:
Hi

I have just moved our 400 TB HPC storage from Lustre to Gluster. It is part of a research institute, and users have files ranging from very small to large (a few KB to 20 GB). Our setup consists of 5 servers, each with 96 TB of RAID 6 disks. All servers are connected through 10G Ethernet, but not all clients are. The Gluster volume is distributed without any replication. There are approximately 80 million files in the file system.
I am mounting using glusterfs on the clients.

I have copied everything from Lustre to Gluster, but the old file system still exists, so I can compare.

The problem I am facing is an extremely slow du, even on a small directory. The time taken also differs substantially from run to run.
I ran du twice from the same client on a particular directory and got these results:

time du -sh /data/aa/bb/cc
3.7G    /data/aa/bb/cc
real    7m29.243s
user    0m1.448s
sys     0m7.067s

time du -sh /data/aa/bb/cc
3.7G    /data/aa/bb/cc
real    16m43.735s
user    0m1.097s
sys     0m5.802s

Both 7m and 16m are too long for a 3.7 GB directory, although I should mention that the directory contains a huge number of files (208736).

Running du on the same directory on the old file system gives this result:

time du -sh /olddata/aa/bb/cc
4.0G    /olddata/aa/bb/cc
real    3m1.255s
user    0m0.755s
sys     0m38.099s

It is much better if I run the same command again:

time du -sh /olddata/aa/bb/cc
4.0G    /olddata/aa/bb/cc
real    0m8.309s
user    0m0.313s
sys     0m7.755s
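To put the timings above on a common scale, the wall-clock times can be converted into an average latency per file. This is only a back-of-the-envelope sketch: the helper names to_seconds and ms_per_file are hypothetical, and it assumes the old copy of the directory holds the same 208736 files.

```shell
# Convert a `time`-style duration such as 7m29.243s into seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Average wall-clock milliseconds per file.
# $1 = duration from `time`, $2 = number of files in the directory
ms_per_file() {
    awk -v secs="$(to_seconds "$1")" -v files="$2" \
        'BEGIN { printf "%.2f\n", secs * 1000 / files }'
}

ms_per_file 7m29.243s 208736    # gluster, first run  -> 2.15
ms_per_file 16m43.735s 208736   # gluster, second run -> 4.81
ms_per_file 3m1.255s 208736     # old data, first run -> 0.87
ms_per_file 0m8.309s 208736     # old data, cached    -> 0.04
```

On these numbers each file costs roughly 2 to 5 ms on gluster against under 1 ms on the old file system, which would be consistent with per-file metadata round trips dominating the run time rather than data volume.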

Is there anything I can do to improve this performance? I would also like to hear from anyone who is running the same kind of setup.

Thanks

Kashif 

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users
