On 02/12/2015 04:52 PM, Pranith Kumar Karampuri wrote:
On 02/12/2015 03:05 PM, Pranith Kumar Karampuri wrote:
On 02/12/2015 09:14 AM, Justin Clift wrote:
On 12 Feb 2015, at 03:02, Shyam <srangana@xxxxxxxxxx> wrote:
On 02/11/2015 08:28 AM, David F. Robinson wrote:
My base filesystem holds 40 TB and the tar extraction takes 19
minutes. I copied over 10 TB, and the tar extraction time went from
1 minute to 7 minutes.
My suspicion is that it is related to the number of files and not
necessarily to file size. Shyam is looking into reproducing this
behavior on a Red Hat system.
I am able to reproduce the issue on a similar setup internally (at
least on the surface, it looks similar to what David is facing).
I will continue investigating the root cause.
Here is my initial analysis. (Thanks for providing me with the setup,
Shyam. Please keep it, as we may need it for further analysis.)
On bad volume:
%-latency Avg-latency Min-Latency Max-Latency No. of calls Fop
--------- ----------- ----------- ----------- ------------ ----
0.00 0.00 us 0.00 us 0.00 us 937104 FORGET
0.00 0.00 us 0.00 us 0.00 us 872478 RELEASE
0.00 0.00 us 0.00 us 0.00 us 23668 RELEASEDIR
0.00 41.86 us 23.00 us 86.00 us 92 STAT
0.01 39.40 us 24.00 us 104.00 us 218 STATFS
0.28 55.99 us 43.00 us 1152.00 us 4065 SETXATTR
0.58 56.89 us 25.00 us 4505.00 us 8236 OPENDIR
0.73 26.80 us 11.00 us 257.00 us 22238 FLUSH
0.77 152.83 us 92.00 us 8819.00 us 4065 RMDIR
2.57 62.00 us 21.00 us 409.00 us 33643 WRITE
5.46 199.16 us 108.00 us 469938.00 us 22238 UNLINK
6.70 69.83 us 43.00 us 7777.00 us 77809 LOOKUP
6.97 447.60 us 21.00 us 54875.00 us 12631 READDIRP
7.73 79.42 us 33.00 us 1535.00 us 78909 SETATTR
14.11 2815.00 us 176.00 us 2106305.00 us 4065 MKDIR
54.09 1972.62 us 138.00 us 1520773.00 us 22238 CREATE
On good volume:
%-latency Avg-latency Min-Latency Max-Latency No. of calls Fop
--------- ----------- ----------- ----------- ------------ ----
0.00 0.00 us 0.00 us 0.00 us 58870 FORGET
0.00 0.00 us 0.00 us 0.00 us 66016 RELEASE
0.00 0.00 us 0.00 us 0.00 us 16480 RELEASEDIR
0.00 61.50 us 58.00 us 65.00 us 2 OPEN
0.01 39.56 us 16.00 us 112.00 us 71 STAT
0.02 41.29 us 27.00 us 79.00 us 163 STATFS
0.03 36.06 us 17.00 us 98.00 us 301 FSTAT
0.79 62.38 us 39.00 us 269.00 us 4065 SETXATTR
1.14 242.99 us 25.00 us 28636.00 us 1497 READ
1.54 59.76 us 25.00 us 6325.00 us 8236 OPENDIR
1.70 133.75 us 89.00 us 374.00 us 4065 RMDIR
2.25 32.65 us 15.00 us 265.00 us 22006 FLUSH
3.37 265.05 us 172.00 us 2349.00 us 4065 MKDIR
7.14 68.34 us 21.00 us 21902.00 us 33357 WRITE
11.00 159.68 us 107.00 us 2567.00 us 22003 UNLINK
13.82 200.54 us 133.00 us 21762.00 us 22003 CREATE
17.85 448.85 us 22.00 us 54046.00 us 12697 READDIRP
18.37 76.12 us 45.00 us 294.00 us 77044 LOOKUP
20.95 85.54 us 35.00 us 1404.00 us 78204 SETATTR
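As we can see here, the brick from the full volume handles far more
FORGET and RELEASE calls than the brick from the empty volume: 937104
vs. 58870 FORGETs (about 16x) and 872478 vs. 66016 RELEASEs (about
13x). Average CREATE and MKDIR latency on that brick is also roughly
10x higher (1972.62 us vs. 200.54 us and 2815.00 us vs. 265.05 us).
This suggests that the inode table on the volume with lots of data is
carrying too many passive inodes, which have to be displaced to make
room for new ones. I need to check whether these forgets happen in the
fop path. I will continue investigating and let you know.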
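
The comparison above can be reproduced mechanically. What follows is a
minimal sketch, assuming the profile rows have exactly the layout
pasted in this mail; the helper names (parse_profile, compare) and the
regular expression are illustrative and not part of any gluster tool.
Only four rows from each table are included to keep it short.

```python
import re

# One profile row as pasted above: %-latency, avg, min, max (each value
# followed by "us"), number of calls, fop name.
ROW = re.compile(
    r"^\s*(?P<pct>[\d.]+)\s+(?P<avg>[\d.]+) us\s+(?P<min>[\d.]+) us\s+"
    r"(?P<max>[\d.]+) us\s+(?P<calls>\d+)\s+(?P<fop>[A-Z]+)\s*$"
)

def parse_profile(text):
    """Return {fop: (avg_latency_us, calls)} for every line that matches ROW."""
    stats = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            stats[m["fop"]] = (float(m["avg"]), int(m["calls"]))
    return stats

def compare(bad, good):
    """Yield (fop, calls_ratio, avg_latency_ratio) for fops present in both.

    The latency ratio is None when the good-volume average is 0 (FORGET,
    RELEASE), because no meaningful ratio exists there.
    """
    for fop in sorted(bad.keys() & good.keys()):
        bad_avg, bad_calls = bad[fop]
        good_avg, good_calls = good[fop]
        latency_ratio = bad_avg / good_avg if good_avg else None
        yield fop, bad_calls / good_calls, latency_ratio

# Rows copied from the two tables in this mail.
BAD_SAMPLE = """\
0.00 0.00 us 0.00 us 0.00 us 937104 FORGET
0.00 0.00 us 0.00 us 0.00 us 872478 RELEASE
14.11 2815.00 us 176.00 us 2106305.00 us 4065 MKDIR
54.09 1972.62 us 138.00 us 1520773.00 us 22238 CREATE
"""
GOOD_SAMPLE = """\
0.00 0.00 us 0.00 us 0.00 us 58870 FORGET
0.00 0.00 us 0.00 us 0.00 us 66016 RELEASE
3.37 265.05 us 172.00 us 2349.00 us 4065 MKDIR
13.82 200.54 us 133.00 us 21762.00 us 22003 CREATE
"""

bad = parse_profile(BAD_SAMPLE)
good = parse_profile(GOOD_SAMPLE)
for fop, calls_ratio, latency_ratio in compare(bad, good):
    latency = "n/a" if latency_ratio is None else f"{latency_ratio:.1f}x"
    print(f"{fop:8s} calls {calls_ratio:5.1f}x  avg latency {latency}")
```

Feeding it the complete tables instead of the samples should work the
same way, since rows that do not match (headers, separators) are
simply skipped.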
Just to increase confidence, I performed one more test: I stopped both
volumes and restarted them. Now the numbers on both volumes are almost
the same:
[root@gqac031 gluster-mount]# time rm -rf boost_1_57_0 ; time tar xf boost_1_57_0.tar.gz
real 1m15.074s
user 0m0.550s
sys 0m4.656s
real 2m46.866s
user 0m5.347s
sys 0m16.047s
[root@gqac031 gluster-mount]# cd /gluster-emptyvol/
[root@gqac031 gluster-emptyvol]# ls
boost_1_57_0.tar.gz
[root@gqac031 gluster-emptyvol]# time tar xf boost_1_57_0.tar.gz
real 2m31.467s
user 0m5.475s
sys 0m15.471s
gqas015.sbu.lab.eng.bos.redhat.com:testvol on /gluster-mount type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
gqas015.sbu.lab.eng.bos.redhat.com:emotyvol on /gluster-emptyvol type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
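
To put a number on "almost the same", the two "real" times for the
tar extraction can be converted to seconds and compared. This is only
a small sketch; to_seconds is a hypothetical helper, and it assumes
the "NmS.SSSs" format that bash's time prints in the transcript above.

```python
import re

# Matches one line of bash `time` output, e.g. "real 2m46.866s".
TIME_LINE = re.compile(r"^(real|user|sys)\s+(\d+)m([\d.]+)s$")

def to_seconds(line):
    """Convert one line of `time` output to (label, seconds)."""
    m = TIME_LINE.match(line.strip())
    if m is None:
        raise ValueError(f"not a time line: {line!r}")
    label, minutes, seconds = m.groups()
    return label, int(minutes) * 60 + float(seconds)

# tar extraction after the restart, taken from the transcript above.
_, full_vol = to_seconds("real 2m46.866s")    # /gluster-mount
_, empty_vol = to_seconds("real 2m31.467s")   # /gluster-emptyvol

print(f"full volume:  {full_vol:.3f} s")
print(f"empty volume: {empty_vol:.3f} s")
print(f"ratio:        {full_vol / empty_vol:.2f}")
```

After the restart the full volume is about 15 seconds (roughly 10%)
slower on this one run, compared with the severalfold gap reported
earlier in the thread.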
I just checked: inode_link links the inode and then calls
inode_table_prune, which triggers these inode_forgets synchronously
in the fop path.
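
To illustrate why that matters, here is a toy model of the behaviour
described above. It is not GlusterFS code: ToyInodeTable, its lru_limit
parameter, and run() are made up for this sketch, and it assumes that
every linked inode goes straight onto a single LRU list, which ignores
the distinction between active and passive inodes.

```python
from collections import OrderedDict

class ToyInodeTable:
    """Toy LRU table: linking an inode prunes synchronously, like the
    inode_link -> inode_table_prune sequence described above."""

    def __init__(self, lru_limit):
        self.lru_limit = lru_limit
        self.lru = OrderedDict()   # oldest entry first
        self.forgets = 0           # forgets issued from inside link()

    def link(self, gfid):
        self.lru[gfid] = True
        self.lru.move_to_end(gfid)
        self._prune()              # runs before link() returns

    def _prune(self):
        # Evict oldest entries until the table is back under its limit.
        while len(self.lru) > self.lru_limit:
            self.lru.popitem(last=False)
            self.forgets += 1

def run(preloaded, new_files, lru_limit):
    """Return the number of forgets caused by creating `new_files` inodes
    in a table that already holds `preloaded` older inodes."""
    table = ToyInodeTable(lru_limit)
    for i in range(preloaded):
        table.link(("old", i))
    table.forgets = 0              # count only the new workload
    for i in range(new_files):
        table.link(("new", i))
    return table.forgets

# A table already at its limit versus an empty one, same workload.
print("full table: ", run(preloaded=1000, new_files=200, lru_limit=1000))
print("empty table:", run(preloaded=0, new_files=200, lru_limit=1000))
```

In this model a table that is already at its limit pays one forget for
every new inode, while an empty table pays none until it fills up.
That would fit the restart test, where both volumes started with empty
tables and behaved alike, but it is only a model and not a measurement.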
Pranith
Thanks Shyam. :)
+ Justin
--
GlusterFS - http://www.gluster.org
An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.
My personal twitter: twitter.com/realjustinclift
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel