Re: readdir() scalability (was Re: [RFC ] dictionary optimizations)

On 12/09/13 13:17, Brian Foster wrote:
On 09/12/2013 06:08 AM, Xavier Hernandez wrote:
On 09/09/13 17:25, Vijay Bellur wrote:
On 09/09/2013 02:18 PM, Xavier Hernandez wrote:
On 06/09/13 20:43, Anand Avati wrote:
On Fri, Sep 6, 2013 at 1:46 AM, Xavier Hernandez
<xhernandez@xxxxxxxxxx> wrote:

     On 04/09/13 18:10, Anand Avati wrote:
     On Wed, Sep 4, 2013 at 6:37 AM, Xavier Hernandez
     <xhernandez@xxxxxxxxxx> wrote:

         On 04/09/13 14:05, Jeff Darcy wrote:

             On 09/04/2013 04:27 AM, Xavier Hernandez wrote:

...
Have you tried turning on "cluster.readdir-optimize"? This could help
improve readdir performance for the directory hierarchy that you
describe.

I repeated the tests with this option enabled and it really improved
readdir performance, however it still shows a linear speed loss as the
number of bricks increases. Will the readdir-ahead translator be able to
hide this linear effect when the number of bricks is very high ?

I don't know that it will change the overall effect, but perhaps it
could smooth things out (or if not, we can see about further
improvements). Could you try it out and let us know? :)
I've repeated the tests using the master branch (commit 643533c7), combining cluster.readdir-optimize and performance.readdir-ahead. These are the results:

Configurations

  Test1: cluster.readdir-optimize=off and performance.readdir-ahead=off
  Test2: cluster.readdir-optimize=on  and performance.readdir-ahead=off
  Test3: cluster.readdir-optimize=off and performance.readdir-ahead=on
  Test4: cluster.readdir-optimize=on  and performance.readdir-ahead=on
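
For anyone wanting to reproduce the four configurations, here is a minimal sketch that enumerates them and prints the matching 'gluster volume set' commands. The volume name "testvol" is a placeholder of mine, not the name used in these tests, and the script only prints the commands, it does not run them.

```python
from itertools import product

VOLUME = "testvol"  # hypothetical volume name, not taken from the tests above
OPTIONS = ("cluster.readdir-optimize", "performance.readdir-ahead")


def build_matrix():
    """Return {test name: {option: 'on'|'off'}} in the order Test1..Test4."""
    matrix = {}
    # readdir-ahead varies slowest and readdir-optimize fastest,
    # which reproduces the Test1..Test4 ordering listed above.
    combos = product(("off", "on"), repeat=2)
    for i, (ahead, optimize) in enumerate(combos, start=1):
        matrix[f"Test{i}"] = {OPTIONS[0]: optimize, OPTIONS[1]: ahead}
    return matrix


def commands(settings, volume=VOLUME):
    """Build one 'gluster volume set' command line per option."""
    return [f"gluster volume set {volume} {opt} {val}"
            for opt, val in settings.items()]


if __name__ == "__main__":
    for name, settings in build_matrix().items():
        print(name)
        for cmd in commands(settings):
            print("  " + cmd)
```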

ls: average time, in seconds, of 3 runs of 'ls -lR <mount root> | wc -l'
    (a previous ls run is made to fill the caches)
rb: rebalance time in seconds (not averaged, only done once)

Bricks    Test1       Test2       Test3       Test4
         ls   rb     ls   rb     ls   rb     ls   rb
   1    10.7  --    10.6  --     9.8  --     9.8  --
   2    18.7  82    14.1  84    17.1  83    13.5  82
   3    24.6  83    16.8  84    23.1  84    16.4  85
   4    30.2  87    19.7  86    29.0  88    19.2  87
   5    36.0  92    22.5  90    34.8  91    21.7  91
   6    42.2  97    25.1  96    40.9  95    24.1  96
  12    80.4 161    42.1 160    81.3 162    41.5 162

It seems that the benefit of readdir-ahead is minimal when only the directory structure is considered.
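
One way to quantify the linear effect is an ordinary least-squares fit of the ls time against the number of bricks. The sketch below uses only the ls columns of the table above; treating the growth as a straight line is an assumption made for illustration, not something the measurements prove. With these numbers the fitted slope comes out at a bit over 6 seconds per brick with cluster.readdir-optimize off and a bit under 3 seconds with it on, and readdir-ahead barely changes either slope.

```python
BRICKS = [1, 2, 3, 4, 5, 6, 12]

# ls times in seconds, copied from the table above
LS_SECONDS = {
    "Test1": [10.7, 18.7, 24.6, 30.2, 36.0, 42.2, 80.4],
    "Test2": [10.6, 14.1, 16.8, 19.7, 22.5, 25.1, 42.1],
    "Test3": [9.8, 17.1, 23.1, 29.0, 34.8, 40.9, 81.3],
    "Test4": [9.8, 13.5, 16.4, 19.2, 21.7, 24.1, 41.5],
}


def linear_fit(xs, ys):
    """Least-squares fit y = slope * x + intercept; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    for name, ys in LS_SECONDS.items():
        slope, intercept = linear_fit(BRICKS, ys)
        print(f"{name}: {slope:.2f} s per brick, intercept {intercept:.2f} s")
```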

Xavi

Brian

Results of the tests with cluster.readdir-optimize active:

1 brick: 11.8 seconds
2 bricks: 15.4 seconds
3 bricks: 17.9 seconds
4 bricks: 20.6 seconds
5 bricks: 22.9 seconds
6 bricks: 25.4 seconds
12 bricks: 41.8 seconds

Rebalance also improved:

 From 1 to 2 bricks: 77 seconds
 From 2 to 3 bricks: 78 seconds
 From 3 to 4 bricks: 81 seconds
 From 4 to 5 bricks: 84 seconds
 From 5 to 6 bricks: 87 seconds
 From 6 to 12 bricks: 144 seconds
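
To put the improvement in numbers, these results can be compared with the baseline times from the earlier message quoted further down (the run without cluster.readdir-optimize). The following is a small sketch that computes the percentage reduction for each measurement; the dictionary names and the "1->2" style keys are my own labels.

```python
# Seconds for 'ls -lR', keyed by number of bricks
LS_BASELINE = {1: 11.8, 2: 19.0, 3: 23.8, 4: 29.8, 5: 34.6, 6: 41.0, 12: 78.5}
LS_OPTIMIZED = {1: 11.8, 2: 15.4, 3: 17.9, 4: 20.6, 5: 22.9, 6: 25.4, 12: 41.8}

# Seconds for each rebalance step, keyed by "from->to" brick counts
REBALANCE_BASELINE = {"1->2": 91, "2->3": 102, "3->4": 119,
                      "4->5": 138, "5->6": 151, "6->12": 259}
REBALANCE_OPTIMIZED = {"1->2": 77, "2->3": 78, "3->4": 81,
                       "4->5": 84, "5->6": 87, "6->12": 144}


def reduction_percent(before, after):
    """Percentage by which each 'after' time is lower than its 'before' time."""
    return {key: 100.0 * (before[key] - after[key]) / before[key]
            for key in before}


if __name__ == "__main__":
    for key, pct in reduction_percent(LS_BASELINE, LS_OPTIMIZED).items():
        print(f"ls, {key} brick(s): {pct:.1f}% faster")
    for key, pct in reduction_percent(REBALANCE_BASELINE, REBALANCE_OPTIMIZED).items():
        print(f"rebalance {key}: {pct:.1f}% faster")
```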

Xavi

-Vijay


After each test, I added a new brick and started a rebalance. Once the
rebalance was completed, I unmounted the volume, stopped it and started
it again.

The test consisted of 4 runs of 'time ls -lR /<testdir> | wc -l'. The
first result was discarded. The result shown below is the mean of the
other 3 results.
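
The measurement procedure just described can be sketched as a small script. This is an illustration only, not the script used for these tests: the function names are hypothetical, and it measures wall-clock time from Python rather than using the shell's 'time', so absolute values may differ slightly.

```python
import shlex
import statistics
import subprocess
import sys
import time


def time_command(cmd):
    """Run a shell pipeline and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, shell=True, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def warm_mean(durations):
    """Discard the first (cache-filling) run and average the remaining ones."""
    if len(durations) < 2:
        raise ValueError("need at least one run after the warm-up run")
    return statistics.mean(durations[1:])


def benchmark(testdir, runs=4, timer=time_command):
    """Time 'ls -lR <testdir> | wc -l' `runs` times; return the warm mean."""
    cmd = f"ls -lR {shlex.quote(testdir)} | wc -l"
    return warm_mean([timer(cmd) for _ in range(runs)])


if __name__ == "__main__":
    # Usage: python bench.py /path/to/mounted/testdir
    print(f"{benchmark(sys.argv[1]):.1f} seconds")
```

The timer is passed in as a parameter so the averaging logic can be checked without touching a real mount.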

1 brick: 11.8 seconds
2 bricks: 19.0 seconds
3 bricks: 23.8 seconds
4 bricks: 29.8 seconds
5 bricks: 34.6 seconds
6 bricks: 41.0 seconds
12 bricks (2 bricks on each server): 78.5 seconds

The rebalancing time also grew considerably (these times are the result
of a single rebalance. They might not be very accurate):

  From 1 to 2 bricks: 91 seconds
  From 2 to 3 bricks: 102 seconds
  From 3 to 4 bricks: 119 seconds
  From 4 to 5 bricks: 138 seconds
  From 5 to 6 bricks: 151 seconds
  From 6 to 12 bricks: 259 seconds

The number of disk IOPS never exceeded 40 on any server, the network
bandwidth never went beyond 6 Mbits/s between any pair of servers, and
no server reached 100% usage of a core.

Xavi

Avati



_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxx
https://lists.nongnu.org/mailman/listinfo/gluster-devel

