Re: Glusterfs performance tweaks

Ben Turner <bturner@xxxxxxxxxx> · Thu, 9 Apr 2015 14:12:08 -0400 (EDT)

----- Original Message -----
> From: "Punit Dambiwal" <hypunit@xxxxxxxxx>
> To: "Vijay Bellur" <vbellur@xxxxxxxxxx>
> Cc: gluster-users@xxxxxxxxxxx
> Sent: Wednesday, April 8, 2015 9:55:38 PM
> Subject: Re:  Glusterfs performance tweaks
> 
> Hi Vijay,
> 
> If I run the same command directly on the brick...
> 
> [root@cpu01 1]# dd if=/dev/zero of=test bs=64k count=4k oflag=dsync
> 4096+0 records in
> 4096+0 records out
> 268435456 bytes (268 MB) copied, 16.8022 s, 16.0 MB/s
> [root@cpu01 1]# pwd
> /bricks/1
> [root@cpu01 1]#
> 

This is your problem.  Gluster is only as fast as its slowest piece, and here your storage is the bottleneck.  Since you get 16 MB/s to the brick and 12 MB/s through gluster, that works out to about 25% overhead, which is what I would expect in a single-thread, single-brick, single-client scenario.  This may have something to do with the way SSDs write.  On the SSD at my desk I only get 11.4 MB/s when I run that dd command:

# dd if=/dev/zero of=test bs=64k count=4k oflag=dsync
4096+0 records in
4096+0 records out
268435456 bytes (268 MB) copied, 23.065 s, 11.4 MB/s
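
To put those dd numbers another way, here is a rough sketch of the arithmetic.  The helper names latency_ms and overhead_pct are made up for illustration, and the 16 and 12 MB/s figures are the rounded ones from above.  With dsync every 64k write has to reach stable storage before the next one starts, so elapsed time divided by the 4096 writes gives an approximate per-write latency:

```shell
#!/bin/sh
# Illustrative helpers only; the input figures are copied from the dd
# outputs quoted in this thread.

# Approximate latency of one synchronous write, in milliseconds.
latency_ms() {
    # $1 = elapsed seconds reported by dd, $2 = number of writes (dd count)
    awk -v s="$1" -v n="$2" 'BEGIN { printf "%.2f\n", s / n * 1000 }'
}

# Throughput lost between a baseline and an observed figure, in percent.
overhead_pct() {
    # $1 = baseline MB/s (brick), $2 = observed MB/s (gluster mount)
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", (a - b) / a * 100 }'
}

echo "brick:    $(latency_ms 16.8022 4096) ms per 64k write"
echo "desktop:  $(latency_ms 23.065 4096) ms per 64k write"
echo "overhead: $(overhead_pct 16 12)% (brick vs gluster)"
```

That comes out to roughly 4 to 6 ms per synchronous 64k write, so these runs are limited by per-write latency rather than by bandwidth.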

My thought is that dsync may be forcing the SSD to clean (erase) blocks, or do something else, before it writes to them:

http://www.blog.solidstatediskshop.com/2012/how-does-an-ssd-write/

Do your drives support TRIM?  It may be worth running fstrim before the test to see what results you get.  Other than tuning the SSD / OS to perform better on the back end, there isn't much we can do from the gluster side for that specific dd with the dsync flag.
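
If you want to see how much of this is the per-write flush, something like the following might help.  It is only a sketch: run_dd is a hypothetical helper, the directory and count are whatever you pass in, and it assumes GNU dd.  The idea is to run the same write once with oflag=dsync and once with conv=fdatasync, which flushes only once at the end:

```shell
#!/bin/sh
# Sketch only: run_dd is a made-up helper name, not part of any tool.
# Usage: run_dd DIR COUNT LABEL [extra dd options...]
run_dd() {
    dir="$1"; count="$2"; label="$3"
    shift 3
    file="$dir/ddtest.$$"
    # dd prints its statistics on stderr; keep only the summary line.
    result=$(dd if=/dev/zero of="$file" bs=64k count="$count" "$@" 2>&1 | tail -n 1)
    rm -f "$file"
    echo "$label: $result"
}

# Example invocations (commented out because they write 256 MB each):
#   lsblk --discard          # non-zero DISC-GRAN/DISC-MAX suggests TRIM support
#   fstrim -v /bricks/1      # needs root; discards unused blocks on the brick
#   run_dd /bricks/1 4096 "flush every write" oflag=dsync
#   run_dd /bricks/1 4096 "flush once at end" conv=fdatasync
```

If the second run is much faster than the first, the cost is in the per-write flush on the device, and that is outside what gluster can tune.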

-b

> 
> On Wed, Apr 8, 2015 at 6:44 PM, Vijay Bellur < vbellur@xxxxxxxxxx > wrote:
> 
> 
> 
> On 04/08/2015 02:57 PM, Punit Dambiwal wrote:
> 
> 
> 
> Hi,
> 
> I am getting very slow throughput in the glusterfs (dead slow...even
> SATA is better) ... i am using all SSD in my environment.....
> 
> I have the following setup :-
> A. 4* host machine with Centos 7(Glusterfs 3.6.2 | Distributed
> Replicated | replica=2)
> B. Each server has 24 SSD as bricks…(Without HW Raid | JBOD)
> C. Each server has 2 Additional ssd for OS…
> D. Network 2*10G with bonding…(2*E5 CPU and 64GB RAM)
> 
> Note :- Performance/Throughput slower than Normal SATA 7200 RPM…even though i
> am using all SSD in my ENV..
> 
> Gluster Volume options :-
> 
> +++++++++++++++
> Options Reconfigured:
> performance.nfs.write-behind-window-size: 1024MB
> performance.io-thread-count: 32
> performance.cache-size: 1024MB
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> diagnostics.count-fop-hits: on
> diagnostics.latency-measurement: on
> nfs.disable: on
> user.cifs: enable
> auth.allow: *
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: enable
> storage.owner-uid: 36
> storage.owner-gid: 36
> server.allow-insecure: on
> network.ping-timeout: 0
> diagnostics.brick-log-level: INFO
> +++++++++++++++++++
> 
> Test with SATA and Glusterfs SSD….
> ———————
> Dell EQL (SATA disk 7200 RPM)
> —-
> [root@mirror ~]#
> 4096+0 records in
> 4096+0 records out
> 268435456 bytes (268 MB) copied, 20.7763 s, 12.9 MB/s
> [root@mirror ~]# dd if=/dev/zero of=test bs=64k count=4k oflag=dsync
> 4096+0 records in
> 4096+0 records out
> 268435456 bytes (268 MB) copied, 23.5947 s, 11.4 MB/s
> 
> GlusterFS SSD
> —
> [root@sv-VPN1 ~]# dd if=/dev/zero of=test bs=64k count=4k oflag=dsync
> 4096+0 records in
> 4096+0 records out
> 268435456 bytes (268 MB) copied, 66.2572 s, 4.1 MB/s
> [root@sv-VPN1 ~]# dd if=/dev/zero of=test bs=64k count=4k oflag=dsync
> 4096+0 records in
> 4096+0 records out
> 268435456 bytes (268 MB) copied, 62.6922 s, 4.3 MB/s
> ————————
> 
> Please let me know what i should do to improve the performance of my
> glusterfs…
> 
> 
> What is the throughput that you get when you run these commands on the disks
> directly without gluster in the picture?
> 
> By running dd with dsync you are ensuring that there is no buffering anywhere
> in the stack and that is the reason why low throughput is being observed.
> 
> -Vijay
> 
> 
> 
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@xxxxxxxxxxx
> http://www.gluster.org/mailman/listinfo/gluster-users