Re: Very slow performance on Sharded GlusterFS


 



I had already tried 512MB, but I re-ran the test just now and the results are the same. Both runs were without tuning:

 

Stripe 2 replica 2: dd performs at ~250 MB/s, but shard gives 77 MB/s.

 

I have attached two logs (shard and stripe).

 

Note: I also noticed that you said “order”. Do you mean that when we create the volume we have to put the bricks in a particular order? I thought Gluster handled that (and did the math) itself.

 

Gencer

 

From: Krutika Dhananjay [mailto:kdhananj@xxxxxxxxxx]
Sent: Friday, June 30, 2017 3:50 PM
To: gencer@xxxxxxxxxxxxx
Cc: gluster-user <gluster-users@xxxxxxxxxxx>
Subject: Re: Very slow performance on Sharded GlusterFS

 

Just noticed that the way you have configured your brick order during volume-create makes both replicas of every set reside on the same machine.
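For illustration, here is a sketch of a brick order that would place each replica set across the two machines instead. With replica 2, every consecutive pair of bricks on the command line forms one replica set, so alternating the hosts achieves this (brick paths taken from your volume info; remaining bricks elided):

# gluster volume create testvol replica 2 \
      sr-09-loc-50-14-18:/bricks/brick1 sr-10-loc-50-14-18:/bricks/brick1 \
      sr-09-loc-50-14-18:/bricks/brick2 sr-10-loc-50-14-18:/bricks/brick2 \
      ... (and so on through brick10 on each host)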

That apart, do you see any difference if you change shard-block-size to 512MB? Could you try that?
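A sketch of the corresponding command, using the option name shown in your volume info:

# gluster volume set testvol features.shard-block-size 512MB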

If it doesn't help, could you share the volume-profile output for both the tests (separate)?

Here's what you do (a consolidated sketch follows the steps below):

1. Start profile before starting your test - it could be dd or it could be file download.

# gluster volume profile <VOL> start

2. Run your test - again either dd or file-download.

3. Once the test has completed, run `gluster volume profile <VOL> info` and redirect its output to a tmp file.

4. Stop profile

# gluster volume profile <VOL> stop

And attach the volume-profile output file that you saved at a temporary location in step 3.
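Put together, a minimal sketch of the whole sequence (the mount point and output path here are only examples):

# gluster volume profile testvol start
# dd if=/dev/zero of=/mnt/testvol/ddtest bs=1M count=4096 oflag=direct    <-- or run your file download instead
# gluster volume profile testvol info > /tmp/profile-shard.txt
# gluster volume profile testvol stop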

-Krutika

 

On Fri, Jun 30, 2017 at 5:33 PM, <gencer@xxxxxxxxxxxxx> wrote:

Hi Krutika,

 

Sure, here is the volume info:

 

root@sr-09-loc-50-14-18:/# gluster volume info testvol

Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 30426017-59d5-4091-b6bc-279a905b704a
Status: Started
Snapshot Count: 0
Number of Bricks: 10 x 2 = 20
Transport-type: tcp
Bricks:
Brick1: sr-09-loc-50-14-18:/bricks/brick1
Brick2: sr-09-loc-50-14-18:/bricks/brick2
Brick3: sr-09-loc-50-14-18:/bricks/brick3
Brick4: sr-09-loc-50-14-18:/bricks/brick4
Brick5: sr-09-loc-50-14-18:/bricks/brick5
Brick6: sr-09-loc-50-14-18:/bricks/brick6
Brick7: sr-09-loc-50-14-18:/bricks/brick7
Brick8: sr-09-loc-50-14-18:/bricks/brick8
Brick9: sr-09-loc-50-14-18:/bricks/brick9
Brick10: sr-09-loc-50-14-18:/bricks/brick10
Brick11: sr-10-loc-50-14-18:/bricks/brick1
Brick12: sr-10-loc-50-14-18:/bricks/brick2
Brick13: sr-10-loc-50-14-18:/bricks/brick3
Brick14: sr-10-loc-50-14-18:/bricks/brick4
Brick15: sr-10-loc-50-14-18:/bricks/brick5
Brick16: sr-10-loc-50-14-18:/bricks/brick6
Brick17: sr-10-loc-50-14-18:/bricks/brick7
Brick18: sr-10-loc-50-14-18:/bricks/brick8
Brick19: sr-10-loc-50-14-18:/bricks/brick9
Brick20: sr-10-loc-50-14-18:/bricks/brick10
Options Reconfigured:
features.shard-block-size: 32MB
features.shard: on
transport.address-family: inet
nfs.disable: on

 

-Gencer.

 

From: Krutika Dhananjay [mailto:kdhananj@xxxxxxxxxx]
Sent: Friday, June 30, 2017 2:50 PM
To: gencer@xxxxxxxxxxxxx
Cc: gluster-user <gluster-users@xxxxxxxxxxx>
Subject: Re: Very slow performance on Sharded GlusterFS

 

Could you please provide the volume-info output?

-Krutika

 

On Fri, Jun 30, 2017 at 4:23 PM, <gencer@xxxxxxxxxxxxx> wrote:

Hi,

 

I have 2 nodes with 20 bricks in total (10+10).

 

First test:

 

2 Nodes with Distributed – Striped – Replicated (2 x 2)

10GbE Speed between nodes

 

“dd” performance: 400 MB/s and higher

Downloading a large file from the internet directly onto the Gluster volume: 250-300 MB/s

 

Now the same test without striping but with sharding. The results are the same whether I set the shard size to 4MB or 32MB. (Again 2x replica here.)

 

dd performance: 70 MB/s

Download directly onto the Gluster volume: 60 MB/s

 

Now, if we run this test twice at the same time (two dd runs or two downloads in parallel), it drops below 25 MB/s each, or slower.
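For reference, the dd runs were roughly of this shape (the mount point, file name, and sizes here are illustrative, not the exact invocation):

# dd if=/dev/zero of=/mnt/testvol/testfile bs=1M count=10240 oflag=direct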

 

I thought sharding would be at least equal, or maybe a little slower, but these results are terribly slow.

 

I tried tuning (cache, window-size, etc.). Nothing helped.
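For illustration, the tuning was along these lines (example options and values only, not the exact set I tried):

# gluster volume set testvol performance.cache-size 256MB
# gluster volume set testvol performance.write-behind-window-size 4MB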

 

GlusterFS 3.11 on Debian 9. The kernel is also tuned. Disks are XFS, 4TB each.

 

Is there any tweak/tuning out there to make it faster?

 

Or is this expected behavior? If it is, that is unacceptable; it is so slow that I cannot use it in production.

 

The reason I use shard instead of stripe is that I would like to eliminate the problem of files being bigger than the brick size.

 

Thanks,

Gencer.



 

 

Attachment: shard.log
Description: Binary data

Attachment: stripe.log
Description: Binary data

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users
