Re: [ovirt-users] Very poor GlusterFS performance

Dear Krutika,

Sorry for the naive question, but could you tell me what the recommendation to set the client and server event-threads parameters for a volume to 4 is based on?

Is this value based, for example, on the number of cores a GlusterFS server has?

I am asking because I noticed my GlusterFS volumes are set to 2, and I would like to set these parameters to something meaningful for performance tuning. My setup is a two-node replica with GlusterFS 3.8.11.

Best regards,
M.



-------- Original Message --------
Subject: Re: [Gluster-users] [ovirt-users] Very poor GlusterFS performance
Local Time: June 20, 2017 12:23 PM
UTC Time: June 20, 2017 10:23 AM
From: kdhananj@xxxxxxxxxx
To: Lindsay Mathieson <lindsay.mathieson@xxxxxxxxx>
gluster-users <gluster-users@xxxxxxxxxxx>, oVirt users <users@xxxxxxxxx>

Couple of things:
1. As Darrell suggested, you should enable stat-prefetch and increase the client and server event-threads to 4:
# gluster volume set <VOL> performance.stat-prefetch on
# gluster volume set <VOL> client.event-threads 4
# gluster volume set <VOL> server.event-threads 4
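
To confirm the new values took effect, you can query them back (assuming your GlusterFS version includes the volume get subcommand):
# gluster volume get <VOL> performance.stat-prefetch
# gluster volume get <VOL> client.event-threads
# gluster volume get <VOL> server.event-threads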

2. glusterfs-3.10.1 and above also include a shard performance bug fix - https://review.gluster.org/#/c/16966/
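
If you are not sure which version a given node is running, it can be checked directly on that node, e.g.:
# glusterfs --version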

With these two changes, we saw a significant improvement in performance in our internal testing.

Do you mind trying the two changes above?
-Krutika

On Tue, Jun 20, 2017 at 1:00 PM, Lindsay Mathieson <lindsay.mathieson@xxxxxxxxx> wrote:
Have you tried with:

performance.strict-o-direct: off
performance.strict-write-ordering: off

They can be changed dynamically.
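
For example, assuming the same <VOL> placeholder used earlier in the thread, they can be applied to a live volume with:
# gluster volume set <VOL> performance.strict-o-direct off
# gluster volume set <VOL> performance.strict-write-ordering off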


On 20 June 2017 at 17:21, Sahina Bose <sabose@xxxxxxxxxx> wrote:
[Adding gluster-users]

On Mon, Jun 19, 2017 at 8:16 PM, Chris Boot <bootc@xxxxxxxxx> wrote:
Hi folks,

I have 3x servers in a "hyper-converged" oVirt 4.1.2 + GlusterFS 3.10
configuration. My VMs run off a replica 3 arbiter 1 volume comprised of
6 bricks, which themselves live on two SSDs in each of the servers (one
brick per SSD). The bricks are XFS on LVM thin volumes straight onto the
SSDs. Connectivity is 10G Ethernet.

Performance within the VMs is pretty terrible. I experience very low
throughput and random IO is really bad: it feels like a latency issue.
On my oVirt nodes the SSDs are not generally very busy. The 10G network
seems to run without errors (iperf3 gives bandwidth measurements of >=
9.20 Gbits/sec between the three servers).
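
A minimal iperf3 check between two of the hosts looks something like the following (the flags shown are purely illustrative):
# iperf3 -s                  (on the receiving host)
# iperf3 -c <host> -t 30     (on the sending host, pointed at the receiver's address)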

To put this into perspective: I was getting better behaviour from NFS4 over a gigabit connection than I am getting with GlusterFS over 10G, which doesn't feel right at all.

My volume configuration looks like this:

Volume Name: vmssd
Type: Distributed-Replicate
Volume ID: d5a5ddd1-a140-4e0d-b514-701cfe464853
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (2 + 1) = 6
Transport-type: tcp
Bricks:
Brick1: ovirt3:/gluster/ssd0_vmssd/brick
Brick2: ovirt1:/gluster/ssd0_vmssd/brick
Brick3: ovirt2:/gluster/ssd0_vmssd/brick (arbiter)
Brick4: ovirt3:/gluster/ssd1_vmssd/brick
Brick5: ovirt1:/gluster/ssd1_vmssd/brick
Brick6: ovirt2:/gluster/ssd1_vmssd/brick (arbiter)
Options Reconfigured:
nfs.disable: on
transport.address-family: inet6
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
performance.low-prio-threads: 32
network.remote-dio: off
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
user.cifs: off
storage.owner-uid: 36
storage.owner-gid: 36
features.shard-block-size: 128MB
performance.strict-o-direct: on
network.ping-timeout: 30
cluster.granular-entry-heal: enable

I would really appreciate some guidance on this to try to improve things
because at this rate I will need to reconsider using GlusterFS altogether.

Could you provide the gluster volume profile output while you're running your I/O tests?
# gluster volume profile <volname> start
to start profiling
# gluster volume profile <volname> info
for the profile output.
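
For completeness, a full profiling cycle would look roughly like this (the redirect target is just an example filename):
# gluster volume profile <volname> start
  ... run the I/O tests inside the VM while profiling is active ...
# gluster volume profile <volname> info > /tmp/vmssd-profile.txt
# gluster volume profile <volname> stop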
 

Cheers,
Chris

--
Chris Boot
bootc@xxxxxxxxx
_______________________________________________
Users mailing list
Users@xxxxxxxxx
http://lists.ovirt.org/mailman/listinfo/users


--
Lindsay

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users
