Re: High I/O And Processor Utilization

From: "Kyle Harris" <kyle.harris98@xxxxxxxxx>
To: "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>
Cc: "Lindsay Mathieson" <lindsay.mathieson@xxxxxxxxx>, "Krutika Dhananjay" <kdhananj@xxxxxxxxxx>, gluster-users@xxxxxxxxxxx, "Paul Cuzner" <pcuzner@xxxxxxxxxx>
Sent: Friday, January 15, 2016 4:46:43 AM
Subject: Re: [Gluster-users] High I/O And Processor Utilization

I thought I might take a minute to update everyone on this situation.  I rebuilt the GlusterFS volume using a shard size of 256MB and then imported all VMs back onto the cluster.  I rebuilt it from scratch rather than just doing an export/import on the data so I could start everything fresh.  I now wish I had used 512MB, but unfortunately I didn't see that suggestion in time.  Anyway, the good news is that the system load has greatly decreased, and the systems are all now in a usable state.
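(For readers following along: the shard size is controlled by the features.shard-block-size volume option. A minimal sketch of changing it, with <VOLNAME> as a placeholder; note that a new value generally applies only to files created after the change, which is why a fresh rebuild or re-import is needed to re-shard existing images:)

# Set the shard size to the suggested 512MB (placeholder volume name)
gluster volume set <VOLNAME> features.shard-block-size 512MB
# Confirm the sharding options currently in effect
gluster volume info <VOLNAME> | grep shard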

The bad news is that I am still seeing a bunch of heals, and I am not sure why.  Because of that, I am also still seeing the drives slow down from over 110 MB/sec without Gluster running on them to ~25 MB/sec with bricks running on them.  So it seems to me there is still an issue here.
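(A quick way to see what is still pending heal, assuming the standard gluster CLI and a placeholder volume name:)

# List entries the self-heal daemon still has queued, per brick
gluster volume heal <VOLNAME> info
# Show only entries in split-brain, if any
gluster volume heal <VOLNAME> info split-brain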


Kyle,
Could you share:
1) the output of `gluster volume info <VOLNAME>`, and
2) the client/mount logs along with the glustershd.log files?
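(A rough sketch of collecting that information, assuming a default installation where logs live under /var/log/glusterfs and the client log is named after the mount point:)

# Volume configuration
gluster volume info <VOLNAME>
# Self-heal daemon log, on each server
less /var/log/glusterfs/glustershd.log
# Client/mount log, on the client; e.g. a mount at /mnt/gluster typically logs to
# /var/log/glusterfs/mnt-gluster.log
less /var/log/glusterfs/<mount-path>.log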

Also, as fate would have it, I am having issues due to a bug in the Gluster NFS implementation in this version (3.7) that requires setting nfs.acl to off, so I am also hoping for a fix for that soon.  I think this is the bug, but I am not sure:  https://bugzilla.redhat.com/show_bug.cgi?id=1238318
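(For reference, the workaround described amounts to disabling ACL support on the built-in Gluster NFS server; a minimal sketch with a placeholder volume name:)

gluster volume set <VOLNAME> nfs.acl off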


CC'ing Niels and Soumya regarding the NFS-related issue.

-Krutika


So to sum up, things are working, but the performance leaves much to be desired, due mostly, I suspect, to all the heals taking place.


- Kyle


On Mon, Jan 11, 2016 at 9:32 PM, Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx> wrote:


On 01/12/2016 08:52 AM, Lindsay Mathieson wrote:
On 11/01/16 15:37, Krutika Dhananjay wrote:
Kyle,

Based on the testing we have done on our end, we've found that 512MB is a good number that is neither too big nor too small,
and it provides good performance both on the I/O side and with respect to self-heal.


Hi Krutika, I experimented a lot with different chunk sizes and didn't find all that much difference between 4MB and 1GB.

But benchmarks are tricky things - I used CrystalDiskMark inside a VM, which is probably not the best assessment. And two of the bricks on my replica 3 are very slow - just test drives, not production. So I guess that would affect things :)

These are my current settings - what do you use?

Volume Name: datastore1
Type: Replicate
Volume ID: 1261175d-64e1-48b1-9158-c32802cc09f0
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: vnb.proxmox.softlog:/vmdata/datastore1
Brick2: vng.proxmox.softlog:/vmdata/datastore1
Brick3: vna.proxmox.softlog:/vmdata/datastore1
Options Reconfigured:
network.remote-dio: enable
cluster.eager-lock: enable
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
performance.stat-prefetch: off
performance.strict-write-ordering: on
performance.write-behind: off
nfs.enable-ino32: off
nfs.addr-namelookup: off
nfs.disable: on
performance.cache-refresh-timeout: 4
performance.io-thread-count: 32
cluster.server-quorum-type: server
cluster.quorum-type: auto
client.event-threads: 4
server.event-threads: 4
cluster.self-heal-window-size: 256
features.shard-block-size: 512MB
features.shard: on
performance.readdir-ahead: off
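(For what it's worth, options like the above are applied one at a time with `gluster volume set`; an illustrative sketch using values from this listing:)

gluster volume set datastore1 features.shard on
gluster volume set datastore1 features.shard-block-size 512MB
gluster volume set datastore1 cluster.self-heal-window-size 256
# Re-check what is currently configured
gluster volume info datastore1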


Most of these tests were done by Paul Cuzner (CCed).

Pranith



_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
