Re: performance

Artem Russakovskii <archon810@xxxxxxxxx> · Mon, 3 Aug 2020 21:42:45 -0700

I tried putting all web files (specifically WordPress PHP and static files as well as various cache files) on gluster before, and the results were miserable on a busy site - our usual load of ~8-10 quickly turned into 100+ and killed everything.
I had to go back to running just the user uploads (which are static files in the WordPress uploads/ dir) on gluster and using rsync (via lsyncd) for the frequently executed PHP and cache files.

I'd love to figure this out as well and tune gluster for heavy reads and moderate writes, but I haven't cracked that recipe yet. 

On Mon, Aug 3, 2020, 8:08 PM Computerisms Corporation <bob@xxxxxxxxxxxxxxx> wrote:
Hi Gurus,

I have been trying to wrap my head around performance improvements on my gluster setup, and I don't seem to be making any progress.  I mean forward progress.  Making it worse takes practically no effort at all.

My gluster is distributed-replicated across 6 bricks and 2 servers, with an arbiter on each server.  I designed it like this so I have an expansion path to more servers in the future (like the staggered arbiter diagram in the Red Hat documentation).  gluster v info output is below.  I have compiled gluster 7.6 from sources on both servers.

Servers are 6-core/3.4 GHz with 32 GB RAM, no swap, and SSD and gigabit network connections.  They are running Debian, and are being used as redundant web servers.  There are some 3 million files on the Gluster storage, averaging 130 KB/file.  Currently only one of the two servers is serving web services.  There are well over 100 sites, and Apache server-status claims around 5 hits per second, depending on time of day, so there is a fair bit of logging going on.  The gluster is only holding website data and config files that will be common between the two servers; there are no databases or anything like that on the Gluster.
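
For a sense of scale, the figures above can be turned into rough totals. This is only a back-of-envelope sketch: it uses decimal units, and it assumes each hit reads exactly one average-sized file, which the post does not state and which real page loads will not match.

```python
# Back-of-envelope sizing from the figures quoted in the post.
FILES = 3_000_000      # "some 3 million files"
AVG_FILE_KB = 130      # "averaging 130 KB/file"
HITS_PER_SEC = 5       # Apache server-status figure

# Total data set size, KB -> GB (decimal units).
total_gb = FILES * AVG_FILE_KB / 1_000_000

# ASSUMPTION (not from the post): one hit == one average-sized file read.
payload_mbit_s = HITS_PER_SEC * AVG_FILE_KB * 8 / 1000   # KB/s -> Mbit/s

print(f"data set: ~{total_gb:.0f} GB")
print(f"file payload at {HITS_PER_SEC} hits/s: ~{payload_mbit_s:.1f} Mbit/s")
```

Under those assumptions the file payload itself is small, so the numbers are better read as a sanity check than as a model of the actual traffic.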

When the serving server is under load, the load average is consistently 12-20.  glusterfs is always at the top with 150%-250% CPU, and each of 3 bricks at roughly 50-70%, so consistently pegging 4 of the 6 cores.  Apache processes will easily eat up all the rest of the CPUs after that.  And web page response time is underwhelming at best.

Interestingly, mostly because it is not something I have ever experienced before, software interrupts sit between 1 and 5 on each core, but the last core is usually sitting around 20.  I have never encountered a high load average where the si number was ever significant.  I have googled the crap out of that (as well as gluster performance in general); there are nearly limitless posts about what it is, but I have yet to see one thing to explain what to do about it.  Sadly I can't really shut down the gluster process to confirm if that is the cause, but it's a pretty good bet, I think.
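
One way to narrow down which kind of softirq is landing on that last core is to compare two snapshots of /proc/softirqs, which on Linux lists cumulative per-CPU counts for each softirq type (NET_RX, NET_TX, TIMER, and so on). The following is a minimal sketch; the helper names `parse_softirqs`, `softirq_delta`, and `sample` are made up for illustration, and the 5-second interval is an arbitrary choice.

```python
import time

def parse_softirqs(text):
    """Parse /proc/softirqs text into {softirq_type: [count per CPU]}."""
    lines = text.strip().splitlines()
    n_cpus = len(lines[0].split())            # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        name, _, rest = line.partition(":")
        table[name.strip()] = [int(x) for x in rest.split()][:n_cpus]
    return table

def softirq_delta(before, after):
    """Per-type, per-CPU increase between two parsed snapshots."""
    return {name: [b - a for a, b in zip(before[name], after[name])]
            for name in after if name in before}

def sample(path="/proc/softirqs", interval=5.0):
    """Take two snapshots `interval` seconds apart and return the delta."""
    with open(path) as fh:
        first = parse_softirqs(fh.read())
    time.sleep(interval)
    with open(path) as fh:
        second = parse_softirqs(fh.read())
    return softirq_delta(first, second)
```

Calling `sample()` while the server is under load and looking at which row grows fastest in the last column would show which softirq type accounts for the high si figure on that core. It does not by itself show that gluster is the cause.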

When the system is not under load, glusterfs will be running at around 100% with each of the 3 bricks around 35%, so using 2 cores when doing not much of anything.

nload shows the network cards rarely climb above 300 Mbps unless I am doing a direct file transfer between the servers, in which case it gets right up to the 1 Gbps limit.  RAM is never above 15 GB unless I am causing it to happen.  atop shows a disk busy percentage that is often above 50% and sometimes will hit 100%, but it is nowhere near as consistently excessive as the CPU usage.  The CPU definitely seems to be the bottleneck.

When I found out about the groups directory, I figured one of those must be useful to me, but as best as I can tell they are not.  But I am really hoping that someone has configured a system like mine and has a good group file they might share for this situation, or a peek at their volume info output?

Or maybe this is really just about as good as I should expect?  Maybe the fix is that I need more/faster cores?  I hope not, as that isn't really an option.

Anyway, here is my volume info as promised.

root@mooglian:/Computerisms/sites/computerisms.ca/log# gluster v info
Volume Name: webisms
Type: Distributed-Replicate
Volume ID: 261901e7-60b4-4760-897d-0163beed356e
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (2 + 1) = 6
Transport-type: tcp
Bricks:
Brick1: mooglian:/var/GlusterBrick/replset-0/webisms-replset-0
Brick2: moogle:/var/GlusterBrick/replset-0/webisms-replset-0
Brick3: moogle:/var/GlusterBrick/replset-0-arb/webisms-replset-0-arb (arbiter)
Brick4: moogle:/var/GlusterBrick/replset-1/webisms-replset-1
Brick5: mooglian:/var/GlusterBrick/replset-1/webisms-replset-1
Brick6: mooglian:/var/GlusterBrick/replset-1-arb/webisms-replset-1-arb (arbiter)
Options Reconfigured:
auth.allow: xxxx
performance.client-io-threads: off
nfs.disable: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
performance.stat-prefetch: on
network.inode-lru-limit: 200000
performance.write-behind-window-size: 4MB
performance.readdir-ahead: on
performance.io-thread-count: 64
performance.cache-size: 8GB
server.event-threads: 8
client.event-threads: 8
performance.nl-cache-timeout: 600
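
To compare an option list like this against another volume's output or a group file, it can help to turn the "Options Reconfigured" section into key/value pairs. A minimal sketch, assuming the text has the same `key: value` layout as the output above; `parse_reconfigured_options` is a hypothetical helper, not part of gluster.

```python
def parse_reconfigured_options(info_text):
    """Extract the 'Options Reconfigured' section of `gluster v info`
    output as a dict. Assumes one 'key: value' pair per line."""
    options = {}
    in_options = False
    for raw in info_text.splitlines():
        line = raw.strip()
        if not line:
            continue                      # tolerate blank lines in pasted output
        if line == "Options Reconfigured:":
            in_options = True
            continue
        if in_options:
            key, sep, value = line.partition(":")
            if not sep:
                break                     # first non key/value line ends the section
            options[key.strip()] = value.strip()
    return options


def diff_options(mine, theirs):
    """Return {key: (my_value, their_value)} for every option that differs."""
    keys = set(mine) | set(theirs)
    return {k: (mine.get(k), theirs.get(k))
            for k in sorted(keys) if mine.get(k) != theirs.get(k)}
```

Feeding two saved `gluster v info` outputs through `parse_reconfigured_options` and then `diff_options` gives a short list of the settings that actually differ between two volumes.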

-- 
Bob Miller
Cell: 867-334-7117
Office: 867-633-3760
Office: 867-322-0362
www.computerisms.ca

________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
