Yes, only write to the glusterfs mountpoint. Writing directly to the bricks is bad and shouldn't be done.

On Thu, Feb 14, 2013 at 11:58 AM, Michael Colonno <mcolonno at stanford.edu> wrote:

> Good place to start: do the bricks have to be clients as well? In other
> words, if I copy a file to a Gluster brick without going through a
> glusterfs or NFS mount, will that disrupt the parallel file system? I
> assumed files need to be routed through a glusterfs mount point for
> Gluster to be able to track them(?) What's recommended for bricks which
> also need I/O to the entire volume?
>
> Thanks,
> Mike C.
>
> On Feb 14, 2013, at 10:28 AM, harry mangalam <harry.mangalam at uci.edu>
> wrote:
>
> > While I don't understand your 'each brick system also being a client'
> > setup - you mean that each gluster brick is a native gluster client as
> > well? And that is where much of your gluster access is coming from?
> > That seems... suboptimal if that's the setup. Is there a reason for
> > that setup?
> >
> > We have a distributed-only glusterfs feeding a medium cluster over a
> > similar QDR IPoIB setup, with 4 servers with 2 bricks each. On a
> > fairly busy system (~80 MB/s background), I can get about 100-300 MB/s
> > writes to the gluster fs on a large 1.7 GB file. (With tiny writes,
> > the performance decreases dramatically.)
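Throughput figures like these are easiest to compare when measured the same way each time; a minimal sketch of a dd-based write test (the target directory below is a stand-in - point it at the actual glusterfs mountpoint in practice):

```shell
# Stand-in target directory; substitute your glusterfs mountpoint here.
TARGET=/tmp

# Write 64 MiB in 1 MiB blocks; conv=fsync makes dd flush to storage
# before reporting, so the MB/s figure is not just page-cache speed.
dd if=/dev/zero of="$TARGET/ddtest" bs=1M count=64 conv=fsync

# The tiny-write case can be approximated with small blocks, e.g.:
#   dd if=/dev/zero of="$TARGET/ddtest" bs=4k count=16384 conv=fsync

rm -f "$TARGET/ddtest"
```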
> >
> > Here is my config (if anyone spies something that I should change to
> > increase my perf, please feel free to point out my mistake):
> >
> > gluster:
> > Volume Name: gl
> > Type: Distribute
> > Volume ID: 21f480f7-fc5a-4fd8-a084-3964634a9332
> > Status: Started
> > Number of Bricks: 8
> > Transport-type: tcp,rdma
> > Bricks:
> > Brick1: bs2:/raid1
> > Brick2: bs2:/raid2
> > Brick3: bs3:/raid1
> > Brick4: bs3:/raid2
> > Brick5: bs4:/raid1
> > Brick6: bs4:/raid2
> > Brick7: bs1:/raid1
> > Brick8: bs1:/raid2
> > Options Reconfigured:
> > performance.write-behind-window-size: 1024MB
> > performance.flush-behind: on
> > performance.cache-size: 268435456
> > nfs.disable: on
> > performance.io-cache: on
> > performance.quick-read: on
> > performance.io-thread-count: 64
> > auth.allow: 10.2.*.*,10.1.*.*
> >
> > My RAID6s (via 3ware 9750s) are mounted with the following options:
> >
> > /dev/sdc /raid1 xfs rw,noatime,sunit=512,swidth=8192,allocsize=32m 0 0
> > /dev/sdd /raid2 xfs rw,noatime,sunit=512,swidth=7680,allocsize=32m 0 0
> >
> > (and should probably be using 'nobarrier,inode64' as well - testing
> > this now)
> >
> > There are some good refs on prepping an XFS fs for max perf here:
> > <http://www.mythtv.org/wiki/Optimizing_Performance#XFS-Specific_Tips>
> > The script at:
> > <http://www.mythtv.org/wiki/Optimizing_Performance#Further_Information>
> > can help to set up the sunit/swidth options.
> > <http://www.mysqlperformanceblog.com/2011/12/16/setting-up-xfs-the-simple-edition/>
> > Your IB interfaces should be using large MTUs (65536).
> >
> > hjm
> >
> > On Wednesday, February 13, 2013 10:35:12 PM Michael Colonno wrote:
> >> More data: I got the Infiniband network (QDR) working well and
> >> switched my gluster volume to the Infiniband fabric (IPoIB, not RDMA
> >> since it doesn't seem to be supported yet for 3.x). The filesystem
> >> was slightly faster but still fell well short of what I would expect.
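The sunit/swidth values in those mount lines follow directly from the RAID geometry; a minimal sketch of the arithmetic, assuming a 256 KiB RAID chunk and a 16-data-disk RAID6 (both inferred from the numbers above, since mount-option units are 512-byte sectors):

```shell
CHUNK_KB=256     # RAID chunk (stripe unit) size in KiB - an assumption
DATA_DISKS=16    # RAID6 data disks (total disks minus 2 parity)

SUNIT=$(( CHUNK_KB * 1024 / 512 ))   # 512-byte sectors per chunk -> 512
SWIDTH=$(( SUNIT * DATA_DISKS ))     # full stripe width -> 8192

echo "sunit=${SUNIT},swidth=${SWIDTH}"
```

By the same arithmetic, the second array's swidth=7680 would correspond to 15 data disks.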
> >> Via an informal test (timing the movement of a large file) I'm
> >> getting several MB/s - well short of even a standard Gb network
> >> copy. With the faster network, the CPU load on the brick systems
> >> increased dramatically: now I'm seeing 200%-250% usage by glusterfsd
> >> and glusterfs.
> >>
> >> This leads me to believe that gluster is really not enjoying my
> >> eight-brick, 2x replication volume with each brick system also being
> >> a client. I tried a rebalance, but saw no measurable effect. Any
> >> suggestions for improving the performance? Having each brick be a
> >> client of itself seemed the most logical choice to remove
> >> interdependencies, but now I'm doubting the setup.
> >>
> >> Thanks,
> >> ~Mike C.
> >>
> >> From: gluster-users-bounces at gluster.org
> >> [mailto:gluster-users-bounces at gluster.org] On Behalf Of Joe Julian
> >> Sent: Sunday, February 03, 2013 11:47 AM
> >> To: gluster-users at gluster.org
> >> Subject: Re: high CPU load on all bricks
> >>
> >> On 02/03/2013 11:22 AM, Michael Colonno wrote:
> >>
> >> Having taken a lot more data, it does seem the glusterfsd and
> >> glusterd processes (along with several ksoftirqd) spike up to near
> >> 100% on both client and brick servers during any file transport
> >> across the mount. Thankfully this is short-lived for the most part,
> >> but I'm wondering if this is expected behavior or what others have
> >> experienced(?) I'm a little surprised such a large CPU load would be
> >> required to move small files and/or use an application within a
> >> Gluster mount point.
> >>
> >> If you're getting ksoftirqd spikes, that sounds like a hardware
> >> issue to me. I never see huge spikes like that on my servers or
> >> clients.
> >>
> >> I wanted to test this against an NFS mount of the same Gluster
> >> volume.
> >> I managed to get rstatd installed and running, but my attempts to
> >> mount the volume via NFS are met with:
> >>
> >> mount.nfs: requested NFS version or transport protocol is not
> >> supported
> >>
> >> Relevant line in /etc/fstab:
> >>
> >> node1:/volume /volume nfs defaults,_netdev,vers=3,mountproto=tcp 0 0
> >>
> >> It looks like CentOS 6.x has NFS version 4 built into everything. So
> >> a few questions:
> >>
> >> - Has anyone else noted significant performance differences between
> >> a glusterfs mount and an NFS mount for volumes of 8+ bricks?
> >> - Is there a straightforward way to make the newer versions of
> >> CentOS play nice with NFS version 3 + Gluster?
> >> - Are there any general performance tuning guidelines I can follow
> >> to improve CPU performance? I found a few references to the cache
> >> settings but nothing solid.
> >>
> >> If the consensus is that NFS will not gain anything, then I won't
> >> waste the time setting it all up.
> >>
> >> NFS gains you the use of FSCache to cache directories and file
> >> stats, making directory listings faster, but it adds overhead,
> >> decreasing the overall throughput (from all the reports I've seen).
> >>
> >> I would suspect that you have the kernel NFS server running on your
> >> servers. Make sure it's disabled.
> >>
> >> Thanks,
> >> ~Mike C.
> >>
> >> From: gluster-users-bounces at gluster.org
> >> [mailto:gluster-users-bounces at gluster.org] On Behalf Of Michael Colonno
> >> Sent: Friday, February 01, 2013 4:46 PM
> >> To: gluster-users at gluster.org
> >> Subject: Re: high CPU load on all bricks
> >>
> >> Update: after a few hours the CPU usage seems to have dropped
> >> down to a small value.
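Gluster's built-in NFS server speaks only NFSv3 over TCP, which is why the vers=3 and mountproto=tcp options matter; a sketch of the client-side sanity checks (the fstab line is the one quoted above; the service commands are CentOS 6 style and shown as comments since they need root):

```shell
# Gluster's NFS server speaks NFSv3 over TCP only, so pin both options:
FSTAB_LINE="node1:/volume /volume nfs defaults,_netdev,vers=3,mountproto=tcp 0 0"

# On the bricks, the kernel NFS server must be off, or it will claim
# the NFS ports before glusterfs can (CentOS 6, run as root):
#   service nfs stop
#   chkconfig nfs off

# Verify the entry pins v3 over TCP before mounting:
echo "$FSTAB_LINE" | grep -q 'vers=3'         && echo "vers=3: ok"
echo "$FSTAB_LINE" | grep -q 'mountproto=tcp' && echo "mountproto=tcp: ok"
```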
> >> I did not change anything with respect to the configuration or
> >> unmount / stop anything, as I wanted to see if this would persist
> >> for a long period of time. Both the client and the self-mounted
> >> bricks are now showing CPU < 1% (as reported by top). Prior to the
> >> larger CPU loads I installed a bunch of software into the volume
> >> (~5 GB total). Is this kind of transient behavior - larger CPU loads
> >> after a lot of filesystem activity in a short time - typical? This
> >> is not a problem in my deployment; I just want to know what to
> >> expect in the future and to complete this thread for future users.
> >> If this is expected behavior we can wrap up this thread. If not,
> >> then I'll do more digging into the logs on the client and brick
> >> sides.
> >>
> >> Thanks,
> >> ~Mike C.
> >>
> >> From: Joe Julian [mailto:joe at julianfamily.org]
> >> Sent: Friday, February 01, 2013 2:08 PM
> >> To: Michael Colonno; gluster-users at gluster.org
> >> Subject: Re: high CPU load on all bricks
> >>
> >> Check the client log(s).
> >>
> >> Michael Colonno <mcolonno at stanford.edu> wrote:
> >>
> >> Forgot to mention: on a client system (not a brick) the
> >> glusterfs process is consuming ~68% CPU continuously. This is a much
> >> less powerful desktop system, so the CPU load can't be compared 1:1
> >> with the systems comprising the bricks, but it is still very high.
> >> So the issue seems to exist with both the glusterfsd and glusterfs
> >> processes.
> >>
> >> Thanks,
> >> ~Mike C.
> >>
> >> From: gluster-users-bounces at gluster.org
> >> [mailto:gluster-users-bounces at gluster.org] On Behalf Of Michael Colonno
> >> Sent: Friday, February 01, 2013 12:46 PM
> >> To: gluster-users at gluster.org
> >> Subject: high CPU load on all bricks
> >>
> >> Gluster gurus ~
> >>
> >> I've deployed an 8-brick (2x replicate) Gluster 3.3.1 volume
> >> on CentOS 6.3 with tcp transport. I was able to build, start, mount,
> >> and use the volume. On each system contributing a brick, however, my
> >> CPU usage (glusterfsd) is hovering around 20% (with virtually zero
> >> memory usage, thankfully). These are brand new, fairly beefy
> >> servers, so 20% CPU load is quite a bit. The deployment is pretty
> >> plain, with each brick mounting the volume to itself via a glusterfs
> >> mount. I assume this type of CPU usage is atypically high; is there
> >> anything I can do to investigate what's soaking up CPU and minimize
> >> it? Total usable volume size is only about 22 TB (about 45 TB total
> >> with 2x replicate).
> >>
> >> Thanks,
> >> ~Mike C.
> >>
> >> _______________________________________________
> >> Gluster-users mailing list
> >> Gluster-users at gluster.org
> >> http://supercolony.gluster.org/mailman/listinfo/gluster-users
> >
> > ---
> > Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
> > [m/c 2225] / 92697 Google Voice Multiplexer: (949) 478-4487
> > 415 South Circle View Dr, Irvine, CA, 92697 [shipping]
> > MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)
> > ---
> > "Something must be done. [X] is something. Therefore, we must do it."
> > Bruce Schneier, on American response to just about anything.
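For reference, the capacity numbers in the original question follow from the replica-2 arithmetic; a minimal sketch (the create command in the comment uses hypothetical hostnames - in a replica-2 create, consecutive bricks form the mirror pairs):

```shell
BRICKS=8     # bricks in the volume
REPLICA=2    # replication factor
RAW_TB=45    # total raw capacity in TB, from the question

PAIRS=$(( BRICKS / REPLICA ))       # distribute subvolumes -> 4
USABLE_TB=$(( RAW_TB / REPLICA ))   # usable capacity -> 22 TB (rounded down)

echo "replica pairs=${PAIRS}, usable=${USABLE_TB} TB"

# Layout sketch with hypothetical hosts (consecutive bricks mirrored):
#   gluster volume create vol replica 2 \
#     n1:/brick n2:/brick n3:/brick n4:/brick \
#     n5:/brick n6:/brick n7:/brick n8:/brick
```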