Re: Sudden, dramatic performance drops with Glusterfs


 



Hi Strahil,

Thanks for the reply. See below.

Also, as an aside, I tested by installing a single CentOS 7 machine with the JBOD, installed gluster and ZFSonLinux as recommended at:
 
https://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Administrator%20Guide/Gluster%20On%20ZFS/

I then created a gluster volume consisting of one brick backed by a local ZFS raidz2, copied about 4TB of data to it, and I'm seeing the same issue.

The biggest part of the issue is with things like "ls" and "find". If I read a single file, or write a single file, it works great. But if I run rsync (which does a lot of listing, writing, renaming, etc.) it is slow as garbage. E.g. a find command that finishes in 30 seconds when run directly on the underlying ZFS directory takes about an hour through gluster.
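For what it's worth, that ls/find pattern usually points at per-file metadata round trips through the FUSE client, and Gluster's client-side metadata caching is the usual first knob. A hedged sketch (option names as documented for Gluster's small-file tuning; please verify each against `gluster volume set help` on your 5.10 install before applying):

```shell
# Sketch: enable client-side metadata caching to cut per-stat round trips.
# Check availability first with: gluster volume set help
gluster volume set homes features.cache-invalidation on
gluster volume set homes features.cache-invalidation-timeout 600
gluster volume set homes performance.stat-prefetch on
gluster volume set homes performance.cache-invalidation on
gluster volume set homes performance.md-cache-timeout 600
gluster volume set homes network.inode-lru-limit 200000
# parallel-readdir can speed up listings of large directories
gluster volume set homes performance.parallel-readdir on
```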

 
Strahil wrote on 08-Nov-19 05:39:

Hi Michael,

What is your 'gluster volume info <VOL>' showing?

I've been playing with the install (since it's a fresh machine) so I can't give you verbatim output. However, it was showing two bricks, one on each server, started, and apparently healthy.

How much is your zpool full? Usually when it gets too full, ZFS performance drops seriously.

The zpool is only at about 30% usage. It's a new server setup.
We have about 10TB of data on a 30TB volume (made up of two 30TB ZFS raidz2 bricks, each residing on a different server, connected via a dedicated 10Gb Ethernet link).

Try to rsync a file directly to one of the bricks, then to the other brick (don't forget to remove the files after that, as gluster will not know about them).
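That direct-to-brick test might look like the following (paths taken from the volume-create line later in this thread; adjust to your layout, and remove the test files afterward since gluster will not know about them):

```shell
# Copy straight to the brick path on each server, bypassing the FUSE mount:
rsync -av --progress bigfile.dat server1:/zpool-homes/homes/throwaway.dat
rsync -av --progress bigfile.dat server2:/zpool-homes/homes/throwaway.dat

# Then through the gluster mount for comparison:
rsync -av --progress bigfile.dat /glusterfs/homes/throwaway-fuse.dat

# Clean up the brick-side copies (gluster doesn't track them):
ssh server1 rm /zpool-homes/homes/throwaway.dat
ssh server2 rm /zpool-homes/homes/throwaway.dat
```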

If I rsync manually, or scp a file directly to the zpool bricks (outside of gluster), I get 30-100MBytes/s (depending on what I'm copying).
If I rsync THROUGH gluster (via the glusterfs mounts) I get 1 - 5MB/s

What are your mounting options ? Usually 'noatime,nodiratime' are a good start.

I'll try these. Currently using ...
(mounting TO serverA) serverA:/homes /glusterfs/homes    glusterfs defaults,_netdev 0 0
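If it helps, folding Strahil's suggestion into that fstab line might look like the sketch below. Note that on the brick side, ZFS handles atime as a dataset property rather than a mount option (the `zpool-homes` dataset name is taken from the volume-create line in this thread; `xattr=sa` is the setting recommended in the Gluster-on-ZFS guide linked above):

```shell
# Gluster FUSE mount with atime updates disabled (fstab line):
# serverA:/homes  /glusterfs/homes  glusterfs  defaults,_netdev,noatime  0 0

# Brick side: atime is a ZFS dataset property, not an fstab option:
zfs set atime=off zpool-homes
zfs set xattr=sa zpool-homes   # recommended for gluster-on-ZFS bricks
```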

Are you using ZFS provided by Ubuntu packages or directly from the ZoL project?

ZFS provided by the Ubuntu 18.04 repo:
  libzfs2linux/bionic-updates,now 0.7.5-1ubuntu16.6 amd64 [installed,automatic]
  zfs-dkms/bionic-updates,bionic-updates,now 0.7.5-1ubuntu16.6 all [installed]
  zfs-zed/bionic-updates,now 0.7.5-1ubuntu16.6 amd64 [installed,automatic]
  zfsutils-linux/bionic-updates,now 0.7.5-1ubuntu16.6 amd64 [installed]

Gluster provided by "add-apt-repository ppa:gluster/glusterfs-5":
  glusterfs 5.10
  Repository revision: git://git.gluster.org/glusterfs.git

 

Best Regards,
Strahil Nikolov

On Nov 6, 2019 12:50, Michael Rightmire <Michael.Rightmire@xxxxxxx> wrote:
Hello list!

I'm new to Glusterfs in general. We have chosen to use it as our distributed file system on a new set of HA file servers.

The setup is:
2 SUPERMICRO SuperStorage Server 6049PE1CR36L with 24 x 4TB spinning disks and NVMe for cache and SLOG
HBA not RAID card
Ubuntu 18.04 server (on both systems)
ZFS filestorage
Glusterfs 5.10

Step one was to install Ubuntu, ZFS, and gluster. This all went without issue.
We have 3 identical ZFS raidz2 pools on both servers.
We have three mirrored (replica 2) glusterfs volumes, one attached to each raidz on each server. I.e.

And mounted the gluster volumes as (for example) "/glusterfs/homes -> /zpool/homes". I.e.
gluster volume create homes replica 2 transport tcp server1:/zpool-homes/homes server2:/zpool-homes/homes force
(on server1) server1:/homes     44729413504 16032705152 28696708352  36% /glusterfs/homes

The problem is, the performance has deteriorated terribly.

We needed to copy all of our data from the old server to the new glusterfs volumes (appx. 60TB).
We decided to do this with multiple rsync commands (around 400 simultaneous rsyncs).
The copy went well for the first 4 days, with an average across all rsyncs of 150-200 MBytes per second.
Then, suddenly, on the fourth day, it dropped to about 50 MBytes/s.
Then, by the end of the day, down to ~5MBytes/s (five).
I've stopped the rsyncs, and I can still copy an individual file across to the glusterfs shared directory at 100MB/s.
But actions such as "ls -la" or "find" take forever!

Are there obvious flaws in my setup to correct?
How can I better troubleshoot this?

Thanks!
--

Mike

 


--

Mike

 

Karlsruher Institut für Technologie (KIT)

Institut für Anthropomatik und Robotik (IAR)

Hochperformante Humanoide Technologien (H2T)

 

Michael Rightmire 

B.Sci, HPUXCA, MCSE, MCP, VDB, ISCB

Systems IT/Development

 

Adenauerring 2, Building 50.20, Room 022

76131 Karlsruhe

 

Phone: +49 721 608-45032

Fax:     +49 721 608-44077

 

E-Mail:    Michael.Rightmire@xxxxxxx

http://www.humanoids.kit.edu/

http://h2t.anthropomatik.kit.edu

 

KIT – The Research University in the Helmholtz Association

KIT has been certified as a family-friendly university since 2010

 

________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/118564314

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/118564314

Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users

