Re: Slow write times to gluster disk

Soumya Koduri <skoduri@xxxxxxxxxx> · Mon, 17 Apr 2017 12:48:41 +0530

On 04/14/2017 10:27 AM, Ravishankar N wrote:
I'm not sure if the version you are running (glusterfs 3.7.11) works
with NFS-Ganesha, as the link seems to suggest version >=3.8 as a
prerequisite. Adding Soumya for help. If it is not supported, then you
might have to go the plain glusterNFS way.

Even gluster 3.7.x should work with NFS-Ganesha, but the configuration
steps changed in 3.8, hence the prerequisite in the doc. IIUC, from
your mail below, you would like to try NFS (preferably gNFS rather than
NFS-Ganesha), which may perform better than a fuse mount. In that case,
the gNFS server comes up by default (till release-3.7.x) and there are
additional steps needed to export a volume via gNFS. Let me know if you
have any issues accessing volumes via gNFS.

Regards,
Soumya

Regards,
Ravi

On 04/14/2017 03:48 AM, Pat Haley wrote:

Hi Ravi (and list),

We are planning on testing the NFS route to see what kind of speed-up
we get.  A little research led us to the following:

https://gluster.readthedocs.io/en/latest/Administrator%20Guide/NFS-Ganesha%20GlusterFS%20Integration/

Is this the correct path to take to mount 2 xfs volumes as a single
gluster file system volume?  If not, what would be a better path?

Pat

On 04/11/2017 12:21 AM, Ravishankar N wrote:
On 04/11/2017 12:42 AM, Pat Haley wrote:

Hi Ravi,

Thanks for the reply.  And yes, we are using the gluster native
(fuse) mount.  Since this is not my area of expertise I have a few
questions (mostly clarifications)

Is a factor of 20 slow-down typical when comparing a fuse-mounted
filesystem with an NFS-mounted filesystem, or should we also be
looking for additional issues?  (Note the first dd test described
below was run on the server that hosts the file-systems so no
network communication was involved).

Though both the gluster bricks and the mounts are on the same
physical machine in your setup, the I/O still passes through
different layers of the kernel/user-space fuse stack, although I don't
know if a 20x slow-down on gluster vs an NFS share is normal. Why don't
you try doing a gluster NFS mount on the machine, run the dd test,
and compare it with the gluster fuse mount results?
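
A minimal sketch of that comparison, under stated assumptions: VOLNAME and
/mnt/gnfs-test are placeholders (the thread never names the volume), and the
mount uses NFSv3 because that is the protocol version the built-in gluster
NFS server speaks. The helper prints each command by default and only
executes them (as root) when RUN=1 is set.

```shell
#!/bin/sh
# Placeholders, not values taken from the thread.
VOLNAME="${VOLNAME:-VOLNAME}"
GNFS_MNT="${GNFS_MNT:-/mnt/gnfs-test}"

# Print each command by default; set RUN=1 to actually execute it.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

gnfs_dd_test() {
    run mkdir -p "$GNFS_MNT"
    # Pin NFSv3 for the gluster NFS (gnfs) server.
    run mount -t nfs -o vers=3 "localhost:/$VOLNAME" "$GNFS_MNT"
    # Same write test as on the fuse mount, for a like-for-like comparison.
    run dd if=/dev/zero of="$GNFS_MNT/zero1" bs=1M count=1000
    run umount "$GNFS_MNT"
}
```

Calling gnfs_dd_test with no environment set only lists the four commands;
RUN=1 VOLNAME=<your volume> runs them for real.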

You also mention tweaking "write-behind xlator settings".  Would
you expect better speed improvements from switching the mounting
from fuse to gnfs or from tweaking the settings?  Also are these
mutually exclusive or would there be additional benefits from both
switching to gnfs and tweaking?

You should test these out and find the answers yourself. :-)

My next question is to make sure I'm clear on the comment "if the
gluster node containing the gnfs server goes down, all mounts done
using that node will fail".  If you have 2 servers, each with 1 brick in
the over-all gluster FS, and one server fails, then for gnfs nothing
on either server is visible to other nodes while under fuse only the
files on the dead server are not visible.  Is this what you meant?

Yes, for gnfs mounts, all I/O from the various mounts goes to the gnfs
server process (on the machine whose IP was used at the time of
mounting), which then sends the I/O to the brick processes. For fuse,
the gluster fuse mount itself talks directly to the bricks.

Finally, you mention "even for gnfs mounts, you can achieve
fail-over by using CTDB".  Do you know if CTDB would have any
performance impact (i.e. in a worst case scenario could adding CTDB
to gnfs erase the speed benefits of going to gnfs in the first place)?

I don't think it would. You can even achieve load balancing via CTDB
to use different gnfs servers for different clients. But I don't know
if this is needed/helpful in your current setup where everything
(bricks and clients) seems to be on just one server.

-Ravi
Thanks

Pat

On 04/08/2017 12:58 AM, Ravishankar N wrote:
Hi Pat,

I'm assuming you are using the gluster native (fuse) mount. If it
helps, you could try mounting it via gluster NFS (gnfs) and then
see if there is an improvement in speed. Fuse mounts are slower
than gnfs mounts, but you get the benefit of avoiding a single point
of failure (unlike fuse mounts, if the gluster node containing the
gnfs server goes down, all mounts done using that node will fail).
For fuse mounts, you could try tweaking the write-behind xlator
settings to see if it helps. See the performance.write-behind and
performance.write-behind-window-size options in `gluster volume set
help`. Of course, even for gnfs mounts, you can achieve fail-over
by using CTDB.
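
The write-behind tuning could be tried roughly as follows. This is only a
sketch: VOLNAME is a placeholder and the 4MB window is an arbitrary
illustrative value, not a recommendation. Commands are printed by default and
executed only when RUN=1 is set.

```shell
#!/bin/sh
VOLNAME="${VOLNAME:-VOLNAME}"   # placeholder volume name
WINDOW="${WINDOW:-4MB}"         # illustrative value only

# Print each command by default; set RUN=1 to actually execute it.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

tune_write_behind() {
    # Review the documented options and their defaults first.
    run gluster volume set help
    run gluster volume set "$VOLNAME" performance.write-behind on
    run gluster volume set "$VOLNAME" performance.write-behind-window-size "$WINDOW"
    # Changed options are listed in the volume info output.
    run gluster volume info "$VOLNAME"
}
```

Re-running the same dd test after each change shows whether a given setting
actually helps on this hardware.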

Thanks,
Ravi

On 04/08/2017 12:07 AM, Pat Haley wrote:

Hi,

We noticed a dramatic slowness when writing to a gluster disk when
compared to writing to an NFS disk. Specifically when using dd
(data duplicator) to write a 4.3 GB file of zeros:

  * on NFS disk (/home): 9.5 Gb/s
  * on gluster disk (/gdata): 508 Mb/s

The gluster disk is 2 bricks joined together, no replication or
anything else. The hardware is (literally) the same:

  * one server with 70 hard disks and a hardware RAID card.
  * 4 disks in a RAID-6 group (the NFS disk)
  * 32 disks in a RAID-6 group (the max allowed by the card,
    /mnt/brick1)
  * 32 disks in another RAID-6 group (/mnt/brick2)
  * 2 hot spares

Some additional information and more test results (after changing
the log level):

glusterfs 3.7.11 built on Apr 27 2016 14:09:22
CentOS release 6.8 (Final)
RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108
[Invader] (rev 02)

*Create the file to /gdata (gluster)*
[root@mseas-data2 gdata]# dd if=/dev/zero of=/gdata/zero1 bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 1.91876 s, *546 MB/s*

*Create the file to /home (ext4)*
[root@mseas-data2 gdata]# dd if=/dev/zero of=/home/zero1 bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 0.686021 s, *1.5 GB/s* - 3 times as fast

*Copy from /gdata to /gdata (gluster to gluster)*
[root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 101.052 s, *10.4 MB/s* - realllyyy slooowww

*Copy from /gdata to /gdata, 2nd time (gluster to gluster)*
[root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 92.4904 s, *11.3 MB/s* - realllyyy slooowww again

*Copy from /home to /home (ext4 to ext4)*
[root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero2
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 3.53263 s, *297 MB/s* - 30 times as fast

*Copy from /home to /home (ext4 to ext4)*
[root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero3
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 4.1737 s, *251 MB/s* - 30 times as fast
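
As a cross-check on the figures above, the record counts and elapsed times
can be turned into block sizes and speed-up ratios. The helper names below
are illustrative. The copy tests report 2048000 records for 1048576000 bytes,
i.e. 512 bytes per record, which is dd's default block size when bs= is not
given. That may account for part of the gap, so repeating the copies with
bs=1M would be one way to separate block-size effects from gluster overhead.

```shell
#!/bin/sh
# Bytes per dd record = total bytes / record count.
block_size() { echo $(( $1 / $2 )); }

# Speed-up for the same byte count = slow elapsed time / fast elapsed time.
speedup() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

block_size 1048576000 1000      # write tests (bs=1M)    -> 1048576
block_size 1048576000 2048000   # copy tests (no bs=)    -> 512
speedup 1.91876 0.686021        # write, gluster vs ext4 -> 2.8
speedup 101.052 3.53263         # copy, first runs       -> 28.6
speedup 92.4904 4.1737          # copy, second runs      -> 22.2
```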

As a test, can we copy data directly to the xfs mountpoint
(/mnt/brick1) and bypass gluster?

Any help you could give us would be appreciated.

Thanks

--

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley                          Email:  phaley@xxxxxxx
Center for Ocean Engineering       Phone:  (617) 253-6824
Dept. of Mechanical Engineering    Fax:    (617) 253-8125
MIT, Room 5-213                    http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users
