On 3/19/2018 5:42 AM, Ondrej Valousek wrote:
Taking NFS and NFS-Ganesha out of the equation, I'm not very impressed with
my own setup either. For the writes it's doing, that's a lot of CPU usage
in top. It seems to be bottlenecked on a single execution core somewhere,
trying to facilitate reads/writes to the other bricks.
Writes to the Gluster FS from one of the participating brick hosts:
[root@nfs01 n]# dd if=/dev/zero of=./some-file.bin
393505+0 records in
393505+0 records out
201474560 bytes (201 MB) copied, 50.034 s, 4.0 MB/s
[root@nfs01 n]#
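A caveat on the dd run above: no bs= was given, so dd fell back to its default 512-byte block size (201474560 bytes / 393505 records = 512). That makes the run measure per-write overhead through the FUSE mount more than raw throughput. A minimal sketch of a larger-block rerun follows; the block size, the count, and the TARGET variable are assumptions for illustration, and the target defaults to a temp file rather than a path on the Gluster mount.

```shell
#!/bin/sh
# Confirm the block size implied by the dd output above.
bytes=201474560
records=393505
block_size=$((bytes / records))
echo "implied block size: ${block_size} bytes"

# Hypothetical scratch target; set TARGET to a file on the Gluster mount
# to test the volume itself. Defaults to a temp file.
target="${TARGET:-$(mktemp)}"

# 16 blocks of 1 MiB; conv=fdatasync makes dd flush the file data to disk
# before it reports the transfer rate (GNU dd).
dd if=/dev/zero of="$target" bs=1M count=16 conv=fdatasync

written=$(wc -c < "$target")
echo "bytes written: ${written}"
rm -f "$target"
```

With TARGET pointing at the Gluster mount, the reported rate can be compared against the 4.0 MB/s figure from the 512-byte run.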
Top results (10-second average); CPU usage won't go over about 32%:
top - 00:49:38 up 21:39, 2 users, load average: 0.42, 0.24, 0.19
Tasks: 164 total, 1 running, 163 sleeping, 0 stopped, 0 zombie
%Cpu0 : 29.3 us, 24.7 sy, 0.0 ni, 45.1 id, 0.0 wa, 0.0 hi, 0.8 si, 0.0 st
%Cpu1 : 27.2 us, 24.1 sy, 0.0 ni, 47.2 id, 0.0 wa, 0.0 hi, 1.5 si, 0.0 st
%Cpu2 : 20.2 us, 13.5 sy, 0.0 ni, 64.1 id, 0.0 wa, 0.0 hi, 2.3 si, 0.0 st
%Cpu3 : 30.0 us, 16.2 sy, 0.0 ni, 47.5 id, 0.0 wa, 0.0 hi, 6.3 si, 0.0 st
KiB Mem : 3881708 total, 3207488 free, 346680 used, 327540 buff/cache
KiB Swap: 4063228 total, 4062828 free, 400 used. 3232208 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1319 root 20 0 819036 12928 4036 S 32.3 0.3 1:19.64 glusterfs
1310 root 20 0 1232428 25636 4364 S 12.1 0.7 0:41.25 glusterfsd
Next, the same write but directly to the brick via XFS, which of course
is faster:
top - 09:45:09 up 1 day, 6:34, 3 users, load average: 0.61, 1.01, 1.04
Tasks: 171 total, 2 running, 169 sleeping, 0 stopped, 0 zombie
%Cpu0 : 0.6 us, 2.1 sy, 0.0 ni, 82.6 id, 14.5 wa, 0.0 hi, 0.2 si, 0.0 st
%Cpu1 : 16.7 us, 83.3 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 0.4 us, 0.9 sy, 0.0 ni, 94.2 id, 4.4 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 1.1 us, 0.6 sy, 0.0 ni, 98.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 3881708 total, 501120 free, 230704 used, 3149884 buff/cache
KiB Swap: 4063228 total, 3876896 free, 186332 used. 3343960 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
14691 root 20 0 107948 608 512 R 25.0 0.0 0:34.29 dd
1334 root 20 0 2694264 61076 2228 S 2.7 1.6 283:55.96 ganesha.nfsd
The result of a dd command directly against the brick FS itself is of
course much better:
[root@nfs01 gv01]# dd if=/dev/zero of=./some-file.bin
5771692+0 records in
5771692+0 records out
2955106304 bytes (3.0 GB) copied, 35.3425 s, 83.6 MB/s
[root@nfs01 gv01]# pwd
/bricks/0/gv01
[root@nfs01 gv01]#
Tried a few tweak options with no effect:
[root@nfs01 glusterfs]# gluster volume info
Volume Name: gv01
Type: Replicate
Volume ID: e5ccc75e-5192-45ac-b410-a34ebd777666
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: nfs01:/bricks/0/gv01
Brick2: nfs02:/bricks/0/gv01
Options Reconfigured:
cluster.server-quorum-type: server
cluster.quorum-type: auto
server.event-threads: 8
client.event-threads: 8
performance.readdir-ahead: on
performance.write-behind-window-size: 8MB
performance.io-thread-count: 16
performance.cache-size: 1GB
nfs.trusted-sync: on
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
[root@nfs01 glusterfs]#
That's despite the fact that I can confirm 90+ MB/s on my 1GbE network.
Thoughts?
--
Cheers,
Tom K.
-------------------------------------------------------------------------------------
Living on earth is expensive, but it includes a free trip around the sun.
Hi,
As I posted in my previous emails, GlusterFS can never match NFS (especially async NFS) on small-file performance or latency. That is inherent in the design.
There is nothing you can do about it.
Ondrej
-----Original Message-----
From: gluster-users-bounces@xxxxxxxxxxx [mailto:gluster-users-bounces@xxxxxxxxxxx] On Behalf Of Rik Theys
Sent: Monday, March 19, 2018 10:38 AM
To: gluster-users@xxxxxxxxxxx; mailinglists@xxxxxxxxxxx
Subject: Re: Gluster very poor performance when copying small files (1x (2+1) = 3, SSD)
Hi,
I've done some similar tests and experience similar performance issues (see my 'gluster for home directories?' thread on the list).
If I read your mail correctly, you are comparing an NFS mount of the brick disk against a gluster mount (using the fuse client)?
Which options do you have set on the NFS export (sync or async)?
From my tests, I concluded that the issue was not bandwidth but latency.
Gluster will only return an IO operation once all bricks have confirmed that the data is on disk. If you are using a fuse mount, the 'direct-io-mode=disable' option on the client might help (I have no experience with this).
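For reference, a sketch of what that client-side mount could look like. The server and volume names are taken from the volume info quoted further down; the mount point is hypothetical, and the script only prints the command unless RUN=1 is set. Whether the option actually helps small-file workloads is untested here.

```shell
#!/bin/sh
# Sketch only: server and volume names come from the thread;
# the mount point is a hypothetical example.
server="gluster-host-01.fqdn"
volume="uat_storage"
mountpoint="/mnt/gluster_test"   # hypothetical

mount_cmd="mount -t glusterfs -o direct-io-mode=disable ${server}:/${volume} ${mountpoint}"

# Print the command by default; run it only when RUN=1 is set
# (needs root and the GlusterFS FUSE client installed).
if [ "${RUN:-0}" = "1" ]; then
    $mount_cmd
else
    echo "$mount_cmd"
fi
```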
In our tests, I've used NFS-ganesha to serve the gluster volume over NFS. This makes things even worse as NFS-ganesha has no "async" mode, which makes performance terrible.
If you find a magic knob to make glusterfs fast on small-file workloads, do let me know!
Regards,
Rik
On 03/18/2018 11:13 PM, Sam McLeod wrote:
Howdy all,
We're experiencing terrible small file performance when copying or
moving files on gluster clients.
In the example below, Gluster takes ~6 minutes to copy 128MB / 21,000
files sideways on a client; doing the same thing on NFS (which I know
is a totally different solution etc. etc.) takes approximately 10-15
seconds(!).
Any advice for tuning the volume or XFS settings would be greatly
appreciated.
Hopefully I've included enough relevant information below.
## Gluster Client
root@gluster-client:/mnt/gluster_perf_test/ # du -sh .
127M .
root@gluster-client:/mnt/gluster_perf_test/ # find . -type f | wc -l
21791
root@gluster-client:/mnt/gluster_perf_test/ # du 9584toto9584.txt
4 9584toto9584.txt
root@gluster-client:/mnt/gluster_perf_test/ # time cp -a private private_perf_test
real 5m51.862s
user 0m0.862s
sys 0m8.334s
root@gluster-client:/mnt/gluster_perf_test/ # time rm -rf private_perf_test/
real 0m49.702s
user 0m0.087s
sys 0m0.958s
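For scale, the timings above can be turned into a per-file cost. The sketch below uses integer shell arithmetic and assumes the private directory holds essentially all 21,791 files counted by find, which the output above does not confirm.

```shell
#!/bin/sh
# Figures copied from the output above.
files=21791
cp_ms=351862     # real 5m51.862s
rm_ms=49702      # real 0m49.702s

# Assumption: cp and rm each touched roughly all $files files.
cp_us_per_file=$((cp_ms * 1000 / files))
rm_us_per_file=$((rm_ms * 1000 / files))
cp_files_per_sec=$((files * 1000 / cp_ms))

echo "cp: ${cp_us_per_file} us per file (~${cp_files_per_sec} files/s)"
echo "rm: ${rm_us_per_file} us per file"
```

Roughly 16 ms per copied file points at per-operation round trips, not bandwidth: 127 MB in about 352 seconds is well under 1 MB/s on a 10Gbit link.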
## Hosts
- 16x Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz per Gluster host /
client
- Storage: iSCSI provisioned (via 10Gbit DAC/Fibre), SSD disk, 50K
R/RW 4k IOP/s, 400MB/s per Gluster host
- Volumes are replicated across two hosts and one arbiter only host
- Networking is 10Gbit DAC/Fibre between Gluster hosts and clients
- 18GB DDR4 ECC memory
## Volume Info
root@gluster-host-01:~ # gluster pool list
UUID                                    Hostname                State
ad02970b-e2aa-4ca8-998c-bd10d5970faa    gluster-host-02.fqdn    Connected
ea116a94-c19e-48db-b108-0be3ae622e2e    gluster-host-03.fqdn    Connected
2e855c25-e7ac-4ff6-be85-e8bcc6f45ee4    localhost               Connected
root@gluster-host-01:~ # gluster volume info uat_storage
Volume Name: uat_storage
Type: Replicate
Volume ID: 7918f1c5-5031-47b8-b054-56f6f0c569a2
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: gluster-host-01.fqdn:/mnt/gluster-storage/uat_storage
Brick2: gluster-host-02.fqdn:/mnt/gluster-storage/uat_storage
Brick3: gluster-host-03.fqdn:/mnt/gluster-storage/uat_storage (arbiter)
Options Reconfigured:
performance.rda-cache-limit: 256MB
network.inode-lru-limit: 50000
server.outstanding-rpc-limit: 256
performance.client-io-threads: true
nfs.disable: on
transport.address-family: inet
client.event-threads: 8
cluster.eager-lock: true
cluster.favorite-child-policy: size
cluster.lookup-optimize: true
cluster.readdir-optimize: true
cluster.use-compound-fops: true
diagnostics.brick-log-level: ERROR
diagnostics.client-log-level: ERROR
features.cache-invalidation-timeout: 600
features.cache-invalidation: true
network.ping-timeout: 15
performance.cache-invalidation: true
performance.cache-max-file-size: 6MB
performance.cache-refresh-timeout: 60
performance.cache-size: 1024MB
performance.io-thread-count: 16
performance.md-cache-timeout: 600
performance.stat-prefetch: true
performance.write-behind-window-size: 256MB
server.event-threads: 8
transport.listen-backlog: 2048
root@gluster-host-01:~ # xfs_info /dev/mapper/gluster-storage-unlocked
meta-data=/dev/mapper/gluster-storage-unlocked isize=512    agcount=4, agsize=196607360 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=786429440, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=8192   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=383998, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
--
Sam McLeod (protoporpoise on IRC)
https://smcleod.net
https://twitter.com/s_mcleod
Words are my own opinions and do not necessarily represent those of my
employer or partners.
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users
--
Rik Theys
System Engineer
KU Leuven - Dept. Elektrotechniek (ESAT) Kasteelpark Arenberg 10 bus 2440 - B-3001 Leuven-Heverlee
+32(0)16/32.11.07
----------------------------------------------------------------
<<Any errors in spelling, tact or fact are transmission errors>>