Hello Joe,
I really appreciate your feedback, but I already tried the opcache route (disabling validation entirely). That of course improves things, but not nearly enough; it's still quite slow.
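To be concrete, by "the opcache stuff" I mean roughly these php.ini settings (values approximate, from memory):

opcache.enable=1
opcache.validate_timestamps=0
opcache.memory_consumption=256
opcache.max_accelerated_files=100000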
With NFS (it doesn't matter much whether it's the built-in NFSv3 or Ganesha NFSv4) I can host the site perfectly fast even without these extreme opcache settings.
I still can't understand why the NFS mount is easily 80 times faster, seemingly no matter what options I set. It's almost as if something is really wrong somewhere...
I also tried a Ceph mount now, and out of the box it's comparable to Gluster over the NFS mount.
Regards
Jo
BE: +32 53 599 000
NL: +31 85 888 4 555
-----Original message-----
From: Joe Julian <joe@xxxxxxxxxxxxxxxx>
Sent: Tue 11-07-2017 17:04
Subject: Re: [Gluster-users] Gluster native mount is really slow compared to nfs
To: gluster-users@xxxxxxxxxxx;
My standard response to someone needing filesystem performance for www traffic is generally, "you're doing it wrong". https://joejulian.name/blog/optimizing-web-performance-with-glusterfs/
That said, you might also look at these mount options: attribute-timeout, entry-timeout, negative-timeout (set to some large amount of time), and fopen-keep-cache.
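For example, something along these lines (the timeout values are just a sketch, in seconds; tune them to whatever staleness you can tolerate):

mount -t glusterfs -o attribute-timeout=600,entry-timeout=600,negative-timeout=600,fopen-keep-cache 192.168.140.41:/www /var/www

or the equivalent options in your fstab entry.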
On 07/11/2017 07:48 AM, Jo Goossens wrote:
Hello,
Here is the volume info as requested by Soumya:
#gluster volume info www

Volume Name: www
Type: Replicate
Volume ID: 5d64ee36-828a-41fa-adbf-75718b954aff
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 192.168.140.41:/gluster/www
Brick2: 192.168.140.42:/gluster/www
Brick3: 192.168.140.43:/gluster/www
Options Reconfigured:
cluster.read-hash-mode: 0
performance.quick-read: on
performance.write-behind-window-size: 4MB
server.allow-insecure: on
performance.read-ahead: disable
performance.readdir-ahead: on
performance.io-thread-count: 64
performance.io-cache: on
performance.client-io-threads: on
server.outstanding-rpc-limit: 128
server.event-threads: 3
client.event-threads: 3
performance.cache-size: 32MB
transport.address-family: inet
nfs.disable: on
nfs.addr-namelookup: off
nfs.export-volumes: on
nfs.rpc-auth-allow: 192.168.140.*
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.stat-prefetch: on
performance.cache-samba-metadata: on
performance.cache-invalidation: on
performance.md-cache-timeout: 600
network.inode-lru-limit: 100000
performance.parallel-readdir: on
performance.cache-refresh-timeout: 60
performance.rda-cache-limit: 50MB
cluster.nufa: on
network.ping-timeout: 5
cluster.lookup-optimize: on
cluster.quorum-type: auto

I started with none of them set and added/changed options while testing, but it was always slow. Tuning some kernel parameters improved things only slightly (just a few percent, nothing reasonable).

I also tried Ceph just to compare. I got this with default settings and no tweaks:

./smallfile_cli.py --top /var/www/test --host-set 192.168.140.41 --threads 8 --files 5000 --file-size 64 --record-size 64
smallfile version 3.0
hosts in test : ['192.168.140.41']
top test directory(s) : ['/var/www/test']
operation : cleanup
files/thread : 5000
threads : 8
record size (KB, 0 = maximum) : 64
file size (KB) : 64
file size distribution : fixed
files per dir : 100
dirs per dir : 10
threads share directories? : N
filename prefix :
filename suffix :
hash file number into dir.? : N
fsync after modify? : N
pause between files (microsec) : 0
finish all requests? : Y
stonewall? : Y
measure response times? : N
verify read? : Y
verbose? : False
log to stderr? : False
ext.attr.size : 0
ext.attr.count : 0
permute host directories? : N
remote program directory : /root/smallfile-master
network thread sync. dir. : /var/www/test/network_shared
starting all threads by creating starting gate file /var/www/test/network_shared/starting_gate.tmp
host = 192.168.140.41,thr = 00,elapsed = 1.339621,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 01,elapsed = 1.436776,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 02,elapsed = 1.498681,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 03,elapsed = 1.483886,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 04,elapsed = 1.454833,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 05,elapsed = 1.469340,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 06,elapsed = 1.439060,files = 5000,records = 0,status = ok
host = 192.168.140.41,thr = 07,elapsed = 1.375074,files = 5000,records = 0,status = ok
total threads = 8
total files = 40000
100.00% of requested files processed, minimum is 70.00
1.498681 sec elapsed time
26690.134975 files/sec
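(For reference, the files/sec figure smallfile reports is just the total number of files divided by the slowest thread's elapsed time: 40000 / 1.498681 ≈ 26690 in this run.)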
Regards
Jo
-----Original message-----
From: Jo Goossens <jo.goossens@xxxxxxxxxxxxxxxx>
Sent: Tue 11-07-2017 12:15
Subject: Re: [Gluster-users] Gluster native mount is really slow compared to nfs
To: Soumya Koduri <skoduri@xxxxxxxxxx>; gluster-users@xxxxxxxxxxx;
CC: Ambarish Soman <asoman@xxxxxxxxxx>;
Hello,
Here is a speed test on a new setup we just made with Gluster 3.10. There are no other differences except GlusterFS versus NFS, and NFS is about 80 times faster:
root@app1:~/smallfile-master# mount -t glusterfs -o use-readdirp=no,log-level=WARNING,log-file=/var/log/glusterxxx.log 192.168.140.41:/www /var/www
root@app1:~/smallfile-master# ./smallfile_cli.py --top /var/www/test --host-set 192.168.140.41 --threads 8 --files 500 --file-size 64 --record-size 64
smallfile version 3.0
hosts in test : ['192.168.140.41']
top test directory(s) : ['/var/www/test']
operation : cleanup
files/thread : 500
threads : 8
record size (KB, 0 = maximum) : 64
file size (KB) : 64
file size distribution : fixed
files per dir : 100
dirs per dir : 10
threads share directories? : N
filename prefix :
filename suffix :
hash file number into dir.? : N
fsync after modify? : N
pause between files (microsec) : 0
finish all requests? : Y
stonewall? : Y
measure response times? : N
verify read? : Y
verbose? : False
log to stderr? : False
ext.attr.size : 0
ext.attr.count : 0
permute host directories? : N
remote program directory : /root/smallfile-master
network thread sync. dir. : /var/www/test/network_shared
starting all threads by creating starting gate file /var/www/test/network_shared/starting_gate.tmp
host = 192.168.140.41,thr = 00,elapsed = 68.845450,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 01,elapsed = 67.601088,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 02,elapsed = 58.677994,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 03,elapsed = 65.901922,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 04,elapsed = 66.971720,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 05,elapsed = 71.245102,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 06,elapsed = 67.574845,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 07,elapsed = 54.263242,files = 500,records = 0,status = ok
total threads = 8
total files = 4000
100.00% of requested files processed, minimum is 70.00
71.245102 sec elapsed time
56.144211 files/sec

umount /var/www
root@app1:~/smallfile-master# mount -t nfs -o tcp 192.168.140.41:/www /var/www
root@app1:~/smallfile-master# ./smallfile_cli.py --top /var/www/test --host-set 192.168.140.41 --threads 8 --files 500 --file-size 64 --record-size 64
smallfile version 3.0
hosts in test : ['192.168.140.41']
top test directory(s) : ['/var/www/test']
operation : cleanup
files/thread : 500
threads : 8
record size (KB, 0 = maximum) : 64
file size (KB) : 64
file size distribution : fixed
files per dir : 100
dirs per dir : 10
threads share directories? : N
filename prefix :
filename suffix :
hash file number into dir.? : N
fsync after modify? : N
pause between files (microsec) : 0
finish all requests? : Y
stonewall? : Y
measure response times? : N
verify read? : Y
verbose? : False
log to stderr? : False
ext.attr.size : 0
ext.attr.count : 0
permute host directories? : N
remote program directory : /root/smallfile-master
network thread sync. dir. : /var/www/test/network_shared
starting all threads by creating starting gate file /var/www/test/network_shared/starting_gate.tmp
host = 192.168.140.41,thr = 00,elapsed = 0.962424,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 01,elapsed = 0.942673,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 02,elapsed = 0.940622,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 03,elapsed = 0.915218,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 04,elapsed = 0.934349,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 05,elapsed = 0.922466,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 06,elapsed = 0.954381,files = 500,records = 0,status = ok
host = 192.168.140.41,thr = 07,elapsed = 0.946127,files = 500,records = 0,status = ok
total threads = 8
total files = 4000
100.00% of requested files processed, minimum is 70.00
0.962424 sec elapsed time
4156.173189 files/sec
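Putting the two runs side by side: 56.14 files/sec on the native glusterfs mount versus 4156.17 files/sec over NFS, i.e. roughly a 74x difference, which is where my "about 80 times" figure comes from.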
-----Original message-----
From: Jo Goossens <jo.goossens@xxxxxxxxxxxxxxxx>
Sent: Tue 11-07-2017 11:26
Subject: Re: [Gluster-users] Gluster native mount is really slow compared to nfs
To: gluster-users@xxxxxxxxxxx; Soumya Koduri <skoduri@xxxxxxxxxx>;
CC: Ambarish Soman <asoman@xxxxxxxxxx>;
Hi all,
One more thing: we have 3 app servers with Gluster on them, replicated across 3 different Gluster nodes (so the Gluster nodes are app servers at the same time). We could actually almost work locally if we didn't need the same files on all 3 nodes plus redundancy :)
Initial cluster was created like this:
gluster volume create www replica 3 transport tcp 192.168.140.41:/gluster/www 192.168.140.42:/gluster/www 192.168.140.43:/gluster/www force
gluster volume set www network.ping-timeout 5
gluster volume set www performance.cache-size 1024MB
gluster volume set www nfs.disable on # No need for NFS currently
gluster volume start www

To my understanding it still wouldn't explain why nfs has such great performance compared to native ...
Regards
Jo
-----Original message-----
From: Soumya Koduri <skoduri@xxxxxxxxxx>
Sent: Tue 11-07-2017 11:16
Subject: Re: [Gluster-users] Gluster native mount is really slow compared to nfs
To: Jo Goossens <jo.goossens@xxxxxxxxxxxxxxxx>; gluster-users@xxxxxxxxxxx;
CC: Ambarish Soman <asoman@xxxxxxxxxx>; Karan Sandha <ksandha@xxxxxxxxxx>;
+ Ambarish
On 07/11/2017 02:31 PM, Jo Goossens wrote:
> Hello,
>
>
>
>
>
> We tried tons of settings to get a php app running on a native gluster
> mount:
>
>
>
> e.g.: 192.168.140.41:/www /var/www glusterfs
> defaults,_netdev,backup-volfile-servers=192.168.140.42:192.168.140.43,direct-io-mode=disable
> 0 0
>
>
>
> I tried some mount variants in order to speed up things without luck.
>
>
>
>
>
> After that I tried NFS (native Gluster NFSv3 and Ganesha NFSv4); the
> performance difference was crazy.
>
>
>
> e.g.: 192.168.140.41:/www /var/www nfs4 defaults,_netdev 0 0
>
>
>
> I tried a test like this to confirm the slowness:
>
>
>
> ./smallfile_cli.py --top /var/www/test --host-set 192.168.140.41
> --threads 8 --files 5000 --file-size 64 --record-size 64
>
> This test finished in around 1.5 seconds with NFS and in more than 250
> seconds without nfs (can't remember exact numbers, but I reproduced it
> several times for both).
>
> With the native gluster mount the php app had loading times of over 10
> seconds; with the nfs mount the php app loaded in around 1 second at most,
> often even less. (reproduced several times)
>
>
>
> I tried all kinds of performance settings and variants of these, but nothing
> helped; the difference stayed huge. Here are some of the settings I
> played with, in random order:
>
Request Ambarish & Karan (cc'ed who have been working on evaluating
performance of various access protocols gluster supports) to look at the
below settings and provide inputs.
Thanks,
Soumya
>
>
> gluster volume set www features.cache-invalidation on
> gluster volume set www features.cache-invalidation-timeout 600
> gluster volume set www performance.stat-prefetch on
> gluster volume set www performance.cache-samba-metadata on
> gluster volume set www performance.cache-invalidation on
> gluster volume set www performance.md-cache-timeout 600
> gluster volume set www network.inode-lru-limit 250000
>
> gluster volume set www performance.cache-refresh-timeout 60
> gluster volume set www performance.read-ahead disable
> gluster volume set www performance.readdir-ahead on
> gluster volume set www performance.parallel-readdir on
> gluster volume set www performance.write-behind-window-size 4MB
> gluster volume set www performance.io-thread-count 64
>
> gluster volume set www performance.client-io-threads on
>
> gluster volume set www performance.cache-size 1GB
> gluster volume set www performance.quick-read on
> gluster volume set www performance.flush-behind on
> gluster volume set www performance.write-behind on
> gluster volume set www nfs.disable on
>
> gluster volume set www client.event-threads 3
> gluster volume set www server.event-threads 3
>
>
>
>
>
>
> The NFS HA adds a lot of complexity which we wouldn't need at all in our
> setup. Could you please explain what is going on here? Is NFS the only
> solution to get acceptable performance? Did I perhaps miss one crucial
> setting?
>
>
>
> We're really desperate, thanks a lot for your help!
>
>
>
>
>
> PS: We tried with gluster 3.11 and 3.8 on Debian, both had terrible
> performance when not used with nfs.
>
>
>
>
>
>
>
> Kind regards
>
> Jo Goossens
>
>
>
>
>
>
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@xxxxxxxxxxx
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users