Re: Performance is falling rapidly when updating from v5.5 to v7.0

RAFI KC <rkavunga@xxxxxxxxxx> · Wed, 6 Nov 2019 15:46:30 +0530

    On 11/6/19 3:42 PM, David Spisla wrote:

        Hello Rafi,

        I tried to set the xattr via

        setfattr -n trusted.io-stats-dump -v '/tmp/iostat.log' /gluster/repositories/repo1/

        but it had no effect. There is no such xattr visible via getfattr
          and no logfile. The command setxattr is not available. What am
          I doing wrong?

    I will check it out and get back to you.

        By the way, do you mean to increase the inode size of the XFS
          layer from 512 bytes to 1024 KB(!)? I think it should be 1024
          bytes, because 2048 bytes is the maximum.

    It was a typo, I meant to set it to 1024 bytes, sorry for that.

        Regards
        David

        On Wed, 6 Nov 2019 at 04:10, RAFI KC <rkavunga@xxxxxxxxxx> wrote:

            I will take a look at the profile info you shared. Since
              there is a huge difference in the performance numbers
              between FUSE and Samba, it would be great if we could get
              the profile info for FUSE (on v7). This will help to
              compare the number of calls for each fop. There should be
              some fops that Samba repeats, and we can find them by
              comparing with FUSE.
            Also if possible, can you please get client profile info
              from fuse mount using the command `setxattr -n
              trusted.io-stats-dump -v <logfile /tmp/iostat.log>
              </mnt/fuse(mount point)>`.
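
A minimal sketch of the same request issued from Python instead of the command line, assuming a Linux client where `os.setxattr` is available. The xattr key and the example paths come from the mail above; the helper name `dump_client_profile` and its injectable `setxattr` parameter are hypothetical, added only so the call can be dry-run without a real mount.

```python
import os

# xattr key named in the mail above.
XATTR_NAME = "trusted.io-stats-dump"


def dump_client_profile(mount_point, logfile, setxattr=os.setxattr):
    """Set the io-stats dump xattr on a GlusterFS FUSE mount point.

    mount_point: path of the FUSE mount, e.g. "/mnt/fuse" (placeholder).
    logfile:     value sent with the xattr, e.g. "/tmp/iostat.log".
    setxattr:    hypothetical hook for dry runs; defaults to os.setxattr.
    """
    # os.setxattr expects the value as bytes.
    setxattr(mount_point, XATTR_NAME, logfile.encode())


# Usage on a real client (needs root, since the trusted.* xattr
# namespace is restricted to privileged processes):
#   dump_client_profile("/mnt/fuse", "/tmp/iostat.log")
```

On the shell, the tool from the attr package that sets extended attributes is `setfattr`; whether the dump then appears at exactly the given path is something to verify on the installed version.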

            Regards
            Rafi KC

            On 11/5/19 11:05 PM, David Spisla wrote:

                I ran the test with Gluster 7.0 and ctime disabled, but
                  it had no effect:

                  (All values in MiB/s)

                  64KiB     1MiB     10MiB
                  0.16      2.60     54.74

                  Attached is the complete profile file, now
                    including the results from the last test. I will not
                    repeat it with a higher inode size because I don't
                    think this will have an effect.
                  There must be another cause for the low
                    performance.

            Yes. No need to try with a higher inode size.

                  Regards
                  David Spisla

                On Tue, 5 Nov 2019 at 16:25, David Spisla <spisla80@xxxxxxxxx> wrote:

                      On Tue, 5 Nov 2019 at 12:06, RAFI KC <rkavunga@xxxxxxxxxx> wrote:

                          On 11/4/19 8:46 PM, David Spisla wrote:

                                  Dear Gluster Community,

                                  I also have an issue concerning
                                    performance. Over the last few days I
                                    updated our test cluster from GlusterFS
                                    v5.5 to v7.0. The setup in general:

                                  2 HP DL380 servers with 10Gbit
                                    NICs, 1 Distribute-Replica 2 volume
                                    with 2 replica pairs. The client is SMB
                                    Samba (access via vfs_glusterfs). I
                                    did several tests to ensure that
                                    Samba itself doesn't cause the drop.
                                  The setup is completely the same
                                    except for the Gluster version.

                                  Here are my results (values in MiB/s):

                                  Filesize           64KiB     1MiB     10MiB
                                  GlusterFS v5.5      3.49    47.41    300.50
                                  GlusterFS v7.0      0.16     2.61     76.63

                          Can you please share the profile
                            information [1] for both versions? Also, it
                            would be really helpful if you could describe
                            the I/O patterns used for these tests.

                          [1] : https://docs.gluster.org/en/latest/Administrator%20Guide/Monitoring%20Workload/
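
The server-side profile requested here is collected with the `gluster volume profile <VOLNAME> start|info|stop` commands described in [1]. The sketch below only wraps those three actions; the helper names `profile_command` and `run_profile` are hypothetical, and the volume name `archive1` is the one given later in this thread.

```python
import subprocess

PROFILE_ACTIONS = ("start", "info", "stop")


def profile_command(volume, action):
    """Build the argv for `gluster volume profile <volume> <action>`."""
    if action not in PROFILE_ACTIONS:
        raise ValueError(f"unsupported profile action: {action}")
    return ["gluster", "volume", "profile", volume, action]


def run_profile(volume, action, run=subprocess.run):
    """Run one profile action on a server node and return its stdout.

    `run` is injectable so the helper can be exercised without a cluster.
    """
    result = run(profile_command(volume, action),
                 capture_output=True, text=True, check=True)
    return result.stdout


# Typical sequence on one storage node (assumed workflow):
#   run_profile("archive1", "start")
#   ... run the fio workload from the client ...
#   print(run_profile("archive1", "info"))
#   run_profile("archive1", "stop")
```

Collecting `info` once per Gluster version with the same workload gives two per-fop call counts and latencies that can be compared side by side.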

                      Hello Rafi,

                      thank you for your help.

                      * First, more information about the I/O
                        patterns: As a client we use a DL360 Windows
                        Server 2017 machine with a 10Gbit NIC connected to
                        the storage machines. The share is mounted
                        via SMB and the tests write with fio. We use
                        these job files (see attachment). Each job file
                        is executed separately, and there is a sleep
                        of about 60 s between test runs to let the
                        system calm down before starting a new test.

                      * Attached below you will find the profile output
                        from the tests with v5.5 (ctime enabled) and v7.0
                        (ctime enabled).

                        * Besides the tests with Samba, I also ran
                          some fio tests directly on the FUSE mounts
                          (locally on one of the storage nodes). The
                          results show that there is only a small
                          decrease in performance between v5.5 and v7.0.
                        (All values in MiB/s)

                        Filesize    64KiB     1MiB      10MiB
                        v5.5        50.09    679.96    1023.02
                        v7.0        47.00    656.46     977.60

                        It seems that the combination of
                          Samba + Gluster 7.0 has a lot of problems,
                          doesn't it?
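
To put numbers on that impression, the short calculation below divides the v5.5 throughput by the v7.0 throughput for each file size. The FUSE values are the ones just above; the SMB values are taken from the original post quoted further down in this thread. The dictionary layout and the function name `slowdown` are only illustrative.

```python
# Throughput in MiB/s as reported in this thread, keyed by file size.
SMB = {
    "v5.5": {"64KiB": 3.49, "1MiB": 47.41, "10MiB": 300.50},
    "v7.0": {"64KiB": 0.16, "1MiB": 2.61, "10MiB": 76.63},
}
FUSE = {
    "v5.5": {"64KiB": 50.09, "1MiB": 679.96, "10MiB": 1023.02},
    "v7.0": {"64KiB": 47.00, "1MiB": 656.46, "10MiB": 977.60},
}


def slowdown(results):
    """Return the v5.5 / v7.0 throughput ratio per file size."""
    old, new = results["v5.5"], results["v7.0"]
    return {size: old[size] / new[size] for size in old}


for name, results in (("SMB", SMB), ("FUSE", FUSE)):
    factors = ", ".join(f"{size}: {factor:.2f}x"
                        for size, factor in slowdown(results).items())
    print(f"{name}: {factors}")
```

Over SMB the slowdown is roughly 22x, 18x and 4x for 64KiB, 1MiB and 10MiB files, while on the FUSE mount all three factors stay between about 1.04x and 1.07x.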

                                      We use this volume options
                                        (GlusterFS 7.0):

                                      Volume Name: archive1
                                      Type: Distributed-Replicate
                                      Volume ID: 44c17844-0bd4-4ca2-98d8-a1474add790c
                                      Status: Started
                                      Snapshot Count: 0
                                      Number of Bricks: 2 x 2 = 4
                                      Transport-type: tcp
                                      Bricks:
                                      Brick1: fs-dl380-c1-n1:/gluster/brick1/glusterbrick
                                      Brick2: fs-dl380-c1-n2:/gluster/brick1/glusterbrick
                                      Brick3: fs-dl380-c1-n1:/gluster/brick2/glusterbrick
                                      Brick4: fs-dl380-c1-n2:/gluster/brick2/glusterbrick
                                      Options Reconfigured:
                                      performance.client-io-threads: off
                                      nfs.disable: on
                                      storage.fips-mode-rchecksum: on
                                      transport.address-family: inet
                                      user.smb: disable
                                      features.read-only: off
                                      features.worm: off
                                      features.worm-file-level: on
                                      features.retention-mode: enterprise
                                      features.default-retention-period: 120
                                      network.ping-timeout: 10
                                      features.cache-invalidation: on
                                      features.cache-invalidation-timeout: 600
                                      performance.nl-cache: on
                                      performance.nl-cache-timeout: 600
                                      client.event-threads: 32
                                      server.event-threads: 32
                                      cluster.lookup-optimize: on
                                      performance.stat-prefetch: on
                                      performance.cache-invalidation: on
                                      performance.md-cache-timeout: 600
                                      performance.cache-samba-metadata: on
                                      performance.cache-ima-xattrs: on
                                      performance.io-thread-count: 64
                                      cluster.use-compound-fops: on
                                      performance.cache-size: 512MB
                                      performance.cache-refresh-timeout: 10
                                      performance.read-ahead: off
                                      performance.write-behind-window-size: 4MB
                                      performance.write-behind: on
                                      storage.build-pgfid: on
                                      features.ctime: on
                                      cluster.quorum-type: fixed
                                      cluster.quorum-count: 1
                                      features.bitrot: on
                                      features.scrub: Active
                                      features.scrub-freq: daily

                                  For GlusterFS 5.5 it's nearly the
                                    same, except that there were
                                    2 options to enable the ctime feature.

                            Ctime stores additional metadata as
                            extended attributes, which sometimes
                            exceed the default inode size. In such
                            cases the additional xattrs won't fit
                            into the inode itself, so extra blocks
                            are used to store them, which will affect
                            the latency. How much depends purely on
                            the I/O operations and the total size of
                            the xattrs stored in the inode.

                            Is it possible for you to repeat the test by
                            disabling ctime or increasing the inode size
                            to a higher value say 1024KB?
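
Before repeating the test it can help to confirm the inode size the bricks currently use. The sketch below extracts the `isize=` field that `xfs_info` prints for a mounted XFS filesystem. The function name is hypothetical and the sample line is illustrative, not output from the cluster in this thread. The inode size is chosen when the filesystem is created (`mkfs.xfs -i size=<bytes>`), so a larger value such as 1024 bytes (the figure later confirmed in this thread) would mean re-creating the brick filesystem.

```python
import re


def xfs_inode_size(xfs_info_output):
    """Return the inode size in bytes found in `xfs_info` output, or None."""
    match = re.search(r"\bisize=(\d+)", xfs_info_output)
    return int(match.group(1)) if match else None


# Illustrative first line of `xfs_info <brick mount point>`; the device
# name and the other values are made up for this example.
sample = "meta-data=/dev/sdb1    isize=512    agcount=4, agsize=6553600 blks"
print(xfs_inode_size(sample))
```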

                      I will do so, but today I could not finish the
                        tests with ctime disabled (or a higher inode
                        size) because they take a lot of time with v7.0
                        due to the low performance. I will run them
                        tomorrow and send you the results as soon as
                        possible.
                      By the way: do you really mean an inode size of
                        1024 KB on the XFS layer? Or do you mean 1024
                        bytes? We use 512 bytes by default, because this
                        has been the recommended size until now. But it
                        seems there is a need for a new recommendation
                        when the ctime feature is used by default. I
                        cannot imagine that this is the real cause of
                        the low performance, because in v5.5 we also use
                        the ctime feature with an inode size of 512 bytes.

                      Regards
                      David

                                  Our optimization for Samba looks
                                    like this (for every version):

                                  [global]
                                    workgroup = SAMBA
                                    netbios name = CLUSTER
                                    kernel share modes = no
                                    aio read size = 1
                                    aio write size = 1
                                    kernel oplocks = no
                                    max open files = 100000
                                    nt acl support = no
                                    security = user
                                    server min protocol = SMB2
                                    store dos attributes = no
                                    strict locking = no
                                    full_audit:failure = pwrite_send pwrite_recv pwrite offload_write_send offload_write_recv create_file open unlink connect disconnect rename chown fchown lchown chmod fchmod mkdir rmdir ntimes ftruncate fallocate
                                    full_audit:success = pwrite_send pwrite_recv pwrite offload_write_send offload_write_recv create_file open unlink connect disconnect rename chown fchown lchown chmod fchmod mkdir rmdir ntimes ftruncate fallocate
                                    full_audit:facility = local5
                                    durable handles = yes
                                    posix locking = no
                                    log level = 2
                                    max log size = 100000
                                    debug pid = yes

                                  What could be the cause of this
                                    rapid drop in performance for
                                    small files? Are some of our volume
                                    options no longer recommended?

                                  There were some patches
                                    concerning performance for small
                                    files in v6.0 and v7.0:

                                    #1670031: performance regression seen with smallfile workload tests
                                    #1659327: 43% regression in small-file sequential read performance

                                  And one patch for the io-cache:

                                    #1659869: improvements to io-cache

                                  Regards
                                  David Spisla

                            ________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/118564314

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/118564314

Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
