I’m not sure I fully understand your answer.
I tried setting inode-lru-limit to 1, but I cannot see any positive effect.
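For reference, here is roughly how I set it (a sketch; I am assuming the option meant here is the network.inode-lru-limit volume option and that vol_home is the affected volume):
# gluster volume set vol_home network.inode-lru-limit 1
# gluster volume info vol_home     # the new value should appear under "Options Reconfigured"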
When I re-run the ddt application, I notice two kinds of messages:
[2015-08-07 21:29:21.792156] W [marker-quota.c:3379:_mq_initiate_quota_txn] 0-vol_home-marker: parent is NULL for <gfid:5a32328a-7fd9-474e-9bc6-cafde9c41af7>, aborting updation txn
[2015-08-07 21:29:21.792176] W [marker-quota.c:3379:_mq_initiate_quota_txn] 0-vol_home-marker: parent is NULL for <gfid:5a32328a-7fd9-474e-9bc6-cafde9c41af7>, aborting updation txn
and/or:
[2015-08-07 21:44:19.279971] E [marker-quota.c:2990:mq_start_quota_txn_v2] 0-vol_home-marker: contribution node list is empty (31d7bf88-b63a-4731-a737-a3dce73b8cd1)
[2015-08-07 21:41:26.177095] E [dict.c:1418:dict_copy_with_ref] (-->/usr/lib64/glusterfs/3.7.3/xlator/protocol/server.so(server_resolve_inode+0x60) [0x7f85e9a6a410] -->/usr/lib64/glusterfs/3.7.3/xlator/protocol/server.so(resolve_gfid+0x88) [0x7f85e9a6a188] -->/usr/lib64/libglusterfs.so.0(dict_copy_with_ref+0xa4) [0x3e99c20674] ) 0-dict: invalid argument: dict [Argument invalide]
And what about the poor I/O performance?
[letessier@node031 ~]$ ddt -t 35g /home/admin_team/letessier/
Writing to /home/admin_team/letessier/ddt.25259 ... syncing ... done.
sleeping 10 seconds ... done.
Reading from /home/admin_team/letessier/ddt.25259 ... done.
      35840MiB    KiB/s  CPU%
Write            277451     3
Read             188682     1
[letessier@node031 ~]$ logout
[root@node031 ~]# ddt -t 35g /home/
Writing to /home/ddt.25559 ... syncing ... done.
sleeping 10 seconds ... done.
Reading from /home/ddt.25559 ... done.
      35840MiB    KiB/s  CPU%
Write            196539     2
Read             438944     3
Notice the read/write throughput differences when I’m root and when I’m a simple user.
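As a side note, a plain dd run can be used to cross-check these numbers independently of ddt (only a rough sketch; the file path and size below are arbitrary examples, not the ddt tool itself):
[root@node031 ~]# dd if=/dev/zero of=/home/dd-check.bin bs=1M count=35840 conv=fdatasync   # sequential write, flushed before timing ends
[root@node031 ~]# echo 3 > /proc/sys/vm/drop_caches                                        # drop the client page cache before the read pass
[root@node031 ~]# dd if=/home/dd-check.bin of=/dev/null bs=1M                              # sequential read
[root@node031 ~]# rm -f /home/dd-check.bin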
Thanks. Geoffrey
------------------------------------------------------ Geoffrey Letessier Responsable informatique & ingénieur système UPR 9080 - CNRS - Laboratoire de Biochimie Théorique Institut de Biologie Physico-Chimique 13, rue Pierre et Marie Curie - 75005 Paris Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx
On Friday 07 August 2015 05:34 PM, Geoffrey Letessier wrote:
Hi Vijay,
My brick-log issue and the big performance problem began when I upgraded GlusterFS to version 3.7.3; before that, write throughput was good enough (~500MB/s) -but not as good as with GlusterFS 3.5.3 (especially with distributed volumes)- and I didn't notice these problems in the brick logs.
OK… live update: I just disabled quota for my home volume and now my performance appears to be somewhat better (around 300MB/s), but I still see the logs (on storage1 and its replica storage2) growing with only this kind of line:
[2015-08-07 11:16:51.746142] E [dict.c:1418:dict_copy_with_ref] (-->/usr/lib64/glusterfs/3.7.3/xlator/protocol/server.so(server_resolve_inode+0x60) [0x7f85e9a6a410] -->/usr/lib64/glusterfs/3.7.3/xlator/protocol/server.so(resolve_gfid+0x88) [0x7f85e9a6a188] -->/usr/lib64/libglusterfs.so.0(dict_copy_with_ref+0xa4) [0x3e99c20674] ) 0-dict: invalid argument: dict [Argument invalide]
We have root-caused the log issue; bug# 1244613 tracks it.
After a few minutes, my write throughput now seems correct (~550MB/s), but the logs are still growing (not to say exploding). So one part of the problem seems to originate in the quota management system.
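For the record, the quota toggling mentioned above is done with the standard quota commands, roughly like this (a sketch; vol_home is the volume name taken from the log entries):
# gluster volume quota vol_home disable    # turn quota off for the volume
# gluster volume quota vol_home enable     # turn quota back on
As far as I know, re-enabling quota starts a crawl to rebuild the accounting, which can itself generate load for a while and skew the numbers just after re-enabling.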
… after a few more minutes (and still with only 1 client connected), now it is the read operation which is very, very slow… -I'm going crazy! :/-
# ddt -t 50g /home/
Writing to /home/ddt.11293 ... syncing ... done.
sleeping 10 seconds ... done.
Reading from /home/ddt.11293 ... done.
      35840MiB    KiB/s  CPU%
Write            568201     5
Read             567008     4

# ddt -t 50g /home/
Writing to /home/ddt.11397 ... syncing ... done.
sleeping 10 seconds ... done.
Reading from /home/ddt.11397 ... done.
      51200MiB    KiB/s  CPU%
Write            573631     5
Read             164716     1
and my logs are still exploding…
After re-enabling quota on my volume:
# ddt -t 50g /home/
Writing to /home/ddt.11817 ... syncing ... done.
sleeping 10 seconds ... done.
Reading from /home/ddt.11817 ... done.
      51200MiB    KiB/s  CPU%
Write            269608     3
Read             160219     1
Thanks
Geoffrey
Hi Geoffrey,
Some performance improvements have been made to quota in glusterfs-3.7.3.
Could you upgrade to glusterfs-3.7.3 and see if this helps?
Thanks,
Vijay
On Friday 07 August 2015 05:02 AM, Geoffrey Letessier wrote:
Hi,
Any idea to help me fix this issue? (huge logs, write performance divided by 4, etc.)
For comparison, here are two volumes:
- home: distributed on 4 bricks / 2 nodes (and replicated on 4 other bricks / 2 other nodes):
# ddt -t 35g /home
Writing to /home/ddt.24172 ... syncing ... done.
sleeping 10 seconds ... done.
Reading from /home/ddt.24172 ... done.
      33792MiB    KiB/s  CPU%
Write            103659     1
Read             391955     3
- workdir: distributed on 4 bricks / 2 nodes (on the same RAID volumes and servers as home):
# ddt -t 35g /workdir
Writing to /workdir/ddt.24717 ... syncing ... done.
sleeping 10 seconds ... done.
Reading from /workdir/ddt.24717 ... done.
      35840MiB    KiB/s  CPU%
Write            738314     4
Read             536497     4
For information, with the previous 3.5.3-2 version I obtained roughly 1.1GB/s on the workdir volume and ~550-600MB/s on home.
All my tests (cp, rsync, etc.) give me the same result (write throughput between 100MB/s and 150MB/s).
Thanks.
Geoffrey
Hello,
In addition, knowing that I reactivated logging (brick-log-level = INFO instead of CRITICAL) only for the duration of the file creation (i.e. a few minutes), have you noticed the log sizes and the number of lines in them:
# ls -lh storage*
-rw------- 1 letessier staff  18M  5 aoû 00:54 storage1__export-brick_home-brick1-data.log
-rw------- 1 letessier staff 2,1K  5 aoû 00:54 storage1__export-brick_home-brick2-data.log
-rw------- 1 letessier staff  15M  5 aoû 00:56 storage2__export-brick_home-brick1-data.log
-rw------- 1 letessier staff 2,1K  5 aoû 00:54 storage2__export-brick_home-brick2-data.log
-rw------- 1 letessier staff  47M  5 aoû 00:55 storage3__export-brick_home-brick1-data.log
-rw------- 1 letessier staff 2,1K  5 aoû 00:54 storage3__export-brick_home-brick2-data.log
-rw------- 1 letessier staff  47M  5 aoû 00:55 storage4__export-brick_home-brick1-data.log
-rw------- 1 letessier staff 2,1K  5 aoû 00:55 storage4__export-brick_home-brick2-data.log
# wc -l storage*
  55381 storage1__export-brick_home-brick1-data.log
     17 storage1__export-brick_home-brick2-data.log
  41636 storage2__export-brick_home-brick1-data.log
     17 storage2__export-brick_home-brick2-data.log
 270360 storage3__export-brick_home-brick1-data.log
     17 storage3__export-brick_home-brick2-data.log
 270358 storage4__export-brick_home-brick1-data.log
     17 storage4__export-brick_home-brick2-data.log
 637803 total
If I leave brick-log-level at INFO, the brick log files on each server will consume all of my /var partition capacity within only a few hours or days…
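For reference, the brick-log-level toggling mentioned above corresponds to the diagnostics.brick-log-level volume option, set along these lines (a sketch; vol_home is assumed to be the affected volume):
# gluster volume set vol_home diagnostics.brick-log-level INFO       # verbose, to capture the errors
# gluster volume set vol_home diagnostics.brick-log-level CRITICAL   # quiet again, to protect /var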
Thanks in advance,
Geoffrey
Hello,
Since the problem mentioned previously (all the errors noticed in the brick log files), I have seen very, very bad performance: my write performance is divided by 4 compared to before -knowing it was not that good to begin with.
Now, when writing a 33GB file, my write throughput is around 150MB/s (over InfiniBand); before, it was around 550-600MB/s, and this with both the RDMA and TCP protocols.
During this test, more than 40,000 error lines (like the following) were added to the brick log files.
[2015-08-04 22:34:27.337622] E [dict.c:1418:dict_copy_with_ref] (-->/usr/lib64/glusterfs/3.7.3/xlator/protocol/server.so(server_resolve_inode+0x60) [0x7f021c6f7410] -->/usr/lib64/glusterfs/3.7.3/xlator/protocol/server.so(resolve_gfid+0x88) [0x7f021c6f7188] -->/usr/lib64/libglusterfs.so.0(dict_copy_with_ref+0xa4) [0x7f0229cba674] ) 0-dict: invalid argument: dict [Argument invalide]
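In case it is useful, that count can be reproduced with a simple grep on each server (a sketch, assuming the brick logs live in the default /var/log/glusterfs/bricks directory):
# grep -c "dict_copy_with_ref" /var/log/glusterfs/bricks/*.log    # error-line count per brick log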
All brick log files are attached.
Thanks in advance for all your help and for a fix,
Best,
Geoffrey
PS: question: is it possible to easily downgrade GlusterFS from 3.7 to a previous version (for example, v3.5)?
<bricks-logs.tgz>