Re: Quota issue

Geoffrey Letessier <geoffrey.letessier@xxxxxxx> · Thu, 11 Jun 2015 14:13:30 +0200

Hi Vijay,
Could you take some time to look at this? I found only one report related to my issues in the Red Hat Bugzilla (https://bugzilla.redhat.com/show_bug.cgi?id=917901). My storage and computing clusters are still in production, so I wonder whether I should warn my community about a necessary production break, or whether I can apply a fix during production (i.e. without updating the GlusterFS version on my storage cluster).

Thanks in advance,
Geoffrey

------------------------------------------------------
Geoffrey Letessier
IT manager & systems engineer
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx

On 10 June 2015 at 06:12, Vijaikumar M <vmallika@xxxxxxxxxx> wrote:

    Hi Geoffrey,

    Please grep the log files for 'ERROR'; those lines alone
      would be sufficient.
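
The filtering suggested above can be done as a single streaming pass, which matters when each log is several gigabytes (as reported in the quoted message below). This is a minimal sketch in Python, roughly equivalent to a plain `grep ERROR`; the function name, file names and sample log lines are hypothetical and are not taken from the quota-verify script.

```python
import tempfile
from pathlib import Path


def extract_error_lines(log_path, out_path, needle="ERROR"):
    """Copy only the lines containing `needle` from log_path to out_path.

    The log is read line by line, so memory use stays small even for
    multi-gigabyte files. Returns the number of lines kept.
    """
    kept = 0
    with open(log_path, "r", errors="replace") as src, open(out_path, "w") as dst:
        for line in src:
            if needle in line:
                dst.write(line)
                kept += 1
    return kept


if __name__ == "__main__":
    # Tiny demo on made-up log content (not real quota-verify output).
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "brick_name.log"
        log.write_text("ok line\nERROR: made-up message\nanother ok line\n")
        out = Path(tmp) / "brick_name.errors.log"
        print(extract_error_lines(log, out), "matching line(s)")
```

The match is a case-sensitive substring test, so the filtered file should be far smaller and easier to send than the full log.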

    Thanks,

    Vijay

    On Wednesday 10 June 2015 04:38 AM,
      Geoffrey Letessier wrote:

      Hello Vijay,

      Quota-verify has been running for many hours now (more than 10),
        and the output files (4 files, because there are 4 bricks per
        replica) are huge: around 800MB per file on the first server
        and 5GB per file on the second one. Do you still want them?
        How can I send them to you?

      Good night (from France),
      Geoffrey


          On 9 June 2015 at 12:46, Vijaikumar M <vmallika@xxxxxxxxxx>
            wrote:

             Hi Geoffrey,

              The file content was lost because of the way the vi
                editor truncates the file when writing the updated
                content.

              Regarding the quota size/usage problem, could you
                please run the attached script on each brick and send
                us the output it generates? This will help us analyse
                why quota list is showing the wrong size.

              The script crawls the directory given as its
                argument.

              It collects the quota "contri" and "size" extended
                attributes, and also the "block size" from the stat call.

              Usage:

              ./quota-verify -b <brick_path> | tee brick_name.log
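
To illustrate the stat half of what the script is described as doing, here is a minimal sketch in Python that crawls a directory and sums the space allocated to it. It is not the attached quota-verify script: the function name is hypothetical, and the quota "contri" and "size" extended attributes are deliberately left out because their exact names and encoding are not given in this thread.

```python
import os


def allocated_bytes(root):
    """Walk `root` and sum the disk space allocated to everything under it.

    Uses lstat() so symlinks are not followed, converts st_blocks
    (512-byte units) to bytes, and counts each inode only once so that
    hard-linked files are not counted twice.
    """
    seen = set()
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        entries = [dirpath] + [os.path.join(dirpath, n) for n in filenames]
        # Symlinks to directories show up in dirnames but are never walked into.
        entries += [
            os.path.join(dirpath, n)
            for n in dirnames
            if os.path.islink(os.path.join(dirpath, n))
        ]
        for path in entries:
            st = os.lstat(path)
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue
            seen.add(key)
            total += st.st_blocks * 512
    return total


if __name__ == "__main__":
    print(allocated_bytes(os.curdir), "bytes allocated under the current directory")
```

Comparing such a per-directory total with the value stored by quota is the kind of check that could show where the accounting diverges.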

              Thanks,

              Vijay

              On Tuesday 09 June 2015 03:45
                PM, Vijaikumar M wrote:

                On Tuesday 09 June 2015
                  03:40 PM, Geoffrey Letessier wrote:

                  Hi Vijay,

                  Thanks for replying.

                  Unfortunately, I checked each brick in my
                    storage pool and did not find any backup file.
                    What a pity!

                Please check for the backup file on the client machine
                  where the file was edited, and in the home dir of the
                  user (that is, the user login that was used to edit
                  the file).

                Thanks,

                Vijay

                  Thank you again!
                  Good luck and see you,
                  Geoffrey


                        On 9 June 2015 at 10:05, Vijaikumar M <vmallika@xxxxxxxxxx> wrote:

                            On Tuesday 09
                              June 2015 01:08 PM, Geoffrey Letessier
                              wrote:

                              Hi,

                              Yes of course:

                                [root@lucifer ~]# pdsh -w cl-storage[1,3] du -s /export/brick_home/brick*/amyloid_team
                                cl-storage1: 1608522280 /export/brick_home/brick1/amyloid_team
                                cl-storage3: 1619630616 /export/brick_home/brick1/amyloid_team
                                cl-storage1: 1614057836 /export/brick_home/brick2/amyloid_team
                                cl-storage3: 1602653808 /export/brick_home/brick2/amyloid_team

                                The sum is 6444864540 (around
                                  6.4-6.5TB), while the quota list
                                  displays 7.7TB.
                                So the discrepancy is roughly
                                  1.2-1.3TB, in other words around 16%,
                                  which is far too large, no?
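
The arithmetic above can be rechecked with the four `du -s` figures from the listing. The sketch below assumes that `du` printed 1 KiB blocks (the GNU default when no block-size option or environment variable is set) and shows the total in both decimal and binary units. Which of the two conventions `gluster volume quota list` means by "TB" is not stated in this thread and would need to be checked before comparing the numbers.

```python
KIB = 1024

# Per-brick `du -s` results quoted above, assumed to be in 1 KiB blocks.
du_kib = {
    "cl-storage1:brick1": 1608522280,
    "cl-storage3:brick1": 1619630616,
    "cl-storage1:brick2": 1614057836,
    "cl-storage3:brick2": 1602653808,
}

total_kib = sum(du_kib.values())
total_bytes = total_kib * KIB

print("sum of du:", total_kib, "KiB")
print("decimal TB (10**12 bytes):", round(total_bytes / 10**12, 2))
print("binary TiB (2**40 bytes):", round(total_bytes / 2**40, 2))
```

Under that assumption the sum comes to about 6.6 TB in decimal units or about 6.0 TiB in binary units, so the gap to the 7.7TB shown by quota list is roughly 1.1 to 1.7 depending on the convention.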

                                In addition, since the quota was
                                  exceeded, I have noticed a lot of
                                  files like the following:

                                  [root@lucifer ~]# pdsh -w cl-storage[1,3] "cd /export/brick_home/brick2/amyloid_team/tarus/project/ab1-40-x1_sen304-x2_inh3-x2/remd_charmm22star_scripts/; ls -ail remd_100.sh 2> /dev/null" 2>/dev/null
                                  cl-storage3: 133325688 ---------T 2 tarus amyloid_team 0 16 févr. 10:20 remd_100.sh

                                Note the 'T' at the end of the
                                  permissions and the file size of 0B.

                                And yesterday some files were
                                  duplicated, but not anymore...

                                The worst part is that all these
                                  files were previously OK. In other
                                  words, exceeding the quota caused file
                                  or content deletions or corruption.
                                  What can I do to prevent this
                                  situation in the future? I guess I
                                  cannot do anything to roll back the
                                  situation now, right?

                              Hi Geoffrey,

                            I tried re-creating the
                              problem.

                              Here is the behaviour of the vi editor.

                            When a file is saved in the vi
                              editor, it creates a backup file under the
                              home dir and opens the original file with
                              the 'O_TRUNC' flag; hence the file was
                              truncated.

                              Here is the strace of the vi editor when
                              it gets the 'EDQUOT' error:

                            open("hello", O_WRONLY|O_CREAT|O_TRUNC, 0644) = 3
                            write(3, "line one\nline two\n", 18)    = 18
                            fsync(3)                                = 0
                            close(3)                                = -1 EDQUOT (Disk quota exceeded)
                            chmod("hello", 0100644)                 = 0
                            open("/root/hello~", O_RDONLY)          = 3
                            open("hello", O_WRONLY|O_CREAT|O_TRUNC, 0644) = 7
                            read(3, "line one\n", 256)              = 9
                            write(7, "line one\n", 9)               = 9
                            read(3, "", 256)                        = 0
                            close(7)                                = -1 EDQUOT (Disk quota exceeded)
                            close(3)                                = 0

                            To recover the truncated
                              file, please check whether there is a
                              backup file 'remd_115.sh~' under '~/' or
                              in the same dir as the file. If it exists,
                              you can copy this file.
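
The strace above shows why the content is lost: the original is opened with O_TRUNC before the new data is known to be safely written. For comparison, here is a minimal sketch in Python of a write-then-rename pattern that leaves the original untouched when a write, fsync or close fails (for instance with EDQUOT). This is not what vi does and not a fix proposed in this thread; the function name is hypothetical.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the file at `path` with `data` (bytes) without truncating it first.

    The new content goes to a temporary file in the same directory. Only
    after write, fsync and close have all succeeded is it renamed over the
    original. If any step raises (for example OSError with errno EDQUOT),
    the temporary file is removed and the original is left as it was.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Leaving the `with` block closes the file, so an error reported by
        # close() propagates here, before the rename happens.
        os.replace(tmp_path, path)
    except BaseException:
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Note that the temporary file is created with restrictive permissions, so a real tool would also have to copy the original's mode and ownership, and creating the temporary file can itself fail when the quota is exceeded; in that case the original still keeps its content.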

                            Thanks,

                            Vijay

                                Geoffrey

                                    On 9 June 2015 at 09:01, Vijaikumar M <vmallika@xxxxxxxxxx> wrote:

                                        On
                                          Monday 08 June 2015 07:11 PM,
                                          Geoffrey Letessier wrote:

                                          In addition, I notice a very
                                          big difference between the sum
                                          of du on each brick and the
                                          "quota list" display, as you
                                          can read below:

                                            [root@lucifer ~]# pdsh -w cl-storage[1,3] du -sh /export/brick_home/brick*/amyloid_team
                                            cl-storage1: 1,6T /export/brick_home/brick1/amyloid_team
                                            cl-storage3: 1,6T /export/brick_home/brick1/amyloid_team
                                            cl-storage1: 1,6T /export/brick_home/brick2/amyloid_team
                                            cl-storage3: 1,6T /export/brick_home/brick2/amyloid_team
                                            [root@lucifer ~]# gluster volume quota vol_home list /amyloid_team
                                                              Path                   Hard-limit Soft-limit   Used  Available
                                            --------------------------------------------------------------------------------
                                            /amyloid_team                                 9.0TB       90%   7.8TB      1.2TB

                                            As you can see, the sum
                                              over all bricks gives me
                                              roughly 6.4TB while
                                              "quota list" shows around
                                              7.8TB, so there is a
                                              difference of 1.4TB that I
                                              am not able to explain. Do
                                              you have any idea?

                                        There were a few issues in
                                          how quota accounts for size;
                                          we have fixed some of these
                                          issues in 3.7.

                                        'df -h' will round off the
                                          values. Can you please provide
                                          the output of 'df' without the
                                          -h option?

                                            Thanks,
                                            Geoffrey


                                                On 8 June 2015 at 14:30, Geoffrey Letessier <geoffrey.letessier@xxxxxxx> wrote:

                                                  Hello,

                                                    Concerning version 3.5.3 of
                                                      GlusterFS, I ran into a
                                                      strange issue this morning
                                                      when writing a file while
                                                      the quota is exceeded.

                                                    One person in my lab, whose
                                                      quota was exceeded (but she
                                                      didn't know it), tried to
                                                      modify a file but, because
                                                      of the exceeded quota, she
                                                      was unable to and decided
                                                      to exit vi. Now her file is
                                                      empty/blank, as you can
                                                      read below:

                                        We suspect 'vi'
                                          might have created a tmp file
                                          before writing to the file. We
                                          are working on re-creating
                                          this problem and will update
                                          you on the same.

                                                      pdsh@lucifer: cl-storage3: ssh exited with exit code 2
                                                      cl-storage1: ---------T 2 tarus amyloid_team 0 19 févr. 12:34 /export/brick_home/brick1/amyloid_team/tarus/project/ab1-40-x1_sen304-x2_inh3-x2/remd_charmm22star_scripts/remd_115.sh
                                                      cl-storage1: -rwxrw-r-- 2 tarus amyloid_team 0  8 juin  12:38 /export/brick_home/brick2/amyloid_team/tarus/project/ab1-40-x1_sen304-x2_inh3-x2/remd_charmm22star_scripts/remd_115.sh

                                                      In addition, I don't understand
                                                        why, my volume being a
                                                        distributed volume inside
                                                        replica (cl-storage[1,3] is
                                                        replicated only on
                                                        cl-storage[2,4]), I have 2
                                                        "same" files (same complete
                                                        path) on 2 different bricks
                                                        (as you can read above).

                                                      Thanks in advance for your
                                                        help and clarification.
                                                      Geoffrey


                                                          On 2 June 2015 at 23:45, Geoffrey Letessier <geoffrey.letessier@xxxxxxx> wrote:

                                                          Hi Ben,

                                                          I just checked my messages log
                                                          files, on both client and
                                                          server, and I don't find any
                                                          hung tasks like the ones you
                                                          noticed on yours.

                                                          As you can read below, I don't
                                                          see the performance issue with
                                                          a simple DD, but I think my
                                                          issue concerns sets of small
                                                          files (tens of thousands or
                                                          even more)…

                                                          [root@nisus test]# ddt -t 10g /mnt/test/
                                                          Writing to /mnt/test/ddt.8362 ... syncing ... done.
                                                          sleeping 10 seconds ... done.
                                                          Reading from /mnt/test/ddt.8362 ... done.
                                                          10240MiB    KiB/s  CPU%
                                                          Write      114770     4
                                                          Read        40675     4

                                                          For info: /mnt/test is the
                                                          single (v2) GlusterFS volume.

                                                          [root@nisus test]# ddt -t 10g /mnt/fhgfs/
                                                          Writing to /mnt/fhgfs/ddt.8380 ... syncing ... done.
                                                          sleeping 10 seconds ... done.
                                                          Reading from /mnt/fhgfs/ddt.8380 ... done.
                                                          10240MiB    KiB/s  CPU%
                                                          Write      102591     1
                                                          Read        98079     2

                                                          Do you have any idea how to
                                                          tune/optimize the performance
                                                          settings and/or the TCP
                                                          settings (MTU, etc.)?

                                                          ---------------------------------------------------------------
                                                          |             |  UNTAR  |   DU   |  FIND   |   TAR   |   RM   |
                                                          ---------------------------------------------------------------
                                                          | single      |  ~3m45s |   ~43s |    ~47s |  ~3m10s | ~3m15s |
                                                          ---------------------------------------------------------------
                                                          | replicated  |  ~5m10s |   ~59s |   ~1m6s |  ~1m19s | ~1m49s |
                                                          ---------------------------------------------------------------
                                                          | distributed |  ~4m18s |   ~41s |    ~57s |  ~2m24s | ~1m38s |
                                                          ---------------------------------------------------------------
                                                          | dist-repl   |  ~8m18s |  ~1m4s |  ~1m11s |  ~1m24s | ~2m40s |
                                                          ---------------------------------------------------------------
                                                          | native FS   |    ~11s |    ~4s |     ~2s |    ~56s |   ~10s |
                                                          ---------------------------------------------------------------
                                                          | BeeGFS      |  ~3m43s |   ~15s |     ~3s |  ~1m33s |   ~46s |
                                                          ---------------------------------------------------------------
                                                          | single (v2) |   ~3m6s |   ~14s |    ~32s |   ~1m2s |   ~44s |
                                                          ---------------------------------------------------------------

                                                          For info:
                                                           - BeeGFS is a distributed FS (4 bricks, 2 bricks per server and 2 servers)
                                                           - single (v2): simple gluster volume with default settings

                                                          I also note that I get the same tar/untar performance issue
                                                          with FhGFS/BeeGFS, but the rest (DU, FIND, RM) looks OK.
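
To compare rows of the table numerically, the timings can be converted to seconds. The following is only a minimal sketch: the helper names to_seconds and slowdown are made up for illustration and assume timings written exactly as in the table ("~8m18s", "~11s").

```shell
# Convert a timing such as "~8m18s" or "~11s" (the table's format) to seconds.
# Hypothetical helper, not part of any tool mentioned in this thread.
to_seconds() {
    t=${1#"~"}                              # drop the leading "~"
    case $t in
        *m*) min=${t%%m*}; rest=${t#*m} ;;  # "8m18s" -> min=8, rest=18s
        *)   min=0; rest=$t ;;              # "11s"   -> min=0, rest=11s
    esac
    sec=${rest%s}                           # strip the trailing "s"
    echo $(( min * 60 + ${sec:-0} ))
}

# Print how many times slower the first timing is than the second.
slowdown() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", a / b }'
}
```

For example, slowdown '~8m18s' '~11s' compares the dist-repl UNTAR time with the native FS one.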

                                                          Thank you very much for your reply and help.
                                                          Geoffrey

                                                          -----------------------------------------------
                                                          Geoffrey Letessier
                                                          Responsable informatique & ingénieur système
                                                          CNRS - UPR 9080 - Laboratoire de Biochimie Théorique
                                                          Institut de Biologie Physico-Chimique
                                                          13, rue Pierre et Marie Curie - 75005 Paris
                                                          Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx

                                                          Le

                                                          2 juin 2015 à
                                                          21:53, Ben
                                                          Turner <bturner@xxxxxxxxxx>

                                                          a écrit :

                                                          I am seeing problems on 3.7 as well. Can you check /var/log/messages
                                                          on both the clients and servers for hung tasks like:

                                                          Jun  2 15:23:14 gqac006 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
                                                          Jun  2 15:23:14 gqac006 kernel: iozone        D 0000000000000001     0 21999      1 0x00000080
                                                          Jun  2 15:23:14 gqac006 kernel: ffff880611321cc8 0000000000000082 ffff880611321c18 ffffffffa027236e
                                                          Jun  2 15:23:14 gqac006 kernel: ffff880611321c48 ffffffffa0272c10 ffff88052bd1e040 ffff880611321c78
                                                          Jun  2 15:23:14 gqac006 kernel: ffff88052bd1e0f0 ffff88062080c7a0 ffff880625addaf8 ffff880611321fd8
                                                          Jun  2 15:23:14 gqac006 kernel: Call Trace:
                                                          Jun  2 15:23:14 gqac006 kernel: [<ffffffffa027236e>] ? rpc_make_runnable+0x7e/0x80 [sunrpc]
                                                          Jun  2 15:23:14 gqac006 kernel: [<ffffffffa0272c10>] ? rpc_execute+0x50/0xa0 [sunrpc]
                                                          Jun  2 15:23:14 gqac006 kernel: [<ffffffff810aaa21>] ? ktime_get_ts+0xb1/0xf0
                                                          Jun  2 15:23:14 gqac006 kernel: [<ffffffff811242d0>] ? sync_page+0x0/0x50
                                                          Jun  2 15:23:14 gqac006 kernel: [<ffffffff8152a1b3>] io_schedule+0x73/0xc0
                                                          Jun  2 15:23:14 gqac006 kernel: [<ffffffff8112430d>] sync_page+0x3d/0x50
                                                          Jun  2 15:23:14 gqac006 kernel: [<ffffffff8152ac7f>] __wait_on_bit+0x5f/0x90
                                                          Jun  2 15:23:14 gqac006 kernel: [<ffffffff81124543>] wait_on_page_bit+0x73/0x80
                                                          Jun  2 15:23:14 gqac006 kernel: [<ffffffff8109eb80>] ? wake_bit_function+0x0/0x50
                                                          Jun  2 15:23:14 gqac006 kernel: [<ffffffff8113a525>] ? pagevec_lookup_tag+0x25/0x40
                                                          Jun  2 15:23:14 gqac006 kernel: [<ffffffff8112496b>] wait_on_page_writeback_range+0xfb/0x190
                                                          Jun  2 15:23:14 gqac006 kernel: [<ffffffff81124b38>] filemap_write_and_wait_range+0x78/0x90
                                                          Jun  2 15:23:14 gqac006 kernel: [<ffffffff811c07ce>] vfs_fsync_range+0x7e/0x100
                                                          Jun  2 15:23:14 gqac006 kernel: [<ffffffff811c08bd>] vfs_fsync+0x1d/0x20
                                                          Jun  2 15:23:14 gqac006 kernel: [<ffffffff811c08fe>] do_fsync+0x3e/0x60
                                                          Jun  2 15:23:14 gqac006 kernel: [<ffffffff811c0950>] sys_fsync+0x10/0x20
                                                          Jun  2 15:23:14 gqac006 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
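
One possible way to look for such hung tasks is sketched below. It assumes the kernel's usual hung-task line, "INFO: task <name>:<pid> blocked for more than <N> seconds", which normally precedes a trace like the one above but is not shown in this excerpt. The function name hung_tasks is made up for illustration.

```shell
# List tasks the kernel reported as blocked, with a count per task.
# Usage: hung_tasks [logfile]   (defaults to /var/log/messages)
# Assumption: the log contains the standard "INFO: task ... blocked for
# more than N seconds" lines emitted by the kernel hung-task detector.
hung_tasks() {
    log=${1:-/var/log/messages}
    grep -E 'INFO: task .* blocked for more than [0-9]+ seconds' "$log" |
        sed -E 's/.*INFO: task ([^ ]+) blocked.*/\1/' |   # keep "name:pid"
        sort | uniq -c | sort -rn                          # most frequent first
}
```

Running it on both the clients and the servers gives a quick list of which processes were stuck and how often.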

                                                          Do you see a perf problem with just a simple dd, or do you need a
                                                          more complex workload to hit the issue? I think I saw an issue with
                                                          metadata performance that I am trying to run down. Let me know if
                                                          you can see the problem with simple dd reads/writes, or if we need
                                                          to do some sort of dir/metadata access as well.

                                                          -b
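
A simple dd read/write check of the kind asked about here could look like the sketch below. It assumes GNU coreutils dd (for bs=1M and conv=fsync). The function name dd_check and the scratch file name are hypothetical, and the mount point to test is passed as an argument.

```shell
# Sequential write, then read, of one scratch file using dd.
# Usage: dd_check <directory> [size_in_MiB]   (default size: 1024 MiB)
dd_check() {
    dir=$1
    mib=${2:-1024}
    f="$dir/dd_check.$$"                  # hypothetical scratch file name
    # conv=fsync makes dd flush the data before it reports a rate
    dd if=/dev/zero of="$f" bs=1M count="$mib" conv=fsync 2>&1 | tail -n 1
    # Drop the page cache if permitted (needs root), so that the read
    # below is not simply served from RAM
    if [ -w /proc/sys/vm/drop_caches ]; then
        sync
        echo 3 > /proc/sys/vm/drop_caches
    fi
    dd if="$f" of=/dev/null bs=1M 2>&1 | tail -n 1
    rm -f "$f"
}
```

The two lines printed are dd's own summaries for the write and the read. Comparing them between the gluster mount and the brick's native FS shows whether plain streaming I/O is affected at all.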

                                                          ----- Original Message -----
                                                          From: "Geoffrey Letessier" <geoffrey.letessier@xxxxxxx>
                                                          To: "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>
                                                          Cc: gluster-users@xxxxxxxxxxx
                                                          Sent: Tuesday, June 2, 2015 8:09:04 AM
                                                          Subject: Re: GlusterFS 3.7 - slow/poor performances

                                                          Hi Pranith,

                                                          I’m sorry, but I cannot give you any comparison, because it would be
                                                          distorted by the fact that in my production HPC cluster the network
                                                          technology is InfiniBand QDR and my volumes are quite different
                                                          (bricks in RAID6 (12x2TB), 2 bricks per server and 4 servers in my pool).

                                                          Concerning your request, you can find all the expected results in the
                                                          attachments, hoping they can help you solve this serious performance
                                                          issue (maybe I need to play with the glusterfs parameters?).

                                                          Thank you very much in advance,
                                                          Geoffrey

                                                          ------------------------------------------------------
                                                          Geoffrey Letessier
                                                          Responsable informatique & ingénieur système
                                                          UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
                                                          Institut de Biologie Physico-Chimique
                                                          13, rue Pierre et Marie Curie - 75005 Paris
                                                          Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx

                                                          Le 2 juin 2015
                                                          à 10:09,
                                                          Pranith Kumar
                                                          Karampuri <
                                                          pkarampu@xxxxxxxxxx >
                                                          a

                                                          écrit :

                                                          hi Geoffrey,

                                                          Since you are saying it happens on all types of volumes, let's do
                                                          the following:

                                                          1) Create a dist-repl volume
                                                          2) Set the options etc. you need.
                                                          3) Enable gluster volume profile using "gluster volume profile <volname> start"
                                                          4) Run the workload
                                                          5) Give the output of "gluster volume profile <volname> info"

                                                          Repeat the steps above on both the new version and the old version
                                                          you are comparing it with. That should give us insight into what
                                                          could be causing the slowness.

                                                          Pranith
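
Steps 3 to 5 above can be wrapped in a small helper, sketched here under a few assumptions: the volume already exists with its options set (steps 1 and 2), the function name profile_run and the output file name profile-<volname>.txt are made up for illustration, and the GLUSTER variable exists only so the sequence can be dry-run without a cluster.

```shell
# Run a workload between "profile start" and "profile info" for one volume.
# Usage: profile_run <volname> <command> [args...]
# Set GLUSTER=echo to print the gluster commands instead of running them.
profile_run() {
    vol=$1; shift
    ${GLUSTER:-gluster} volume profile "$vol" start || return 1   # step 3
    "$@"                                                          # step 4: the workload
    # step 5: save the counters to a file (hypothetical file name)
    ${GLUSTER:-gluster} volume profile "$vol" info > "profile-$vol.txt"
    ${GLUSTER:-gluster} volume profile "$vol" stop
}
```

Running the same call on the old and the new version yields two files that can be compared side by side.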

                                                          On 06/02/2015 03:22 AM, Geoffrey Letessier wrote:

                                                          Dear all,

                                                          I have a crash-test cluster where I’ve tested the new version of
                                                          GlusterFS (v3.7) before upgrading my production HPC cluster.
                                                          But… all my tests show very, very low performance.

                                                          For my benchmarks, as you can read below, I perform some actions
                                                          (untar, du, find, tar, rm) on the Linux kernel sources, dropping
                                                          caches, on each of distributed, replicated, distributed-replicated
                                                          and single (single-brick) volumes, and on the native FS of one brick.

                                                          # time (echo 3 > /proc/sys/vm/drop_caches; tar xJf ~/linux-4.1-rc5.tar.xz; sync; echo 3 > /proc/sys/vm/drop_caches)
                                                          # time (echo 3 > /proc/sys/vm/drop_caches; du -sh linux-4.1-rc5/; echo 3 > /proc/sys/vm/drop_caches)
                                                          # time (echo 3 > /proc/sys/vm/drop_caches; find linux-4.1-rc5/|wc -l; echo 3 > /proc/sys/vm/drop_caches)
                                                          # time (echo 3 > /proc/sys/vm/drop_caches; tar czf linux-4.1-rc5.tgz linux-4.1-rc5/; echo 3 > /proc/sys/vm/drop_caches)
                                                          # time (echo 3 > /proc/sys/vm/drop_caches; rm -rf linux-4.1-rc5.tgz linux-4.1-rc5/; echo 3 > /proc/sys/vm/drop_caches)
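
The five commands above share one pattern (drop caches, run, drop caches), which could be factored out roughly as follows. This is only a sketch: the names drop_caches and cold_run are hypothetical, the elapsed time has whole-second resolution, and unlike the original commands it calls sync after every step, not only after the untar.

```shell
# Drop the page cache if we are allowed to (needs root); otherwise do nothing.
drop_caches() {
    if [ -w /proc/sys/vm/drop_caches ]; then
        sync
        echo 3 > /proc/sys/vm/drop_caches
    fi
}

# Run one benchmark step with cold caches and print "<label>: <elapsed>s".
# Usage: cold_run <label> <command> [args...]
cold_run() {
    label=$1; shift
    drop_caches
    start=$(date +%s)
    "$@"                      # the step being measured
    sync                      # assumption: flush after every step
    drop_caches
    end=$(date +%s)
    echo "$label: $(( end - start ))s"
}
```

For instance, cold_run UNTAR tar xJf ~/linux-4.1-rc5.tar.xz would time the first step. Steps that use a pipe, such as the find one, need to be wrapped in sh -c.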

                                                          And here are the process times:

                                                          ---------------------------------------------------------------
                                                          |             | UNTAR  | DU     | FIND   | TAR    | RM     |
                                                          ---------------------------------------------------------------
                                                          | single      | ~3m45s |   ~43s |   ~47s | ~3m10s | ~3m15s |
                                                          ---------------------------------------------------------------
                                                          | replicated  | ~5m10s |   ~59s |  ~1m6s | ~1m19s | ~1m49s |
                                                          ---------------------------------------------------------------
                                                          | distributed | ~4m18s |   ~41s |   ~57s | ~2m24s | ~1m38s |
                                                          ---------------------------------------------------------------
                                                          | dist-repl   | ~8m18s |  ~1m4s | ~1m11s | ~1m24s | ~2m40s |
                                                          ---------------------------------------------------------------

                                                          | native FS |
                                                          ~11s | ~4s |
                                                          ~2s | ~56s |
                                                          ~10s |

---------------------------------------------------------------
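
The gap to the native FS is easier to judge as a ratio. A small shell sketch using the UNTAR figures reported above; `to_seconds` and `slowdown` are hypothetical helper names, the input is assumed to have no leading zeros, and ratios are truncated to one decimal:

```shell
# Convert a duration such as "3m45s", "43s" or "~1m6s" into whole seconds.
to_seconds() {
    local t min sec
    t=${1#"~"}                 # drop the leading "~" if present
    min=0
    case $t in
        *m*) min=${t%%m*}; t=${t#*m} ;;
    esac
    sec=${t%s}
    echo $(( $min * 60 + ${sec:-0} ))
}

# Print how many times slower $1 is than $2, e.g. "20.4x".
slowdown() {
    local a b tenths
    a=$(to_seconds "$1")
    b=$(to_seconds "$2")
    if [ "$b" -le 0 ]; then
        echo "n/a"
        return 1
    fi
    tenths=$(( $a * 10 / $b ))
    printf '%d.%dx\n' $(( tenths / 10 )) $(( tenths % 10 ))
}

# UNTAR times per volume type, compared with the native FS.
native=11s
for entry in single:3m45s replicated:5m10s distributed:4m18s dist-repl:8m18s; do
    printf '%-12s %s\n' "${entry%%:*}" "$(slowdown "${entry#*:}" "$native")"
done
```

The same loop can be pointed at any other column by swapping in its times and the matching native FS value.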

                                                          I get the same results whether I use the default configuration or custom configurations.

                                                          Looking at the output of the ifstat command, I can see that my write I/O never exceeds 3 MB/s...
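
For scale, that rate can be set against the nominal capacity of the Gb Ethernet link described below. A rough shell sketch, assuming the figure means 3 megabytes per second and ignoring protocol overhead; `link_utilisation` is a hypothetical helper name:

```shell
# Print the share of a link used by an observed transfer rate.
# Usage: link_utilisation OBSERVED_MB_PER_S LINK_MBIT_PER_S
link_utilisation() {
    local observed_mb link_mbit capacity_mb tenths
    observed_mb=$1
    link_mbit=$2
    capacity_mb=$(( $link_mbit / 8 ))            # 1000 Mbit/s -> 125 MB/s
    tenths=$(( $observed_mb * 1000 / $capacity_mb ))  # tenths of a percent
    printf '%d.%d%%\n' $(( tenths / 10 )) $(( tenths % 10 ))
}

link_utilisation 3 1000
```

Under those assumptions the observed write rate is a small fraction of what the link can nominally carry.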

                                                          A native ext4 FS seems to be faster than a native XFS one (roughly 15-20%, but no more).

                                                          My [test] storage cluster is composed of 2 identical servers (dual-CPU Intel Xeon X5355, 8 GB of RAM, 2x2TB HDD (no RAID) and Gb Ethernet).

                                                          My volume settings:

                                                          single: 1 server, 1 brick
                                                          replicated: 2 servers, 1 brick each
                                                          distributed: 2 servers, 2 bricks each
                                                          dist-repl: 2 bricks in the same server and replica 2
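
For reference, layouts like these are normally built with `gluster volume create`, where `replica N` groups each run of N consecutive bricks into one replica set. The sketch below only prints candidate command lines so they can be reviewed before anything is run. The host names, volume names and brick paths are hypothetical, `volume_create_cmd` is a helper invented here, and the dist-repl line reads the description above as two bricks per server replicated across the two servers, which is an assumption:

```shell
# Hypothetical host names; replace with the real servers.
S1=server1
S2=server2

# Print (rather than run) a "gluster volume create" command line.
# Usage: volume_create_cmd NAME REPLICA_COUNT BRICK...
# A REPLICA_COUNT of 1 means no replication.
volume_create_cmd() {
    local name replica cmd
    name=$1
    replica=$2
    shift 2
    cmd="gluster volume create $name"
    if [ "$replica" -gt 1 ]; then
        cmd="$cmd replica $replica"
    fi
    echo "$cmd $*"
}

# single: 1 server, 1 brick
volume_create_cmd vol_single 1 "$S1:/export/brick1/single"

# replicated: 2 servers, 1 brick each
volume_create_cmd vol_repl 2 "$S1:/export/brick1/repl" "$S2:/export/brick1/repl"

# distributed: 2 servers, 2 bricks each
volume_create_cmd vol_dist 1 \
    "$S1:/export/brick1/dist" "$S1:/export/brick2/dist" \
    "$S2:/export/brick1/dist" "$S2:/export/brick2/dist"

# dist-repl (assumed reading): each consecutive pair spans both servers
volume_create_cmd vol_dist_repl 2 \
    "$S1:/export/brick1/dr" "$S2:/export/brick1/dr" \
    "$S1:/export/brick2/dr" "$S2:/export/brick2/dr"
```

Brick order matters for replicated volumes: listing both bricks of one server next to each other would place a replica pair on a single machine.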

                                                          Everything seems to be OK in the output of the gluster status command.

                                                          Do you have any idea why I get such bad results?

                                                          Thanks in advance.

                                                          Geoffrey

_______________________________________________

                                                          Gluster-users mailing list

                                                          Gluster-users@xxxxxxxxxxx

                                                          http://www.gluster.org/mailman/listinfo/gluster-users

            <quota-verify.gz>
