Re: Fuse memleaks, all versions

Yannick Perret <yannick.perret@xxxxxxxxxxxxx> · Tue, 2 Aug 2016 17:00:05 +0200



    So here are the dumps, gzip'ed.

      
      What I did:

      1. mounting the volume, removing all its content, unmounting it

      2. mounting the volume

      3. performing a cp -Rp /usr/* /root/MNT

      4. performing a rm -rf /root/MNT/*

      5. taking a dump (glusterdump.p1.dump)

      6. re-doing 3, 4 and 5 (glusterdump.p2.dump)

      
      VSZ/RSS are respectively:

      - 381896 / 35688 just after mount

      - 644040 / 309240 after 1st cp -Rp

      - 644040 / 310128 after 1st rm -rf

      - 709576 / 310128 after 1st kill -USR1

      - 840648 / 421964 after 2nd cp -Rp

      - 840648 / 422224 after 2nd rm -rf

      
      I created a small script that performs these actions in an
      infinite loop:

      while /bin/true
      do
        cp -Rp /usr/* /root/MNT/
        # + get VSZ/RSS of glusterfs process
        rm -rf /root/MNT/*
        # + get VSZ/RSS of glusterfs process
      done
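
The loop above could be made runnable along these lines. This is a minimal sketch, not the script actually used: the helper names mem_of and run_cycle are hypothetical, and reading VmSize/VmRSS from /proc/<pid>/status (Linux only) is an assumption standing in for however VSZ/RSS were really collected.

```shell
#!/bin/sh
# Print "VSZ RSS" (in kB, as the kernel reports them) for PID $1.
# Assumption: Linux, values taken from /proc/<pid>/status.
mem_of() {
    awk '/^VmSize:/ { vsz = $2 } /^VmRSS:/ { rss = $2 } END { print vsz, rss }' \
        "/proc/$1/status"
}

# One cycle: copy the content of SRC into DST, sample memory,
# empty DST again, sample memory once more.
run_cycle() {
    pid=$1 src=$2 dst=$3
    cp -Rp "$src"/* "$dst"/
    mem_of "$pid"
    rm -rf "${dst:?}"/*      # ${dst:?} aborts if dst is empty or unset
    mem_of "$pid"
}

# Hypothetical usage (not run here):
#   pid=$(pgrep -x glusterfs | head -n 1)
#   while true; do run_cycle "$pid" /usr /root/MNT; done
```

Each cycle prints two lines, one after the copy and one after the removal, which matches the layout of the figures listed in this message.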

      
      Here are the values so far (VSZ RSS):

      971720 533988
      1037256 645500
      1037256 645840
      1168328 757348
      1168328 757620
      1299400 869128
      1299400 869328
      1364936 980712
      1364936 980944
      1496008 1092384
      1496008 1092404
      1627080 1203796
      1627080 1203996
      1692616 1315572
      1692616 1315504
      1823688 1426812
      1823688 1427340
      1954760 1538716
      1954760 1538772
      2085832 1647676
      2085832 1647708
      2151368 1750392
      2151368 1750708
      2282440 1853864
      2282440 1853764
      2413512 1952668
      2413512 1952704
      2479048 2056500
      2479048 2056712

      
      So at this point the glusterfs process uses close to 2 GB of
      resident memory, while only repeating exactly the same actions:
      'cp -Rp /usr/* /root/MNT' + 'rm -rf /root/MNT/*'.
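
To put a number on the growth per cycle, the sampled values can be compared with the sample taken at the same point of the previous cycle. A minimal sketch, assuming the "VSZ RSS" lines are fed in sampling order (alternating after-cp and after-rm); the function name rss_growth is hypothetical.

```shell
#!/bin/sh
# Reads "VSZ RSS" lines on stdin and prints, for every sample from the
# third one on, the RSS difference from the sample two lines earlier,
# i.e. the same point (after cp, or after rm) of the previous cycle.
rss_growth() {
    awk '{ rss[NR] = $2; if (NR > 2) print $2 - rss[NR - 2] }'
}
```

Fed the four consecutive samples "1037256 645500", "1037256 645840", "1168328 757348" and "1168328 757620" from the list, it prints 111848 and 111780, so the resident size grows by roughly the same amount on each cp/rm cycle.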

      
      Swap usage is starting to increase a little, and I have not seen
      memory drop at any point so far.

      I can understand that the kernel may not release the removed files
      (after rm -rf) immediately, but the first 'rm' occurred at ~12:00
      today and it is now ~17:00 here, so I can't understand why so much
      memory is used.

      I would expect the memory to grow during 'cp -Rp', then shrink
      after 'rm', but it stays the same. Even if it stays the same, I
      would expect it not to grow further when cp-ing again.

      
      I'll leave the cp/rm loop running to see what happens. Feel free to
      ask for other data if it may help.

      
      Please note that I'll be on holiday for 3 weeks from the end of
      this week, so I will mostly be unable to run tests during that
      time (the network connection is too poor where I'm going).

      
      Regards,

      --

      Y.

      
      On 02/08/2016 at 05:11, Pranith Kumar Karampuri wrote:

    
          On Mon, Aug 1, 2016 at 3:40 PM,
            Yannick Perret <yannick.perret@xxxxxxxxxxxxx>
            wrote:

            
                  On 29/07/2016 at 18:39, Pranith Kumar Karampuri
                    wrote:

                  
                      On Fri,
                          Jul 29, 2016 at 2:26 PM, Yannick Perret <yannick.perret@xxxxxxxxxxxxx>
                          wrote:

                        
                        Ok, last try:

                            after investigating more versions I found
                            that the FUSE client leaks memory on all of
                            them.

                            I tested:

                            - 3.6.7 client on Debian 7 32bit and on
                            Debian 8 64bit (with 3.6.7 servers on
                            Debian 8 64bit)

                            - 3.6.9 client on Debian 7 32bit and on
                            Debian 8 64bit (with 3.6.7 servers on
                            Debian 8 64bit)

                            - 3.7.13 client on
                            Debian 8 64bit (with 3.8.1 servers on
                            Debian 8 64bit)

                            - 3.8.1 client on Debian 8 64bit (with 3.8.1
                            servers on Debian 8 64bit)

                            In all cases the client was compiled from
                            source, apart from 3.8.1 where the .deb
                            packages were used (due to a configure
                            runtime error).

                            3.7 was compiled with
                            --disable-tiering. I also tried compiling
                            with --disable-fusermount (no change).

                            
                            In all of these cases the memory (resident
                            & virtual) of the glusterfs process on the
                            client grows with each activity and never
                            reaches a maximum (and never shrinks).

                            "Activity" for these tests is cp -Rp and ls
                            -lR.

                            The client I let grow the longest exceeded
                            ~4 GB of RAM. On smaller machines it ends
                            with the OOM killer killing the glusterfs
                            process, or with glusterfs dying due to an
                            allocation error.

                            
                            In 3.6 memory seems to grow continuously,
                            whereas in 3.8.1 it grows in "steps" (430400 kB →
                            629144 (~1 min) → 762324 (~1 min) → 827860…).

                            
                            All tests were performed on a single test
                            volume used only by my test client. The
                            volume is a basic x2 replica. The only
                            parameters I changed on this volume (without
                            any effect) are diagnostics.client-log-level,
                            set to ERROR, and network.inode-lru-limit,
                            set to 1024.

                          
                          Could you attach statedumps of your runs?

                            The following link has the steps to capture
                            them (https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/).
                            We basically need to see which memory types
                            are increasing. If you could help find the
                            issue, we can send fixes for your workload.
                            There is a 3.8.2 release in around 10 days,
                            I think. We can probably target this issue
                            for that?
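
Finding the memory types that increase between two statedumps can be sketched as follows. This is a minimal sketch under assumptions: it relies on the statedump layout described in the linked documentation, where each "[... memusage]" section header is followed by key=value lines including num_allocs, and the function name growing_types is hypothetical.

```shell
#!/bin/sh
# Compare two statedump files and print every memusage section whose
# num_allocs is larger in the second dump than in the first, as
# "section<TAB>before<TAB>after".
# Assumed layout: "[... memusage]" headers followed by "num_allocs=N".
growing_types() {
    awk -v first="$1" '
        /^\[/ {
            # A new section starts; remember it only if it is a memusage one.
            section = ""
            if ($0 ~ / memusage\]$/) section = substr($0, 2, length($0) - 2)
            next
        }
        section != "" && /^num_allocs=/ {
            n = substr($0, 12) + 0          # text after "num_allocs="
            if (FILENAME == first) before[section] = n
            else if (n > before[section])
                printf "%s\t%d\t%d\n", section, before[section], n
        }
    ' "$1" "$2"
}

# Hypothetical usage, after gunzip-ing the attached dumps:
#   growing_types glusterdump.p1.dump glusterdump.p2.dump
```

Sections that only exist in the second dump are reported with a "before" count of 0. The two files must have different names, since the script tells them apart by file name.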

                          
                 Here are statedumps.

                  Steps:

                  1. mount -t glusterfs ldap1.my.domain:SHARE /root/MNT/
                  (here VSZ and RSS are 381896 35828)

                  2. take a dump with kill -USR1
                  <pid-of-glusterfs-process> (file
                  glusterdump.n1.dump.1470042769)

                  3. perform a 'ls -lR /root/MNT | wc -l' (btw result of
                  wc -l is 518396 :)) and a 'cp -Rp /usr/*
                  /root/MNT/boo' (VSZ/RSS are 1301536/711992 at end of
                  these operations)

                  4. take a dump with kill -USR1
                  <pid-of-glusterfs-process> (file
                  glusterdump.n2.dump.1470043929)

                  5. do 'cp -Rp * /root/MNT/toto/', i.e. into another
                  directory (VSZ/RSS are 1432608/909968 at end of this
                  operation)

                  6. take a dump with kill -USR1
                  <pid-of-glusterfs-process> (file
                  glusterdump.n3.dump.)

                
            Hey,

            
                  Thanks a lot for providing this information.
              Looking at these steps, I don't see anything wrong with
              the increase in memory. Both the ls -lR and cp -Rp
              commands you ran in step 3 add new inodes in memory, which
              increases memory usage. As long as the kernel thinks these
              inodes need to be in memory, gluster keeps them in memory.
              Once the kernel decides an inode is no longer necessary,
              it sends 'inode-forgets', and at that point the memory
              starts to go down. So it depends on how much memory
              pressure the kernel is under. But you said it led to
              OOM kills on smaller machines, which means there could be
              some leaks. Could you modify the steps as follows to
              confirm whether there are leaks? Please run this test on
              the smaller machines that hit the OOM killer.
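
One way to separate "inodes the kernel has not forgotten yet" from a real leak is to ask the kernel to drop its cached dentries and inodes before sampling memory again. A minimal sketch, assuming a Linux client and root privileges: writing 2 to /proc/sys/vm/drop_caches is the documented kernel interface for freeing reclaimable dentries and inodes, while the expectation that this makes the FUSE client receive forgets and release memory is an assumption, not something confirmed in this thread. The function name and its optional path argument are hypothetical.

```shell
#!/bin/sh
# Ask the kernel to free reclaimable dentries and inodes.
# $1 (optional) overrides the control file, which makes a dry run possible.
request_inode_reclaim() {
    target=${1:-/proc/sys/vm/drop_caches}
    sync                      # flush dirty data first so more objects are reclaimable
    echo 2 > "$target"        # 2 = dentries and inodes (1 = page cache, 3 = both)
}

# Hypothetical usage, as root, after the rm -rf step:
#   request_inode_reclaim
#   kill -USR1 <pid-of-glusterfs-process>
```

If the resident size of the glusterfs process stays high even after such a reclaim, that would point more towards a leak than towards inodes still held by the kernel.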

            
              Steps:

                1. mount -t glusterfs ldap1.my.domain:SHARE /root/MNT/
                (here VSZ and RSS are 381896 35828)

                2. perform a 'ls -lR /root/MNT | wc -l' (btw result of
                wc -l is 518396 :)) and a 'cp -Rp /usr/* /root/MNT/boo'
                (VSZ/RSS are 1301536/711992 at end of these operations)

                3. do 'cp -Rp * /root/MNT/toto/', i.e. into another
                directory (VSZ/RSS are 1432608/909968 at end of this
                operation)

              
             4. Delete all the files and directories
                you created in steps 2, 3 above

              
            5. Take statedump with kill -USR1
                <pid-of-glusterfs-process>

              
            6. Repeat steps 2-5

                
            Attach these two statedumps. I think the
                statedumps will be even more effective if the mount does
                not have any data when you start the experiment.

              
            HTH

               
                 Dump files are gzip'ed because they are very
                large.

                Dump files are here (too big for email):

                http://wikisend.com/download/623430/glusterdump.n1.dump.1470042769.gz

                http://wikisend.com/download/771220/glusterdump.n2.dump.1470043929.gz

                http://wikisend.com/download/428752/glusterdump.n3.dump.1470045181.gz

                (I am keeping the files in case someone wants them in
                another format)

                  
                  Client and servers are installed from .deb files
                  (glusterfs-client_3.8.1-1_amd64.deb and
                  glusterfs-common_3.8.1-1_amd64.deb on client side).

                  They are all Debian 8 64bit. Servers are test machines
                  that serve only one volume to this sole client. Volume
                  is a simple x2 replica. I just changed for test
                  network.inode-lru-limit value to 1024. Mount point
                  /root/MNT is only used for these tests.

                  
                  --

                  Y.

                  
          -- 

          
            Pranith

            
Attachment:
glusterdump.p1.dump.gz

Description: application/gzip
Attachment:
glusterdump.p2.dump.gz

Description: application/gzip
Attachment:
smime.p7s

Description: S/MIME cryptographic signature