Re: Memory leak in GlusterFS FUSE client

Oleksandr Natalenko <oleksandr@xxxxxxxxxxxxxx> · Sat, 26 Dec 2015 01:05:55 +0200

OK, I've rebuilt both GlusterFS v3.7.6 and NFS-Ganesha with debug enabled
(and with the libc allocator).

Here are my test steps:

1. launch nfs-ganesha:

valgrind --leak-check=full --show-leak-kinds=all --log-file="valgrind.log" \
    /opt/nfs-ganesha/bin/ganesha.nfsd -F -L ./ganesha.log -f ./ganesha.conf \
    -N NIV_EVENT

2. mount NFS share:

mount -t nfs4 127.0.0.1:/share share \
    -o defaults,_netdev,minorversion=2,noac,noacl,lookupcache=none,timeo=100

3. cd to share and run find . for some time

4. CTRL+C find, unmount share.

5. CTRL+C NFS-Ganesha.
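
The steps above can be kept as data so that a rerun uses exactly the same
commands. This is only a sketch: the function name leak_test_plan is
hypothetical, nothing is executed (the commands are just printed), and
"umount share" is my assumption for the unmount in step 4. Step 5 (CTRL+C on
NFS-Ganesha) is a manual interrupt and is therefore not listed.

```python
import shlex


def leak_test_plan():
    """Return the test steps as (description, argv) pairs. Nothing is run."""
    return [
        ("1. start nfs-ganesha under valgrind, in the foreground", [
            "valgrind", "--leak-check=full", "--show-leak-kinds=all",
            "--log-file=valgrind.log",
            "/opt/nfs-ganesha/bin/ganesha.nfsd",
            "-F", "-L", "./ganesha.log", "-f", "./ganesha.conf",
            "-N", "NIV_EVENT",
        ]),
        ("2. mount the NFS share", [
            "mount", "-t", "nfs4", "127.0.0.1:/share", "share", "-o",
            "defaults,_netdev,minorversion=2,noac,noacl,"
            "lookupcache=none,timeo=100",
        ]),
        # Run from inside the share; interrupt with CTRL+C after some time.
        ("3. walk the tree", ["find", "."]),
        # Assumed command for "unmount share".
        ("4. unmount the share", ["umount", "share"]),
    ]


if __name__ == "__main__":
    for description, argv in leak_test_plan():
        print(f"# {description}")
        print(shlex.join(argv))
```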

Here is full valgrind output:

https://gist.github.com/eebd9f94ababd8130d49

The end of the valgrind output points to probable massive leaks in both
GlusterFS and NFS-Ganesha code.
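
For anyone who wants totals rather than reading the whole log, here is a
minimal sketch that sums the Memcheck loss records per kind. It assumes the
usual record header format ("==PID== N bytes in M blocks are definitely lost
in loss record ..."); the function name summarize_leaks is hypothetical, and
I have not run it against the log linked above.

```python
import re
from collections import defaultdict

# Matches Memcheck loss-record headers such as:
#   ==123== 1,024 bytes in 2 blocks are definitely lost in loss record 5 of 10
#   ==123== 48 (16 direct, 32 indirect) bytes in 1 blocks are definitely lost ...
# The "LEAK SUMMARY" lines have a different shape and are not matched.
LOSS_RE = re.compile(
    r"^==\d+==\s+([\d,]+)\s+(?:\([^)]*\)\s+)?bytes in ([\d,]+) blocks are "
    r"(definitely lost|indirectly lost|possibly lost|still reachable)"
)


def summarize_leaks(lines):
    """Return {loss kind: (total bytes, total blocks)} for the given log lines."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        match = LOSS_RE.match(line)
        if match:
            kind = match.group(3)
            totals[kind][0] += int(match.group(1).replace(",", ""))
            totals[kind][1] += int(match.group(2).replace(",", ""))
    return {kind: tuple(pair) for kind, pair in totals.items()}


if __name__ == "__main__":
    with open("valgrind.log") as log:
        for kind, (nbytes, nblocks) in sorted(summarize_leaks(log).items()):
            print(f"{kind}: {nbytes} bytes in {nblocks} blocks")
```

Grouping the records by the first GlusterFS or NFS-Ganesha frame in each
stack would be the natural next step, but that depends on how the symbols
appear in the log.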

On Friday, 25 December 2015 23:29:07 EET Soumya Koduri wrote:
> On 12/25/2015 08:56 PM, Oleksandr Natalenko wrote:
> > What units is Cache_Size measured in? Bytes?
> 
> It's actually (Cache_Size * sizeof_ptr) bytes. If possible, could you
> please run the ganesha process under valgrind? It will help in detecting leaks.
> 
> Thanks,
> Soumya
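
As a rough sanity check of that formula, a small sketch. The 8-byte pointer
size is an assumption for a 64-bit build, the helper name cache_table_bytes
is hypothetical, and 32633 is the default Cache_Size quoted further down.

```python
def cache_table_bytes(cache_size, pointer_size=8):
    """Size implied by Cache_Size * sizeof(pointer), in bytes."""
    return cache_size * pointer_size


if __name__ == "__main__":
    default_bytes = cache_table_bytes(32633)  # default Cache_Size, 64-bit pointers
    print(default_bytes, "bytes, about", round(default_bytes / 1024), "KiB")
```

By this estimate the default table is roughly 255 KiB, so Cache_Size alone
would not account for memory usage in the gigabyte range; how much the cached
entries themselves take is a separate question.
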
> 
> > 25.12.2015 16:58, Soumya Koduri wrote:
> >> On 12/24/2015 09:17 PM, Oleksandr Natalenko wrote:
> >>> Another addition: it seems to be a GlusterFS API library memory leak,
> >>> because NFS-Ganesha also consumes a huge amount of memory while doing
> >>> an ordinary "find . -type f" via NFSv4.2 on a remote client. Here is
> >>> the memory usage:
> >>> 
> >>> ===
> >>> root      5416 34.2 78.5 2047176 1480552 ?     Ssl  12:02 117:54
> >>> /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f
> >>> /etc/ganesha/ganesha.conf -N NIV_EVENT
> >>> ===
> >>> 
> >>> 1.4G is too much for a simple stat() :(.
> >>> 
> >>> Ideas?
> >> 
> >> nfs-ganesha also has a cache layer, which can scale to millions of
> >> entries depending on the number of files/directories being looked
> >> up. However, there are parameters to tune it. So either try stat with
> >> fewer entries, or add the block below to the nfs-ganesha.conf file, set
> >> low limits, and check the difference. That may help us narrow down how
> >> much memory is actually consumed by core nfs-ganesha and by gfAPI.
> >> 
> >> CACHEINODE {
> >>     Cache_Size(uint32, range 1 to UINT32_MAX, default 32633); # cache size
> >>     Entries_HWMark(uint32, range 1 to UINT32_MAX, default 100000); # Max no. of entries in the cache.
> >> }
> >> 
> >> Thanks,
> >> Soumya
> >> 
> >>> 24.12.2015 16:32, Oleksandr Natalenko wrote:
> >>>> This is still an issue in 3.7.6. Any suggestions?
> >>>> 
> >>>> 24.09.2015 10:14, Oleksandr Natalenko wrote:
> >>>>> In our GlusterFS deployment we've encountered something like a memory
> >>>>> leak in the GlusterFS FUSE client.
> >>>>> 
> >>>>> We use a replicated (×2) GlusterFS volume to store mail (exim+dovecot,
> >>>>> maildir format). Here are the inode stats for both bricks and the mountpoint:
> >>>>> 
> >>>>> ===
> >>>>> Brick 1 (Server 1):
> >>>>> 
> >>>>> Filesystem                            Inodes     IUsed      IFree IUse% Mounted on
> >>>>> /dev/mapper/vg_vd1_misc-lv08_mail  578768144  10954918  567813226    2% /bricks/r6sdLV08_vd1_mail
> >>>>> 
> >>>>> Brick 2 (Server 2):
> >>>>> 
> >>>>> Filesystem                            Inodes     IUsed      IFree IUse% Mounted on
> >>>>> /dev/mapper/vg_vd0_misc-lv07_mail  578767984  10954913  567813071    2% /bricks/r6sdLV07_vd0_mail
> >>>>> 
> >>>>> Mountpoint (Server 3):
> >>>>> 
> >>>>> Filesystem                            Inodes     IUsed      IFree IUse% Mounted on
> >>>>> glusterfs.xxx:mail                 578767760  10954915  567812845    2% /var/spool/mail/virtual
> >>>>> ===
> >>>>> 
> >>>>> The glusterfs.xxx domain has two A records, for Server 1 and Server 2.
> >>>>> 
> >>>>> Here is volume info:
> >>>>> 
> >>>>> ===
> >>>>> Volume Name: mail
> >>>>> Type: Replicate
> >>>>> Volume ID: f564e85c-7aa6-4170-9417-1f501aa98cd2
> >>>>> Status: Started
> >>>>> Number of Bricks: 1 x 2 = 2
> >>>>> Transport-type: tcp
> >>>>> Bricks:
> >>>>> Brick1: server1.xxx:/bricks/r6sdLV08_vd1_mail/mail
> >>>>> Brick2: server2.xxx:/bricks/r6sdLV07_vd0_mail/mail
> >>>>> Options Reconfigured:
> >>>>> nfs.rpc-auth-allow: 1.2.4.0/24,4.5.6.0/24
> >>>>> features.cache-invalidation-timeout: 10
> >>>>> performance.stat-prefetch: off
> >>>>> performance.quick-read: on
> >>>>> performance.read-ahead: off
> >>>>> performance.flush-behind: on
> >>>>> performance.write-behind: on
> >>>>> performance.io-thread-count: 4
> >>>>> performance.cache-max-file-size: 1048576
> >>>>> performance.cache-size: 67108864
> >>>>> performance.readdir-ahead: off
> >>>>> ===
> >>>>> 
> >>>>> Soon after mounting and starting exim/dovecot, the glusterfs client
> >>>>> process begins to consume a huge amount of RAM:
> >>>>> 
> >>>>> ===
> >>>>> user@server3 ~$ ps aux | grep glusterfs | grep mail
> >>>>> root     28895 14.4 15.0 15510324 14908868 ?   Ssl  Sep03 4310:05
> >>>>> /usr/sbin/glusterfs --fopen-keep-cache --direct-io-mode=disable
> >>>>> --volfile-server=glusterfs.xxx --volfile-id=mail
> >>>>> /var/spool/mail/virtual
> >>>>> ===
> >>>>> 
> >>>>> That is, ~15 GiB of RAM.
> >>>>> 
> >>>>> We've also tried using the mountpoint within a separate KVM VM with 2
> >>>>> or 3 GiB of RAM, and soon after starting the mail daemons the OOM
> >>>>> killer was triggered for the glusterfs client process.
> >>>>> 
> >>>>> Mounting the same share via NFS works just fine. Also, we have much
> >>>>> less iowait and loadavg on the client side with NFS.
> >>>>> 
> >>>>> We've also tried changing the IO thread count and cache size in order
> >>>>> to limit memory usage, with no luck. As you can see, the total cache
> >>>>> size is 4×64 = 256 MiB (compared to 15 GiB).
> >>>>> 
> >>>>> Enabling or disabling stat-prefetch, read-ahead and readdir-ahead
> >>>>> didn't help either.
> >>>>> 
> >>>>> Here are volume memory stats:
> >>>>> 
> >>>>> ===
> >>>>> Memory status for volume : mail
> >>>>> ----------------------------------------------
> >>>>> Brick : server1.xxx:/bricks/r6sdLV08_vd1_mail/mail
> >>>>> Mallinfo
> >>>>> --------
> >>>>> Arena    : 36859904
> >>>>> Ordblks  : 10357
> >>>>> Smblks   : 519
> >>>>> Hblks    : 21
> >>>>> Hblkhd   : 30515200
> >>>>> Usmblks  : 0
> >>>>> Fsmblks  : 53440
> >>>>> Uordblks : 18604144
> >>>>> Fordblks : 18255760
> >>>>> Keepcost : 114112
> >>>>> 
> >>>>> Mempool Stats
> >>>>> -------------
> >>>>> Name                                   HotCount ColdCount PaddedSizeof AllocCount MaxAlloc   Misses Max-StdAlloc
> >>>>> ----                                   -------- --------- ------------ ---------- -------- -------- ------------
> >>>>> mail-server:fd_t                              0      1024          108   30773120      137        0            0
> >>>>> mail-server:dentry_t                      16110       274           84  235676148    16384  1106499         1152
> >>>>> mail-server:inode_t                       16363        21          156  237216876    16384  1876651         1169
> >>>>> mail-trash:fd_t                               0      1024          108          0        0        0            0
> >>>>> mail-trash:dentry_t                           0     32768           84          0        0        0            0
> >>>>> mail-trash:inode_t                            4     32764          156          4        4        0            0
> >>>>> mail-trash:trash_local_t                      0        64         8628          0        0        0            0
> >>>>> mail-changetimerecorder:gf_ctr_local_t        0        64        16540          0        0        0            0
> >>>>> mail-changelog:rpcsvc_request_t               0         8         2828          0        0        0            0
> >>>>> mail-changelog:changelog_local_t              0        64          116          0        0        0            0
> >>>>> mail-bitrot-stub:br_stub_local_t              0       512           84      79204        4        0            0
> >>>>> mail-locks:pl_local_t                         0        32          148    6812757        4        0            0
> >>>>> mail-upcall:upcall_local_t                    0       512          108          0        0        0            0
> >>>>> mail-marker:marker_local_t                    0       128          332      64980        3        0            0
> >>>>> mail-quota:quota_local_t                      0        64          476          0        0        0            0
> >>>>> mail-server:rpcsvc_request_t                  0       512         2828   45462533       34        0            0
> >>>>> glusterfs:struct saved_frame                  0         8          124          2        2        0            0
> >>>>> glusterfs:struct rpc_req                      0         8          588          2        2        0            0
> >>>>> glusterfs:rpcsvc_request_t                    1         7         2828          2        1        0            0
> >>>>> glusterfs:log_buf_t                           5       251          140       3452        6        0            0
> >>>>> glusterfs:data_t                            242     16141           52  480115498      664        0            0
> >>>>> glusterfs:data_pair_t                       230     16153           68  179483528      275        0            0
> >>>>> glusterfs:dict_t                             23      4073          140  303751675      627        0            0
> >>>>> glusterfs:call_stub_t                         0      1024         3764   45290655       34        0            0
> >>>>> glusterfs:call_stack_t                        1      1023         1708   43598469       34        0            0
> >>>>> glusterfs:call_frame_t                        1      4095          172  336219655      184        0            0
> >>>>> ----------------------------------------------
> >>>>> Brick : server2.xxx:/bricks/r6sdLV07_vd0_mail/mail
> >>>>> Mallinfo
> >>>>> --------
> >>>>> Arena    : 38174720
> >>>>> Ordblks  : 9041
> >>>>> Smblks   : 507
> >>>>> Hblks    : 21
> >>>>> Hblkhd   : 30515200
> >>>>> Usmblks  : 0
> >>>>> Fsmblks  : 51712
> >>>>> Uordblks : 19415008
> >>>>> Fordblks : 18759712
> >>>>> Keepcost : 114848
> >>>>> 
> >>>>> Mempool Stats
> >>>>> -------------
> >>>>> Name                                   HotCount ColdCount PaddedSizeof AllocCount MaxAlloc   Misses Max-StdAlloc
> >>>>> ----                                   -------- --------- ------------ ---------- -------- -------- ------------
> >>>>> mail-server:fd_t                              0      1024          108    2373075      133        0            0
> >>>>> mail-server:dentry_t                      14114      2270           84    3513654    16384     2300          267
> >>>>> mail-server:inode_t                       16374        10          156    6766642    16384   194635         1279
> >>>>> mail-trash:fd_t                               0      1024          108          0        0        0            0
> >>>>> mail-trash:dentry_t                           0     32768           84          0        0        0            0
> >>>>> mail-trash:inode_t                            4     32764          156          4        4        0            0
> >>>>> mail-trash:trash_local_t                      0        64         8628          0        0        0            0
> >>>>> mail-changetimerecorder:gf_ctr_local_t        0        64        16540          0        0        0            0
> >>>>> mail-changelog:rpcsvc_request_t               0         8         2828          0        0        0            0
> >>>>> mail-changelog:changelog_local_t              0        64          116          0        0        0            0
> >>>>> mail-bitrot-stub:br_stub_local_t              0       512           84      71354        4        0            0
> >>>>> mail-locks:pl_local_t                         0        32          148    8135032        4        0            0
> >>>>> mail-upcall:upcall_local_t                    0       512          108          0        0        0            0
> >>>>> mail-marker:marker_local_t                    0       128          332      65005        3        0            0
> >>>>> mail-quota:quota_local_t                      0        64          476          0        0        0            0
> >>>>> mail-server:rpcsvc_request_t                  0       512         2828   12882393       30        0            0
> >>>>> glusterfs:struct saved_frame                  0         8          124          2        2        0            0
> >>>>> glusterfs:struct rpc_req                      0         8          588          2        2        0            0
> >>>>> glusterfs:rpcsvc_request_t                    1         7         2828          2        1        0            0
> >>>>> glusterfs:log_buf_t                           5       251          140       3443        6        0            0
> >>>>> glusterfs:data_t                            242     16141           52  138743429      290        0            0
> >>>>> glusterfs:data_pair_t                       230     16153           68  126649864      270        0            0
> >>>>> glusterfs:dict_t                             23      4073          140   20356289       63        0            0
> >>>>> glusterfs:call_stub_t                         0      1024         3764   13678560       31        0            0
> >>>>> glusterfs:call_stack_t                        1      1023         1708   11011561       30        0            0
> >>>>> glusterfs:call_frame_t                        1      4095          172  125764190      193        0            0
> >>>>> ----------------------------------------------
> >>>>> ===
> >>>>> 
> >>>>> So, my questions are:
> >>>>> 
> >>>>> 1) What should one do to limit GlusterFS FUSE client memory usage?
> >>>>> 2) What should one do to prevent high client loadavg caused by high
> >>>>> iowait from multiple concurrent volume users?
> >>>>> 
> >>>>> Server/client OS is CentOS 7.1, GlusterFS server version is 3.7.3,
> >>>>> GlusterFS client version is 3.7.4.
> >>>>> 
> >>>>> Any additional info needed?
> >>> 
> >>> _______________________________________________
> >>> Gluster-users mailing list
> >>> Gluster-users@xxxxxxxxxxx
> >>> http://www.gluster.org/mailman/listinfo/gluster-users
