This is still an issue in 3.7.6. Any suggestions?

24.09.2015 10:14, Oleksandr Natalenko wrote:
In our GlusterFS deployment we've encountered what looks like a memory leak in the GlusterFS FUSE client.

We use a replicated (×2) GlusterFS volume to store mail (exim+dovecot, maildir format). Here are the inode stats for both bricks and the mountpoint:

===
Brick 1 (Server 1):

Filesystem                           Inodes    IUsed     IFree IUse% Mounted on
/dev/mapper/vg_vd1_misc-lv08_mail 578768144 10954918 567813226    2% /bricks/r6sdLV08_vd1_mail

Brick 2 (Server 2):

Filesystem                           Inodes    IUsed     IFree IUse% Mounted on
/dev/mapper/vg_vd0_misc-lv07_mail 578767984 10954913 567813071    2% /bricks/r6sdLV07_vd0_mail

Mountpoint (Server 3):

Filesystem            Inodes    IUsed     IFree IUse% Mounted on
glusterfs.xxx:mail 578767760 10954915 567812845    2% /var/spool/mail/virtual
===

The glusterfs.xxx domain has two A records, one for Server 1 and one for Server 2.

Here is the volume info:

===
Volume Name: mail
Type: Replicate
Volume ID: f564e85c-7aa6-4170-9417-1f501aa98cd2
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: server1.xxx:/bricks/r6sdLV08_vd1_mail/mail
Brick2: server2.xxx:/bricks/r6sdLV07_vd0_mail/mail
Options Reconfigured:
nfs.rpc-auth-allow: 1.2.4.0/24,4.5.6.0/24
features.cache-invalidation-timeout: 10
performance.stat-prefetch: off
performance.quick-read: on
performance.read-ahead: off
performance.flush-behind: on
performance.write-behind: on
performance.io-thread-count: 4
performance.cache-max-file-size: 1048576
performance.cache-size: 67108864
performance.readdir-ahead: off
===

Soon after mounting the volume and starting exim/dovecot, the glusterfs client process begins to consume a huge amount of RAM:

===
user@server3 ~$ ps aux | grep glusterfs | grep mail
root 28895 14.4 15.0 15510324 14908868 ? Ssl Sep03 4310:05 /usr/sbin/glusterfs --fopen-keep-cache --direct-io-mode=disable --volfile-server=glusterfs.xxx --volfile-id=mail /var/spool/mail/virtual
===

That is, ~15 GiB of RAM.
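For anyone who wants to check that figure, here is a minimal Python sketch that converts the VSZ and RSS columns of the `ps aux` line above into GiB. It assumes the usual Linux procps column order and that both values are reported in KiB; the function name `parse_ps_memory` is made up for this illustration.

```python
def parse_ps_memory(ps_line: str) -> dict:
    """Extract VSZ and RSS from a single `ps aux` line and convert to GiB.

    Assumes the procps column order:
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    and that VSZ/RSS are given in KiB.
    """
    fields = ps_line.split()
    vsz_kib = int(fields[4])
    rss_kib = int(fields[5])
    kib_per_gib = 1024 ** 2
    return {
        "pid": int(fields[1]),
        "vsz_gib": vsz_kib / kib_per_gib,
        "rss_gib": rss_kib / kib_per_gib,
    }


# The line quoted in the post above.
line = (
    "root 28895 14.4 15.0 15510324 14908868 ? Ssl Sep03 4310:05 "
    "/usr/sbin/glusterfs --fopen-keep-cache --direct-io-mode=disable "
    "--volfile-server=glusterfs.xxx --volfile-id=mail /var/spool/mail/virtual"
)

mem = parse_ps_memory(line)
print(f"pid {mem['pid']}: VSZ {mem['vsz_gib']:.1f} GiB, RSS {mem['rss_gib']:.1f} GiB")
# prints "pid 28895: VSZ 14.8 GiB, RSS 14.2 GiB"
```

So the resident set is about 14.2 GiB and the virtual size about 14.8 GiB, which is consistent with the rounded "~15 GiB".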
We also tried mounting the volume inside a separate KVM VM with 2 or 3 GiB of RAM, and soon after starting the mail daemons the OOM killer terminated the glusterfs client process.

Mounting the same share via NFS works just fine. We also see much lower iowait and loadavg on the client side with NFS.

We also tried changing the IO thread count and cache size in order to limit memory usage, with no luck. As you can see, the total cache size is 4×64==256 MiB (compare to 15 GiB).

Enabling or disabling stat-prefetch, read-ahead and readdir-ahead didn't help either.

Here are the volume memory stats:

===
Memory status for volume : mail
----------------------------------------------
Brick : server1.xxx:/bricks/r6sdLV08_vd1_mail/mail
Mallinfo
--------
Arena    : 36859904
Ordblks  : 10357
Smblks   : 519
Hblks    : 21
Hblkhd   : 30515200
Usmblks  : 0
Fsmblks  : 53440
Uordblks : 18604144
Fordblks : 18255760
Keepcost : 114112

Mempool Stats
-------------
Name                                   HotCount ColdCount PaddedSizeof AllocCount MaxAlloc   Misses Max-StdAlloc
----                                   -------- --------- ------------ ---------- -------- -------- ------------
mail-server:fd_t                              0      1024          108   30773120      137        0            0
mail-server:dentry_t                      16110       274           84  235676148    16384  1106499         1152
mail-server:inode_t                       16363        21          156  237216876    16384  1876651         1169
mail-trash:fd_t                               0      1024          108          0        0        0            0
mail-trash:dentry_t                           0     32768           84          0        0        0            0
mail-trash:inode_t                            4     32764          156          4        4        0            0
mail-trash:trash_local_t                      0        64         8628          0        0        0            0
mail-changetimerecorder:gf_ctr_local_t        0        64        16540          0        0        0            0
mail-changelog:rpcsvc_request_t               0         8         2828          0        0        0            0
mail-changelog:changelog_local_t              0        64          116          0        0        0            0
mail-bitrot-stub:br_stub_local_t              0       512           84      79204        4        0            0
mail-locks:pl_local_t                         0        32          148    6812757        4        0            0
mail-upcall:upcall_local_t                    0       512          108          0        0        0            0
mail-marker:marker_local_t                    0       128          332      64980        3        0            0
mail-quota:quota_local_t                      0        64          476          0        0        0            0
mail-server:rpcsvc_request_t                  0       512         2828   45462533       34        0            0
glusterfs:struct saved_frame                  0         8          124          2        2        0            0
glusterfs:struct rpc_req                      0         8          588          2        2        0            0
glusterfs:rpcsvc_request_t                    1         7         2828          2        1        0            0
glusterfs:log_buf_t                           5       251          140       3452        6        0            0
glusterfs:data_t                            242     16141           52  480115498      664        0            0
glusterfs:data_pair_t                       230     16153           68  179483528      275        0            0
glusterfs:dict_t                             23      4073          140  303751675      627        0            0
glusterfs:call_stub_t                         0      1024         3764   45290655       34        0            0
glusterfs:call_stack_t                        1      1023         1708   43598469       34        0            0
glusterfs:call_frame_t                        1      4095          172  336219655      184        0            0
----------------------------------------------
Brick : server2.xxx:/bricks/r6sdLV07_vd0_mail/mail
Mallinfo
--------
Arena    : 38174720
Ordblks  : 9041
Smblks   : 507
Hblks    : 21
Hblkhd   : 30515200
Usmblks  : 0
Fsmblks  : 51712
Uordblks : 19415008
Fordblks : 18759712
Keepcost : 114848

Mempool Stats
-------------
Name                                   HotCount ColdCount PaddedSizeof AllocCount MaxAlloc   Misses Max-StdAlloc
----                                   -------- --------- ------------ ---------- -------- -------- ------------
mail-server:fd_t                              0      1024          108    2373075      133        0            0
mail-server:dentry_t                      14114      2270           84    3513654    16384     2300          267
mail-server:inode_t                       16374        10          156    6766642    16384   194635         1279
mail-trash:fd_t                               0      1024          108          0        0        0            0
mail-trash:dentry_t                           0     32768           84          0        0        0            0
mail-trash:inode_t                            4     32764          156          4        4        0            0
mail-trash:trash_local_t                      0        64         8628          0        0        0            0
mail-changetimerecorder:gf_ctr_local_t        0        64        16540          0        0        0            0
mail-changelog:rpcsvc_request_t               0         8         2828          0        0        0            0
mail-changelog:changelog_local_t              0        64          116          0        0        0            0
mail-bitrot-stub:br_stub_local_t              0       512           84      71354        4        0            0
mail-locks:pl_local_t                         0        32          148    8135032        4        0            0
mail-upcall:upcall_local_t                    0       512          108          0        0        0            0
mail-marker:marker_local_t                    0       128          332      65005        3        0            0
mail-quota:quota_local_t                      0        64          476          0        0        0            0
mail-server:rpcsvc_request_t                  0       512         2828   12882393       30        0            0
glusterfs:struct saved_frame                  0         8          124          2        2        0            0
glusterfs:struct rpc_req                      0         8          588          2        2        0            0
glusterfs:rpcsvc_request_t                    1         7         2828          2        1        0            0
glusterfs:log_buf_t                           5       251          140       3443        6        0            0
glusterfs:data_t                            242     16141           52  138743429      290        0            0
glusterfs:data_pair_t                       230     16153           68  126649864      270        0            0
glusterfs:dict_t                             23      4073          140   20356289       63        0            0
glusterfs:call_stub_t                         0      1024         3764   13678560       31        0            0
glusterfs:call_stack_t                        1      1023         1708   11011561       30        0            0
glusterfs:call_frame_t                        1      4095          172  125764190      193        0            0
----------------------------------------------
===

So, my questions are:

1) What should one do to limit the memory usage of the GlusterFS FUSE client?
2) What should one do to prevent high loadavg on the client, caused by high iowait when multiple users access the volume concurrently?

Server/client OS is CentOS 7.1, GlusterFS server version is 3.7.3, GlusterFS client version is 3.7.4.

Is any additional info needed?
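To put the numbers quoted above side by side, here is a rough Python sketch. It rests on two assumptions that are not confirmed anywhere in this thread: that the 4×64 MiB estimate (io-thread-count times cache-size) is the right way to compute the configured cache budget, and that HotCount × PaddedSizeof approximates the bytes currently in use in a mempool. Note also that the mempool rows come from Brick 1, i.e. the server side, so they say nothing directly about the client process.

```python
# Values copied from the volume options and the ps output quoted above.
CACHE_SIZE_BYTES = 67108864      # performance.cache-size
IO_THREAD_COUNT = 4              # performance.io-thread-count
CLIENT_RSS_KIB = 14908868        # RSS column of the glusterfs client

# (pool name, HotCount, PaddedSizeof in bytes), copied from the Brick 1 stats.
brick1_pools = [
    ("mail-server:dentry_t", 16110, 84),
    ("mail-server:inode_t", 16363, 156),
    ("glusterfs:data_t", 242, 52),
    ("glusterfs:data_pair_t", 230, 68),
    ("glusterfs:dict_t", 23, 140),
]


def hot_bytes(pools):
    """Sum HotCount * PaddedSizeof over the given pools.

    Assumption: this product approximates memory held by live pool objects.
    """
    return sum(hot * size for _name, hot, size in pools)


# Cache budget as estimated in the post (4 x 64 MiB).
cache_budget_mib = CACHE_SIZE_BYTES * IO_THREAD_COUNT / 2**20
client_rss_mib = CLIENT_RSS_KIB / 1024

print(f"estimated cache budget: {cache_budget_mib:.0f} MiB")
print(f"client RSS:             {client_rss_mib:.0f} MiB")
print(f"RSS / cache budget:     {client_rss_mib / cache_budget_mib:.1f}x")
print(f"Brick 1 hot pool bytes: {hot_bytes(brick1_pools) / 2**20:.2f} MiB")
```

Under these assumptions the client's resident memory is dozens of times larger than the estimated cache budget, while the live objects in the listed brick-side pools add up to only a few MiB.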
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel