Re: [External] : nfsd: memory leak when client does many file operations


 



I never tried, but I removed async and noatime and started a test run on the VM now.
The result will take some time, as written below.
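
For reference, roughly what I changed (the export path is the one from my config quoted below; the client mount point here is just an example):

  # server: drop "async" from the /etc/exports line for /export/data3, then
  exportfs -ra

  # client: remount without "noatime"
  umount /mnt/data3
  mount -t nfs -o rw,nfsvers=3,proto=tcp,hard,nointr,timeo=600,rsize=32768,wsize=32768 server:/export/data3 /mnt/data3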

> Sent: Sunday, March 24, 2024 at 23:13
> From: "Chuck Lever III" <chuck.lever@xxxxxxxxxx>
> To: "Jan Schunk" <scpcom@xxxxxx>
> Cc: "Jeff Layton" <jlayton@xxxxxxxxxx>, "Neil Brown" <neilb@xxxxxxx>, "Olga Kornievskaia" <kolga@xxxxxxxxxx>, "Dai Ngo" <dai.ngo@xxxxxxxxxx>, "Tom Talpey" <tom@xxxxxxxxxx>, "Linux NFS Mailing List" <linux-nfs@xxxxxxxxxxxxxxx>, "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>
> Subject: Re: [External] : nfsd: memory leak when client does many file operations
> 
> 
> 
> > On Mar 24, 2024, at 5:39 PM, Jan Schunk <scpcom@xxxxxx> wrote:
> > 
> > Yes, the VM is x86_64.
> > 
> > "pgrep -c nfsd" says: 9
> > 
> > I use NFS version 3.
> > 
> > All network ports are connected at 1 Gbit/s.
> > 
> > The exported file system is ext4.
> > 
> > I do not use any authentication.
> > 
> > The mount options in /etc/fstab are:
> > rw,noatime,nfsvers=3,proto=tcp,hard,nointr,timeo=600,rsize=32768,wsize=32768,noauto
> > 
> > The line in /etc/exports:
> > /export/data3 192.168.0.0/16(fsid=<uuid>,rw,no_root_squash,async,no_subtree_check)
> 
> Is it possible to reproduce this issue without the "noatime"
> mount option and without the "async" export option?
> 
> 
> >> Sent: Sunday, March 24, 2024 at 22:10
> >> From: "Chuck Lever III" <chuck.lever@xxxxxxxxxx>
> >> To: "Jan Schunk" <scpcom@xxxxxx>
> >> Cc: "Jeff Layton" <jlayton@xxxxxxxxxx>, "Neil Brown" <neilb@xxxxxxx>, "Olga Kornievskaia" <kolga@xxxxxxxxxx>, "Dai Ngo" <dai.ngo@xxxxxxxxxx>, "Tom Talpey" <tom@xxxxxxxxxx>, "Linux NFS Mailing List" <linux-nfs@xxxxxxxxxxxxxxx>, "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>
> >> Subject: Re: [External] : nfsd: memory leak when client does many file operations
> >> 
> >> 
> >>> On Mar 24, 2024, at 4:48 PM, Jan Schunk <scpcom@xxxxxx> wrote:
> >>> 
> >>> The "heavy usage" is a simple script runinng on the client and does the following:
> >>> 1. Create a empty git repository on the share
> >>> 2. Unpacking a tar.gz archive (Qnap GPL source code)
> >>> 3. Remove some folders/files
> >>> 4. Use diff to compare it with an older version
> >>> 5. commit them to the git
> >>> 6. Repeat at step 2 with next archive
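> >>> 
> >>> Roughly like this (archive names and paths are placeholders, not the exact script):
> >>> 
> >>>   #!/bin/sh
> >>>   # runs on the client against the NFS mount (mount point is an example)
> >>>   cd /mnt/data3 && git init repo && cd repo
> >>>   for a in /archives/*.tar.gz; do
> >>>       tar -xzf "$a"                          # unpack many small files onto the share
> >>>       rm -rf doc examples                    # placeholder for the removed folders/files
> >>>       diff -r . ../old-version > /dev/null   # compare with an older tree (path is an example)
> >>>       git add -A && git commit -m "import $(basename "$a")"
> >>>   done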
> >>> 
> >>> On my armhf NAS the other memory consuming workload is an SMB server.
> >> 
> >> I'm not sure any of us has a Freescale system to try this ...
> >> 
> >> 
> >>> On the test VM the other memory consuming workload is a GNOME desktop.
> >> 
> >> ... and so I'm hoping this VM is an x86_64 system.
> >> 
> >> 
> >>> But it does not make much difference if I stop other services; it just takes a bit longer until the same issue happens.
> >>> The size of swap also does not make a difference.
> >> 
> >> What is the nfsd thread count on the server? 'pgrep -c nfsd'
> >> 
> >> What version of NFS does your client mount with?
> >> 
> >> What is the speed of the network between your client and server?
> >> 
> >> What is the type of the exported file system?
> >> 
> >> Do you use NFS with Kerberos?
> >> 
> >> 
> >>>> Sent: Sunday, March 24, 2024 at 21:14
> >>>> From: "Chuck Lever III" <chuck.lever@xxxxxxxxxx>
> >>>> To: "Jan Schunk" <scpcom@xxxxxx>
> >>>> Cc: "Jeff Layton" <jlayton@xxxxxxxxxx>, "Neil Brown" <neilb@xxxxxxx>, "Olga Kornievskaia" <kolga@xxxxxxxxxx>, "Dai Ngo" <dai.ngo@xxxxxxxxxx>, "Tom Talpey" <tom@xxxxxxxxxx>, "Linux NFS Mailing List" <linux-nfs@xxxxxxxxxxxxxxx>, "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>
> >>>> Subject: Re: [External] : nfsd: memory leak when client does many file operations
> >>>> 
> >>>> 
> >>>> 
> >>>>> On Mar 24, 2024, at 3:57 PM, Jan Schunk <scpcom@xxxxxx> wrote:
> >>>>> 
> >>>>> Issue found on: v6.5.13, v6.6.13, v6.6.14, v6.6.20 and v6.8.1
> >>>>> Not found on: v6.4, v6.1.82 and below
> >>>>> Architectures: amd64 and arm(hf)
> >>>>> 
> >>>>> Steps to reproduce:
> >>>>> - Create a VM with 1GB RAM
> >>>>> - Install Debian 12
> >>>>> - Install linux-image-6.6.13+bpo-amd64-unsigned and nfs-kernel-server
> >>>>> - Export some folder
> >>>>> On the client:
> >>>>> - Mount the share
> >>>>> - Run a script that produces heavy usage on the share (like unpacking large tar archives that contain many small files into a git repository and committing them); the export and mount side is sketched just below
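> >>>>> 
> >>>>> A minimal sketch of the export and mount (share path, client mount point and server name are examples):
> >>>>> 
> >>>>>   # server
> >>>>>   echo '/export/data 192.168.0.0/16(rw,no_root_squash,async,no_subtree_check)' >> /etc/exports
> >>>>>   exportfs -ra
> >>>>>   # client
> >>>>>   mount -t nfs -o nfsvers=3,proto=tcp,hard,rsize=32768,wsize=32768 server:/export/data /mnt/data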
> >>>> 
> >>>> Hi Jan, thanks for the report.
> >>>> 
> >>>> The "produce heavy usage" instruction here is pretty vague.
> >>>> I run CI testing with kmemleak enabled, and have not seen
> >>>> any leaks on recent kernels when running the git regression
> >>>> tests, which are similar to this kind of workload.
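> >>>> 
> >>>> If your server kernel has CONFIG_DEBUG_KMEMLEAK enabled, something
> >>>> like this on the server can tell us whether kmemleak sees the leak
> >>>> (just a sketch of the standard kmemleak interface):
> >>>> 
> >>>>   mount -t debugfs none /sys/kernel/debug    # if debugfs is not already mounted
> >>>>   echo scan > /sys/kernel/debug/kmemleak     # trigger an immediate scan
> >>>>   cat /sys/kernel/debug/kmemleak             # dump suspected leaks with stack traces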
> >>>> 
> >>>> Can you try to narrow the reproducer for us, even just a
> >>>> little? What client action exactly is triggering the memory
> >>>> leak? Is there any other workload on your NFS server that
> >>>> might be consuming memory?
> >>>> 
> >>>> 
> >>>>> On my setup it takes 20-40 hours until the memory is full and the OOM killer gets invoked by nfsd to kill other processes. The memory stays full and the system reboots:
> >>>>> 
> >>>>> [121969.590000] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,task=dbus-daemon,pid=454,uid=101
> >>>>> [121969.600000] Out of memory: Killed process 454 (dbus-daemon) total-vm:6196kB, anon-rss:128kB, file-rss:1408kB, shmem-rss:0kB, UID:101 pgtables:12kB oom_score_adj:-900
> >>>>> [121971.700000] oom_reaper: reaped process 454 (dbus-daemon), now anon-rss:0kB, file-rss:64kB, shmem-rss:0kB
> >>>>> [121971.920000] nfsd invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
> >>>>> [121971.930000] CPU: 1 PID: 537 Comm: nfsd Not tainted 6.8.1+nas5xx #nas5xx
> >>>>> [121971.930000] Hardware name: Freescale LS1024A
> >>>>> [121971.940000]  unwind_backtrace from show_stack+0xb/0xc
> >>>>> [121971.940000]  show_stack from dump_stack_lvl+0x2b/0x34
> >>>>> [121971.950000]  dump_stack_lvl from dump_header+0x35/0x212
> >>>>> [121971.950000]  dump_header from out_of_memory+0x317/0x34c
> >>>>> [121971.960000]  out_of_memory from __alloc_pages+0x8e7/0xbb0
> >>>>> [121971.970000]  __alloc_pages from __alloc_pages_bulk+0x26d/0x3d8
> >>>>> [121971.970000]  __alloc_pages_bulk from svc_recv+0x9d/0x7d4
> >>>>> [121971.980000]  svc_recv from nfsd+0x7d/0xd4
> >>>>> [121971.980000]  nfsd from kthread+0xb9/0xcc
> >>>>> [121971.990000]  kthread from ret_from_fork+0x11/0x1c
> >>>>> [121971.990000] Exception stack(0xc2cadfb0 to 0xc2cadff8)
> >>>>> [121971.990000] dfa0:                                     00000000 00000000 00000000 00000000
> >>>>> [121972.000000] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> >>>>> [121972.010000] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
> >>>>> [121972.020000] Mem-Info:
> >>>>> [121972.020000] active_anon:101 inactive_anon:127 isolated_anon:29
> >>>>> [121972.020000]  active_file:1200 inactive_file:1204 isolated_file:98
> >>>>> [121972.020000]  unevictable:394 dirty:296 writeback:17
> >>>>> [121972.020000]  slab_reclaimable:13680 slab_unreclaimable:4350
> >>>>> [121972.020000]  mapped:637 shmem:4 pagetables:414
> >>>>> [121972.020000]  sec_pagetables:0 bounce:0
> >>>>> [121972.020000]  kernel_misc_reclaimable:0
> >>>>> [121972.020000]  free:7279 free_pcp:184 free_cma:1094
> >>>>> [121972.060000] Node 0 active_anon:404kB inactive_anon:508kB active_file:4736kB inactive_file:4884kB unevictable:1576kB isolated(anon):116kB isolated(file):388kB mapped:2548kB dirty:1184kB writeback:68kB shmem:16kB writeback_tmp:0kB kernel_stack:1088kB pagetables:1656kB sec_pagetables:0kB all_unreclaimable? no
> >>>>> [121972.090000] Normal free:29116kB boost:18432kB min:26624kB low:28672kB high:30720kB reserved_highatomic:0KB active_anon:404kB inactive_anon:712kB active_file:4788kB inactive_file:4752kB unevictable:1576kB writepending:1252kB present:1048576kB managed:1011988kB mlocked:1576kB bounce:0kB free_pcp:736kB local_pcp:236kB free_cma:4376kB
> >>>>> [121972.120000] lowmem_reserve[]: 0 0
> >>>>> [121972.120000] Normal: 2137*4kB (UEC) 1173*8kB (UEC) 529*16kB (UEC) 19*32kB (UC) 7*64kB (C) 5*128kB (C) 2*256kB (C) 1*512kB (C) 0*1024kB 0*2048kB 0*4096kB = 29116kB
> >>>>> [121972.140000] 2991 total pagecache pages
> >>>>> [121972.140000] 166 pages in swap cache
> >>>>> [121972.140000] Free swap  = 93424kB
> >>>>> [121972.150000] Total swap = 102396kB
> >>>>> [121972.150000] 262144 pages RAM
> >>>>> [121972.150000] 0 pages HighMem/MovableOnly
> >>>>> [121972.160000] 9147 pages reserved
> >>>>> [121972.160000] 4096 pages cma reserved
> >>>>> [121972.160000] Unreclaimable slab info:
> >>>>> [121972.170000] Name                      Used          Total
> >>>>> [121972.170000] bio-88                    64KB         64KB
> >>>>> [121972.180000] TCPv6                     61KB         61KB
> >>>>> [121972.180000] bio-76                    16KB         16KB
> >>>>> [121972.190000] bio-188                   11KB         11KB
> >>>>> [121972.190000] nfs_read_data             22KB         22KB
> >>>>> [121972.200000] kioctx                    15KB         15KB
> >>>>> [121972.200000] posix_timers_cache          7KB          7KB
> >>>>> [121972.210000] UDP                       63KB         63KB
> >>>>> [121972.220000] tw_sock_TCP                3KB          3KB
> >>>>> [121972.220000] request_sock_TCP           3KB          3KB
> >>>>> [121972.230000] TCP                       62KB         62KB
> >>>>> [121972.230000] bio-168                    7KB          7KB
> >>>>> [121972.240000] ep_head                    8KB          8KB
> >>>>> [121972.240000] request_queue             15KB         15KB
> >>>>> [121972.250000] bio-124                   18KB         40KB
> >>>>> [121972.250000] biovec-max               264KB        264KB
> >>>>> [121972.260000] biovec-128                63KB         63KB
> >>>>> [121972.260000] biovec-64                157KB        157KB
> >>>>> [121972.270000] skbuff_small_head         94KB         94KB
> >>>>> [121972.270000] skbuff_fclone_cache         55KB         63KB
> >>>>> [121972.280000] skbuff_head_cache         59KB         59KB
> >>>>> [121972.280000] fsnotify_mark_connector         16KB         28KB
> >>>>> [121972.290000] sigqueue                  19KB         31KB
> >>>>> [121972.300000] shmem_inode_cache       1622KB       1662KB
> >>>>> [121972.300000] kernfs_iattrs_cache         15KB         15KB
> >>>>> [121972.310000] kernfs_node_cache       2107KB       2138KB
> >>>>> [121972.310000] filp                     259KB        315KB
> >>>>> [121972.320000] net_namespace             30KB         30KB
> >>>>> [121972.320000] uts_namespace             15KB         15KB
> >>>>> [121972.330000] vma_lock                 143KB        179KB
> >>>>> [121972.330000] vm_area_struct           459KB        553KB
> >>>>> [121972.340000] sighand_cache            191KB        220KB
> >>>>> [121972.340000] task_struct              378KB        446KB
> >>>>> [121972.350000] anon_vma_chain           753KB        804KB
> >>>>> [121972.360000] anon_vma                 170KB        207KB
> >>>>> [121972.360000] trace_event_file          83KB         83KB
> >>>>> [121972.370000] mm_struct                157KB        173KB
> >>>>> [121972.370000] vmap_area                217KB        354KB
> >>>>> [121972.380000] kmalloc-8k               224KB        224KB
> >>>>> [121972.380000] kmalloc-4k               860KB        992KB
> >>>>> [121972.390000] kmalloc-2k               352KB        352KB
> >>>>> [121972.390000] kmalloc-1k               563KB        576KB
> >>>>> [121972.400000] kmalloc-512              936KB        936KB
> >>>>> [121972.400000] kmalloc-256              196KB        240KB
> >>>>> [121972.410000] kmalloc-192              160KB        169KB
> >>>>> [121972.410000] kmalloc-128              546KB        764KB
> >>>>> [121972.420000] kmalloc-64              1213KB       1288KB
> >>>>> [121972.420000] kmem_cache_node           12KB         12KB
> >>>>> [121972.430000] kmem_cache                16KB         16KB
> >>>>> [121972.440000] Tasks state (memory values in pages):
> >>>>> [121972.440000] [  pid  ]   uid  tgid total_vm      rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
> >>>>> [121972.450000] [    209]     0   209     5140      320        0      320         0    16384      480         -1000 systemd-udevd
> >>>>> [121972.460000] [    230]   998   230     2887       55       32       23         0    18432        0             0 systemd-network
> >>>>> [121972.470000] [    420]     0   420      596        0        0        0         0     6144       22             0 mdadm
> >>>>> [121972.490000] [    421]   102   421     1393       56       32       24         0    10240        0             0 rpcbind
> >>>>> [121972.500000] [    429]   996   429     3695       17        0       17         0    20480        0             0 systemd-resolve
> >>>>> [121972.510000] [    433]     0   433      494       51        0       51         0     8192        0             0 rpc.idmapd
> >>>>> [121972.520000] [    434]     0   434      743       92       33       59         0     8192        7             0 nfsdcld
> >>>>> [121972.530000] [    451]     0   451      390        0        0        0         0     6144        0             0 acpid
> >>>>> [121972.540000] [    453]   105   453     1380       50       32       18         0    10240       18             0 avahi-daemon
> >>>>> [121972.550000] [    454]   101   454     1549       16        0       16         0    12288       32          -900 dbus-daemon
> >>>>> [121972.560000] [    466]     0   466     3771       60        0       60         0    14336        0             0 irqbalance
> >>>>> [121972.570000] [    475]     0   475     6269       32       32        0         0    18432        0             0 rsyslogd
> >>>>> [121972.590000] [    487]   105   487     1347       68       38       30         0    10240        0             0 avahi-daemon
> >>>>> [121972.600000] [    492]     0   492     1765        0        0        0         0    12288        0             0 cron
> >>>>> [121972.610000] [    493]     0   493     2593        0        0        0         0    16384        0             0 wpa_supplicant
> >>>>> [121972.620000] [    494]     0   494      607        0        0        0         0     8192       32             0 atd
> >>>>> [121972.630000] [    506]     0   506     1065       25        0       25         0    10240        0             0 rpc.mountd
> >>>>> [121972.640000] [    514]   103   514      809       25        0       25         0     8192        0             0 rpc.statd
> >>>>> [121972.650000] [    522]     0   522      999       31        0       31         0    10240        0             0 agetty
> >>>>> [121972.660000] [    524]     0   524     1540       28        0       28         0    12288        0             0 agetty
> >>>>> [121972.670000] [    525]     0   525     9098       56       32       24         0    34816        0             0 unattended-upgr
> >>>>> [121972.690000] [    526]     0   526     2621      320        0      320         0    14336      192         -1000 sshd
> >>>>> [121972.700000] [    539]     0   539      849       32       32        0         0     8192        0             0 in.tftpd
> >>>>> [121972.710000] [    544]   113   544     4361        6        6        0         0    16384       25             0 chronyd
> >>>>> [121972.720000] [    546]     0   546    16816       62       32       30         0    45056        0             0 winbindd
> >>>>> [121972.730000] [    552]     0   552    16905       59       32       27         0    45056        3             0 winbindd
> >>>>> [121972.740000] [    559]     0   559    17849       94       32       30        32    49152        4             0 smbd
> >>>>> [121972.750000] [    572]     0   572    17409       40       16       24         0    43008       11             0 smbd-notifyd
> >>>>> [121972.760000] [    573]     0   573    17412       16       16        0         0    43008       24             0 cleanupd
> >>>>> [121972.770000] [    584]     0   584     3036       20        0       20         0    16384        4             0 sshd
> >>>>> [121972.780000] [    589]     0   589    16816       32        2       30         0    40960       21             0 winbindd
> >>>>> [121972.790000] [    590]     0   590    27009       47       23       24         0    65536       21             0 smbd
> >>>>> [121972.810000] [    597]   501   597     3344       91       32       59         0    20480        0           100 systemd
> >>>>> [121972.820000] [    653]   501   653     3036        0        0        0         0    16384       33             0 sshd
> >>>>> [121972.830000] [    656]   501   656     1938       93       32       61         0    12288        9             0 bash
> >>>>> [121972.840000] [    704]     0   704      395      352       64      288         0     6144        0         -1000 watchdog
> >>>>> [121972.850000] [    738]   501   738     2834       12        0       12         0    16384        6             0 top
> >>>>> [121972.860000] [   4750]     0  4750     4218       44       26       18         0    18432       11             0 proftpd
> >>>>> [121972.870000] [   4768]     0  4768      401       31        0       31         0     6144        0             0 apt.systemd.dai
> >>>>> [121972.880000] [   4772]     0  4772      401       31        0       31         0     6144        0             0 apt.systemd.dai
> >>>>> [121972.890000] [   4778]     0  4778    13556       54        0       54         0    59392       26             0 apt-get
> >>>>> [121972.900000] Out of memory and no killable processes...
> >>>>> [121972.910000] Kernel panic - not syncing: System is deadlocked on memory
> >>>>> [121972.920000] CPU: 1 PID: 537 Comm: nfsd Not tainted 6.8.1+nas5xx #nas5xx
> >>>>> [121972.920000] Hardware name: Freescale LS1024A
> >>>>> [121972.930000]  unwind_backtrace from show_stack+0xb/0xc
> >>>>> [121972.930000]  show_stack from dump_stack_lvl+0x2b/0x34
> >>>>> [121972.940000]  dump_stack_lvl from panic+0xbf/0x264
> >>>>> [121972.940000]  panic from out_of_memory+0x33f/0x34c
> >>>>> [121972.950000]  out_of_memory from __alloc_pages+0x8e7/0xbb0
> >>>>> [121972.950000]  __alloc_pages from __alloc_pages_bulk+0x26d/0x3d8
> >>>>> [121972.960000]  __alloc_pages_bulk from svc_recv+0x9d/0x7d4
> >>>>> [121972.960000]  svc_recv from nfsd+0x7d/0xd4
> >>>>> [121972.970000]  nfsd from kthread+0xb9/0xcc
> >>>>> [121972.970000]  kthread from ret_from_fork+0x11/0x1c
> >>>>> [121972.980000] Exception stack(0xc2cadfb0 to 0xc2cadff8)
> >>>>> [121972.980000] dfa0:                                     00000000 00000000 00000000 00000000
> >>>>> [121972.990000] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> >>>>> [121973.000000] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
> >>>>> [121973.010000] CPU0: stopping
> >>>>> [121973.010000] CPU: 0 PID: 540 Comm: nfsd Not tainted 6.8.1+nas5xx #nas5xx
> >>>>> [121973.010000] Hardware name: Freescale LS1024A
> >>>>> [121973.010000]  unwind_backtrace from show_stack+0xb/0xc
> >>>>> [121973.010000]  show_stack from dump_stack_lvl+0x2b/0x34
> >>>>> [121973.010000]  dump_stack_lvl from do_handle_IPI+0x151/0x178
> >>>>> [121973.010000]  do_handle_IPI from ipi_handler+0x13/0x18
> >>>>> [121973.010000]  ipi_handler from handle_percpu_devid_irq+0x55/0x144
> >>>>> [121973.010000]  handle_percpu_devid_irq from generic_handle_domain_irq+0x17/0x20
> >>>>> [121973.010000]  generic_handle_domain_irq from gic_handle_irq+0x5f/0x70
> >>>>> [121973.010000]  gic_handle_irq from generic_handle_arch_irq+0x27/0x34
> >>>>> [121973.010000]  generic_handle_arch_irq from call_with_stack+0xd/0x10
> >>>>> [121973.010000] Rebooting in 90 seconds..
> >>>> 
> >>>> --
> >>>> Chuck Lever
> >>>> 
> >>>> 
> >> 
> >> --
> >> Chuck Lever
> >> 
> >> 
> 
> --
> Chuck Lever
> 
>




